**********
TorchServe
**********

.. image:: Pytorch_logo.png

TorchServe is a performant, flexible, and easy-to-use tool for serving PyTorch models in production.

What's going on in TorchServe?

* `High performance Llama 2 deployments with AWS Inferentia2 using TorchServe `__
* `Naver Case Study: Transition From High-Cost GPUs to Intel CPUs and oneAPI powered Software with performance `__
* `Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs `__
* `Deploying your Generative AI model in only four steps with Vertex AI and PyTorch `__
* `PyTorch Model Serving on Google Cloud TPUv5 `__
* `Monitoring using Datadog `__
* `Torchserve Performance Tuning, Animated Drawings Case-Study `__
* `Walmart Search: Serving Models at a Scale on TorchServe `__
* `Scaling inference on CPU with TorchServe `__
* `TorchServe C++ backend `__
* `Grokking Intel CPU PyTorch performance from first principles: a TorchServe case study `__
* `Grokking Intel CPU PyTorch performance from first principles (Part 2): a TorchServe case study `__
* `Case Study: Amazon Ads Uses PyTorch and AWS Inferentia to Scale Models for Ads Processing `__
* `Optimize your inference jobs using dynamic batch inference with TorchServe on Amazon SageMaker `__
* `Using AI to bring children's drawings to life `__
* `Model Serving in PyTorch `__
* `Evolution of Cresta's machine learning architecture: Migration to AWS and PyTorch `__
* `Explain Like I’m 5: TorchServe `__
* `How to Serve PyTorch Models with TorchServe `__
* `How to deploy PyTorch models on Vertex AI `__
* `Quantitative Comparison of Serving Platforms `__

.. customcardstart::

.. customcarditem::
   :header: TorchServe Quick Start
   :card_description: Learn how to install TorchServe and serve models.
   :image: https://user-images.githubusercontent.com/880376/83180095-c44cc600-a0d7-11ea-97c1-23abb4cdbe4d.jpg
   :link: getting_started.html
   :tags: Quick Start

.. customcarditem::
   :header: Running TorchServe
   :card_description: An in-depth explanation of how to run TorchServe.
   :image: https://raw.githubusercontent.com/pytorch/serve/master/docs/images/dogs-after.jpg
   :link: server.html
   :tags: Running TorchServe

.. customcarditem::
   :header: Why TorchServe
   :card_description: Various TorchServe use cases.
   :image: https://download.pytorch.org/torchaudio/tutorial-assets/thumbnails/streamreader_basic_tutorial.png
   :link: use_cases.html
   :tags: Examples

.. customcarditem::
   :header: Performance
   :card_description: Guides and best practices for improving performance when working with TorchServe.
   :image: https://raw.githubusercontent.com/pytorch/serve/master/benchmarks/predict_latency.png
   :link: performance_guide.html
   :tags: Performance,Troubleshooting

.. customcarditem::
   :header: Metrics
   :card_description: Collecting and viewing TorchServe metrics.
   :image: https://user-images.githubusercontent.com/5276346/234725829-7f60e0d8-c76d-4019-ac8f-7d60069c4e58.png
   :link: metrics.html
   :tags: Metrics,Performance,Troubleshooting

.. customcarditem::
   :header: Large Model Inference
   :card_description: Serving large models with TorchServe.
   :image: https://raw.githubusercontent.com/pytorch/serve/master/docs/images/ts-lmi-internal.png
   :link: large_model_inference.html
   :tags: Large-Models,Performance

.. customcarditem::
   :header: Troubleshooting
   :card_description: Various updates on TorchServe and use cases.
   :image: https://raw.githubusercontent.com/pytorch/serve/master/benchmarks/snake_viz.png
   :link: Troubleshooting.html
   :tags: Troubleshooting,Performance

.. customcarditem::
   :header: TorchServe Security Policy
   :card_description: Security Policy
   :image: https://user-images.githubusercontent.com/880376/83180095-c44cc600-a0d7-11ea-97c1-23abb4cdbe4d.jpg
   :link: security.html
   :tags: Security

.. customcarditem::
   :header: FAQs
   :card_description: Various frequently asked questions.
   :image: https://raw.githubusercontent.com/pytorch/serve/master/docs/images/NMTDualTranslate.png
   :link: FAQs.html
   :tags: FAQS

.. customcardend::