TorchServe GenAI use cases and showcase

This document showcases use cases for deploying generative AI (GenAI) workloads with TorchServe.

Enhancing LLM Serving with Torch Compiled RAG on AWS Graviton

In this blog post, we show how to deploy a retrieval-augmented generation (RAG) endpoint using TorchServe, increase its throughput using torch.compile, and improve the responses generated by the Llama endpoint. We also show how the RAG endpoint can be deployed on CPU using AWS Graviton while the Llama endpoint remains on a GPU. Splitting the solution into microservices this way lets each endpoint run on the hardware that suits it, which can reduce cost.
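The two-endpoint flow described above can be sketched as a small client: ask the RAG endpoint for context, then send an augmented prompt to the Llama endpoint. This is a minimal sketch, not the blog's implementation. The host names, the model names `rag` and `llama`, the prompt template, and the plain-text request and response bodies are all assumptions; only the `POST /predictions/{model_name}` route on port 8080 is TorchServe's standard inference API, and the actual payload format depends on each model's handler.

```python
import urllib.request
from typing import Callable

# Hypothetical hosts and model names; replace with your own deployment.
RAG_URL = "http://graviton-host:8080/predictions/rag"    # CPU (AWS Graviton)
LLAMA_URL = "http://gpu-host:8080/predictions/llama"     # GPU


def post_text(url: str, text: str, timeout: float = 60.0) -> str:
    """POST a UTF-8 body to an inference endpoint and return the response text.

    Assumes the handler accepts and returns plain text.
    """
    req = urllib.request.Request(url, data=text.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def answer_with_rag(question: str,
                    post: Callable[[str, str], str] = post_text) -> str:
    """Retrieve context from the RAG endpoint, then query the Llama endpoint."""
    # Step 1: the RAG endpoint returns context relevant to the question.
    context = post(RAG_URL, question)
    # Step 2: build an augmented prompt (assumed template) for the LLM.
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    # Step 3: the Llama endpoint generates the final response.
    return post(LLAMA_URL, prompt)
```

Passing the transport in as `post` keeps the orchestration logic independent of where each endpoint runs, so the same function works whether the RAG endpoint is on Graviton or elsewhere.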

Multi-Image Generation Streamlit App: Chaining Llama & Stable Diffusion using TorchServe, torch.compile & OpenVINO

This Multi-Image Generation Streamlit app generates multiple images from a single text prompt. Instead of calling Stable Diffusion directly, the app chains Llama and Stable Diffusion to enhance the image generation process. The use case shows how TorchServe, OpenVINO, torch.compile, Meta-Llama, and Stable Diffusion can be combined in one application.
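The chaining idea can be sketched as follows. This is a hedged illustration rather than the app's code: it assumes the Llama step expands the user's prompt into several image prompts (one per line) and that each of those is then passed to Stable Diffusion. The function name `generate_images`, the instruction wording, and the `llama` and `diffusion` callables are hypothetical stand-ins for calls to the two served models.

```python
from typing import Callable, List


def generate_images(user_prompt: str,
                    num_images: int,
                    llama: Callable[[str], str],
                    diffusion: Callable[[str], bytes]) -> List[bytes]:
    """Chain an LLM and an image model to produce several images.

    `llama` and `diffusion` are hypothetical callables that wrap requests
    to the served Llama and Stable Diffusion models.
    """
    # Ask the LLM for distinct image prompts (assumed instruction format).
    instruction = (
        f"Write {num_images} distinct image prompts, one per line, "
        f"based on: {user_prompt}"
    )
    # Keep non-empty lines only, and at most `num_images` of them.
    lines = [ln.strip() for ln in llama(instruction).splitlines() if ln.strip()]
    prompts = lines[:num_images]
    # Generate one image per refined prompt.
    return [diffusion(p) for p in prompts]
```

Because the image requests are independent of one another, they could also be issued concurrently; the sequential loop here is only for clarity.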
