vLLM – PyTorch

Blog

vLLM and PyTorch Work Together to Improve the Developer Experience on aarch64

TL;DR: PyTorch 2.11 makes it possible to install CUDA-enabled PyTorch wheels on aarch64 Linux directly…

Kaichao You (Inferact) | May 18, 2026

Blog

SMG: The Case for Disaggregating CPU from GPU in LLM Serving

How It Started: Hitting the GIL Wall at Scale. We've been running production model serving…

Simo Lin, Chang Su, and Keyang Ru, members of LightSeek Foundation | April 30, 2026

Blog, Case Studies

IBM Research uses vLLM at the heart of its RITS Platform

TL;DR: vLLM has been critical to democratizing our research community's access to the latest…

PyTorch Foundation | April 24, 2026