Portable Paged Attention in Helion (Blog). Recently, the PyTorch team released Helion, a new domain-specific and PyTorch-based language to make the… Burkhard Ringlein (IBM Research) and the vLLM Team at IBM Research. February 3, 2026
Unlock Reasoning in Llama 3.1-8B via Full Fine-Tuning on NVIDIA DGX Spark (Blog, Community). What is the unsaid joy of local LLMs? The magic of downloading weights, running some… Sanyam Bhutani (PyTorch Meta), Hamid Shojanazeri (PyTorch Meta), Clement Anthonioz Blanc (Meta). February 2, 2026
Accelerating On-Device ML Inference with ExecuTorch and Arm SME2 (Blog). Interactive image segmentation has become a defining mobile experience across the world’s most popular apps.… Jason Zhu, Tyler Mullenbach, Damien Dooley, and Gian Marco Iodice, Arm. January 29, 2026
PyTorch 2.10 Release Blog (Blog). We are excited to announce the release of PyTorch® 2.10 (release notes)! This release features… PyTorch Foundation. January 21, 2026
PyTorch Foundation in 2025: A Year in Review and the Road Ahead (Announcements, Blog). 2025 was a defining year for PyTorch Foundation. In May, we announced our expansion into… PyTorch Foundation. January 15, 2026
Supercharging LLMs: Scalable RL with torchforge and Weaver (Blog). Scaling reinforcement learning (RL) for post-training large language models (LLMs) is notoriously difficult. While running… Stanford: Jon Saad-Falcon, Hangoo Kang, Simon Guo, Aakanksha Chowdhery, Azalia Mirhoseini; Meta: Allen Wang, Danning Xie, Evan Smothers, Felipe Mello, Jack Khuu, Jiyue Wang, Joe Cummings, Lucas Pasqualin, Philip Bontrager, Rithesh Baradi, Vidhya Venkat, Yuxuan Hu, Jafar Taghiyar, Davide Italiano, Gayathri Aiyer, John Myles White, Joe Spisak, Sanyam Bhutani, Hamid Shojanazeri, Matthias Reso, Ali Sol, Hossein Kavianihamedani, Emre Guven; CoreWeave: Deok Filho, Aaron Batilo, Matthew Guan, Xi Lu. January 9, 2026
Warp Specialization in Triton: Design and Roadmap (Blog). The Triton compiler aims to generate performance-portable code and runtime across hardware for AI kernels.… Manman Ren, Nick Riasanovsky, Neil Dhar, Hongtao Yu, Jie Liu, Partha Kanuparthy, Shane Nay. January 8, 2026
PyTorch 2.9: FlexAttention Optimization Practice on Intel GPUs (Blog). Overview: The most recent LLM serving frameworks and models increasingly adopt attention variants, such as… Intel PyTorch and Triton team. January 8, 2026
Deploying Smarter: Hardware-Software Co-design in PyTorch (Blog). If you want powerful on-device AI that doesn’t blow your memory budget or turn your… Kieran Hejmadi, Arm. December 18, 2025
Enabling Cluster Launch Control with TLX (Blog). What is cluster launch control (CLC)? Blackwell brings in cluster launch control (CLC) to enable… Daohang Shi, Hongtao Yu, Manman Ren. December 17, 2025
PyTorch Ecosystem Working Group Update & Project Spotlights, Q4 2025 (Announcements, Blog). As part of this blog series, we share updates on new projects joining the PyTorch… PyTorch Ecosystem Working Group. December 17, 2025
PyTorch Foundation at NeurIPS 2025: PyTorch Community Highlights, Sessions, & Takeaways (Announcements, Blog). NeurIPS 2025 brought together researchers, engineers, maintainers, and ecosystem contributors from across the AI community.… PyTorch Foundation. December 16, 2025
Hybrid Models Meet SGLang: More than Full Attention (Blog, Community). Introduction: Hybrid models that combine the capabilities of full attention layers with alternatives, such as Mamba… SGLang Team. December 3, 2025
Efficient MoE Pre-training at Scale on 1K AMD GPUs with TorchTitan (Blog). Training massive Mixture-of-Experts (MoE) models like DeepSeek-V3 and Llama 4-Scout efficiently is one of the… AMD Contributors: Liz Li, Yanyuan Qin, Yuankai Chen, Xinyu Kang, Xiaobo Chen, Zhen Huang, Shekhar Pandey, Zhenyu Gu, Andy Luo; Meta Contributors: Matthias Reso, Hamid Shojanazeri, Tianyu Liu, Jiani Wang, Howard Huang, Wei Feng; Special Thanks: Guru MP, Yao Fu, Nick Ni, Emad Barsoum, Ramine Roane, and the TensorWave team for providing MI325 cluster. December 1, 2025
The Future of Inference: PyTorch ATX Event (Blog, Community). On September 17, 2025, PyTorch ATX partnered with the vLLM community and Red Hat to… Jason Meaux, ATX PyTorch leader and Stephen Watt, PyTorch Ambassador, Red Hat. November 26, 2025
OpenReg: A Self-Contained PyTorch Accelerator Simulator (Blog). Introduction: The PyTorch community is actively working to build a growing ecosystem of specialized accelerators… Jiahao Chen (Huawei) & Jiawei Li (Huawei) & Zesheng Zong (Huawei). November 21, 2025
PINA Joins the PyTorch Ecosystem: A Unified Framework for Scientific Machine Learning (Announcements, Blog). Scientific Machine Learning (SciML) is reshaping how complex physical and scientific systems are modelled and… Giovanni Canali, Dario Coscia, Nicola Demo, Filippo Olivo; PINA Team. November 18, 2025
Beyond Quantization: Bringing Sparse Inference to PyTorch (Blog, Community). As developers, we all know the story: Large Language Models (LLMs) are revolutionary, but their… Kira Selby & Varun Khare (NimbleEdge). November 13, 2025
Accelerating the Future of Open Source AI: PyTorch Conference 2025 Recap (Announcements, Blog). PyTorch Conference 2025 brought together 3,432 developers, researchers, and innovators from 1,026 organizations across the… PyTorch Foundation. November 12, 2025
KernelFalcon: Autonomous GPU Kernel Generation via Deep Agents (Blog). Summary: We introduce KernelFalcon, a deep agent architecture for generating GPU kernels that combines hierarchical… Laura Wang and the PyTorch Team at Meta. November 5, 2025