FlexAttention Part II: FlexAttention for Inference (Blog). Overview: In PyTorch 2.5.0 release, we introduced FlexAttention torch.nn.attention.flex_attention for ML researchers who’d like to… (Joy Dong, Boyuan Feng, Driss Guessous, Joel Schlosser, Yanbo Liang, Horace He; April 30, 2025)
6x faster Async Checkpointing in PyTorch, using Cached Plans, no GIL contention (Blog). Meta: Less Wright, Meet Vadakkanchery, Saurabh Mishra, Ela Krepska, Hamid Shojanazeri, Pradeep Fernando; Crusoe: Ethan… (Meta and Crusoe; April 30, 2025)
Accelerating Large Scale Training and Convergence with PyTorch Float8 Rowwise on Crusoe 2K H200s (Blog). Meta: Less Wright, Hamid Shojanazeri, Vasiliy Kuznetsov, Daniel Vega-Myhre, Gokul Nadathur, Will Constable, Tianyu Liu,… (Meta and Crusoe; April 28, 2025)
Accelerate PyTorch 2.7 on Intel® GPUs (Blog). PyTorch 2.7 continues to deliver significant functionality and performance enhancements on Intel® GPU architectures to streamline… (Intel PyTorch Team; April 25, 2025)
PyTorch 2.7 Release (Blog). We are excited to announce the release of PyTorch® 2.7 (release notes)! This release features:… (PyTorch Team; April 23, 2025)
Accelerating Whisper on Arm with PyTorch and Hugging Face Transformers (Blog). Automatic speech recognition (ASR) has revolutionized how we interact with technology, clearing the way for… (Pareena Verma, Arm; April 8, 2025)
SGLang Joins PyTorch Ecosystem: Efficient LLM Serving Engine (Blog, Ecosystem). We’re thrilled to announce that the SGLang project has been integrated into the PyTorch ecosystem!… (SGLang Team; March 19, 2025)
PyTorch at GTC 2025 (Blog, Community). GTC is coming back to San Jose on March 17–21, 2025. Join PyTorch Foundation members Arm,… (PyTorch Foundation; March 16, 2025)
Scaling Recommendation Systems Training to Thousands of GPUs with 2D Sparse Parallelism (Blog). At Meta, recommendation systems are the cornerstone of delivering relevant and personalized ads to billions… (PyTorch Team at Meta: Chunzhi Yang, Rich Zhu, Zain Huda, Liangbei Xu, Xin Zhang, Jiyan Yang, Dennis van der Staay, Wang Zhou, Jin Fang, Jade Nie, Yuxi Hu; March 11, 2025)
Powering AI with PyTorch, Fedora, and Open Source Communities (Blog, Community). At DevConf.IN 2025 in Pune, I had the opportunity to host a PyTorch Meetup on February 28th. The session,… (Sudhir Dharanendraiah; March 7, 2025)
Peak Performance, Minimized Memory: Optimizing torchtune’s performance with torch.compile & Liger Kernel (Blog). LinkedIn: Shivam Sahni, Byron Hsu, Yanning Chen; Meta: Ankith Gunapal, Evan Smothers. This blog explores the… (LinkedIn and Meta; March 6, 2025)
Current and New Activation Checkpointing Techniques in PyTorch (Blog). As models scale in depth, batch size, and sequence length, etc., activation memory becomes an… (PyTorch Foundation; March 5, 2025)
Accelerating Generative AI with PyTorch: Segment Anything 2 – Fast and furious inference with low latency and fast cold starts (Blog). This post is a follow-up to our first entry in the multi-series blog focused on how… (PyTorch Foundation; February 26, 2025)
Optimize LLMs for Efficiency & Sustainability (Blog, Community). The rapid growth of large language model (LLM) applications is linked to rapid growth in… (Zach Lasiuk, Arm; February 19, 2025)
Unlocking the Latest Features in PyTorch 2.6 for Intel Platforms (Blog). PyTorch* 2.6 has just been released with a set of exciting new features including torch.compile compatibility… (the Intel PyTorch Team; February 11, 2025)
Enabling advanced GPU features in PyTorch – Warp Specialization (Blog). Meta: Hongtao Yu, Manman Ren, Bert Maher, Shane Nay; NVIDIA: Gustav Zhu, Shuhao Jiang. Over the… (Meta and NVIDIA; February 5, 2025)
PyTorch 2.6 Release Blog (Blog). We are excited to announce the release of PyTorch® 2.6 (release notes)! This release features… (PyTorch Foundation; January 29, 2025)
2025 Priorities for the PyTorch Technical Advisory Council (TAC) (Blog). 2024 has been a year of incredible growth for PyTorch. As that continues in 2025,… (Luca Antiga, PyTorch TAC Chair; January 28, 2025)
Bringing the PyTorch Community Together (Blog, Community). As we step into a new year, it’s a great moment to reflect on the… (Eli Uriegas, Meta and Jennifer Bly, PyTorch Foundation; January 22, 2025)
Accelerating LLM Inference with GemLite, TorchAO and SGLang (Blog). Large Language Models (LLMs) are typically very resource-intensive, requiring significant amounts of memory, compute and… (Teams at PyTorch, Mobius Labs and SGLang; January 21, 2025)