PyTorch Foundation Technical Advisory Council Elects New Leadership | Blog | We are pleased to announce the first-ever Chair and Vice Chair of the PyTorch Foundation’s… | PyTorch Foundation | October 8, 2024
PyTorch Conference 2024 Recap: On Fire 🔥 | Blog | The 2024 PyTorch Conference in San Francisco gathered nearly 1,500 AI researchers, developers, and enthusiasts.… | Jennifer Bly, PyTorch Foundation | October 2, 2024
Using PyTorch for Monocular Depth Estimation Webinar | Case Studies | In this webinar, Bob Chesebrough of Intel guides you through the steps he took to… | PyTorch Foundation | September 27, 2024
PyTorch Native Architecture Optimization: torchao | Announcements | We’re happy to officially launch torchao, a PyTorch native library that makes models faster and… | PyTorch Foundation | September 26, 2024
Challenges and Efforts in PyTorch Multi-Device Integration: Compatibility, Portability, and Integration Efficiencies | Blog | Introduction: As the demand for diverse hardware accelerators grows, the need for a robust and… | Zesheng Zong (Huawei), Jiawei Li (Huawei); Co-authors: Jiong Gong (Intel), Bartosz Sochacki (Intel), Eikan Wang (Intel) | September 18, 2024
Arm Joins the PyTorch Foundation as a Premier Member | Announcements | The PyTorch Foundation, a neutral home for the deep learning community to collaborate on the… | The PyTorch Foundation | September 12, 2024
PyTorch Shanghai Meetup Notes | Announcements, Community | Summary: We are honored to successfully host the PyTorch Shanghai Meetup on August 15, 2024.… | PyTorch Foundation | September 8, 2024
CUDA-Free Inference for LLMs | Blog | In this blog, we discuss the methods we used to achieve FP16 inference with popular… | Adnan Hoque, Less Wright, Raghu Ganti and Mudhakar Srivatsa | September 4, 2024
Accelerate Your AI: PyTorch 2.4 Now Supports Intel GPUs for Faster Workloads | Blog | We have exciting news! PyTorch 2.4 now supports Intel® Data Center GPU Max Series and… | the PyTorch Team at Intel | August 29, 2024
Enabling Fast Gradient Clipping and Ghost Clipping in Opacus | Blog | Introduction and Context: Differentially Private Stochastic Gradient Descent (DP-SGD) is the canonical method for training machine… | Enayat Ullah, Huanyu Zhang, Will Bullock, Ilya Mironov | August 20, 2024
FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention | Blog | In theory, Attention is All You Need. In practice, however, we also need optimized attention… | Team PyTorch: Driss Guessous, Yanbo Liang, Joy Dong, Horace He | August 7, 2024
Quantization-Aware Training for Large Language Models with PyTorch | Blog | In this blog, we present an end-to-end Quantization-Aware Training (QAT) flow for large language models… | Andrew Or, Jerry Zhang, Evan Smothers, Kartikay Khandelwal, Supriya Rao | July 30, 2024
Introducing torchchat: Accelerating Local LLM Inference on Laptop, Desktop and Mobile | Announcements | Today, we’re releasing torchchat, a library showcasing how to seamlessly and performantly run Llama 3, 3.1,… | PyTorch Foundation | July 30, 2024
PyTorch 2.4 Release Blog | Blog | We are excited to announce the release of PyTorch® 2.4 (release note)! PyTorch 2.4 adds… | PyTorch Foundation | July 24, 2024
Deep Dive on the Hopper TMA Unit for FP8 GEMMs | Blog | Abstract: The Hopper (H100) GPU architecture, billed as the “first truly asynchronous GPU”, includes a… | Adnan Hoque, Less Wright, Chih-Chieh Yang | July 22, 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Blog | Attention, as a core layer of the ubiquitous Transformer architecture, is a bottleneck for large… | Jay Shah and Ganesh Bikshandi (Colfax Research); Ying Zhang (Meta); Vijay Thakkar and Pradeep Ramani (NVIDIA); Tri Dao (TogetherAI and Princeton University) | July 11, 2024
Learn how to develop Android applications with ExecuTorch and Llama models | Blog | This blog is courtesy of the PyTorch team at Arm. More details can be found here.… | Arm | July 10, 2024
Accelerated PyTorch inference with torch.compile on AWS Graviton processors | Blog | Summary: Originally, PyTorch used an eager mode where each PyTorch operation that forms the model… | Sunita Nadampalli | July 9, 2024
Announcing Hacker Cup AI Track at NeurIPS 2024 | Announcements, Blog | The PyTorch team, in partnership with Meta Hacker Cup and Microsoft Research, are excited to… | PyTorch Foundation | July 3, 2024
Powering the AI Revolution: The PyTorch Documentary | Announcements | Now live: The official PyTorch Documentary! This film unveils the authentic narrative of PyTorch’s inception, attributing… | The PyTorch Foundation | June 25, 2024