Blog

Blog

PyTorch Foundation Technical Advisory Council Elects New Leadership

We are pleased to announce the first-ever Chair and Vice Chair of the PyTorch Foundation’s…

PyTorch Foundation · October 8, 2024

Blog

PyTorch Conference 2024 Recap: On Fire 🔥

The 2024 PyTorch Conference in San Francisco gathered nearly 1,500 AI researchers, developers, and enthusiasts.…

Jennifer Bly, PyTorch Foundation · October 2, 2024

Case Studies

Using PyTorch for Monocular Depth Estimation Webinar

In this webinar, Bob Chesebrough of Intel guides you through the steps he took to…

PyTorch Foundation · September 27, 2024

Announcements

PyTorch Native Architecture Optimization: torchao

We’re happy to officially launch torchao, a PyTorch native library that makes models faster and…

PyTorch Foundation · September 26, 2024

Blog

Challenges and Efforts in PyTorch Multi-Device Integration: Compatibility, Portability, and Integration Efficiencies

Introduction: As the demand for diverse hardware accelerators grows, the need for a robust and…

Zesheng Zong (Huawei), Jiawei Li (Huawei) | Co-authors: Jiong Gong (Intel), Bartosz Sochacki (Intel), Eikan Wang (Intel) · September 18, 2024

Announcements

Arm Joins the PyTorch Foundation as a Premier Member

The PyTorch Foundation, a neutral home for the deep learning community to collaborate on the…

The PyTorch Foundation · September 12, 2024

Announcements Community

PyTorch Shanghai Meetup Notes

Summary: We were honored to host the PyTorch Shanghai Meetup on August 15, 2024.…

PyTorch Foundation · September 8, 2024

Blog

CUDA-Free Inference for LLMs

In this blog, we discuss the methods we used to achieve FP16 inference with popular…

Adnan Hoque, Less Wright, Raghu Ganti and Mudhakar Srivatsa · September 4, 2024

Blog

Accelerate Your AI: PyTorch 2.4 Now Supports Intel GPUs for Faster Workloads

We have exciting news! PyTorch 2.4 now supports Intel® Data Center GPU Max Series and…

The PyTorch Team at Intel · August 29, 2024

Blog

Enabling Fast Gradient Clipping and Ghost Clipping in Opacus

Introduction and Context: Differentially Private Stochastic Gradient Descent (DP-SGD) is the canonical method for training machine…

Enayat Ullah, Huanyu Zhang, Will Bullock, Ilya Mironov · August 20, 2024

Blog

FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention

In theory, Attention is All You Need. In practice, however, we also need optimized attention…

Team PyTorch: Driss Guessous, Yanbo Liang, Joy Dong, Horace He · August 7, 2024

Blog

Quantization-Aware Training for Large Language Models with PyTorch

In this blog, we present an end-to-end Quantization-Aware Training (QAT) flow for large language models…

Andrew Or, Jerry Zhang, Evan Smothers, Kartikay Khandelwal, Supriya Rao · July 30, 2024

Announcements

Introducing torchchat: Accelerating Local LLM Inference on Laptop, Desktop and Mobile

Today, we’re releasing torchchat, a library showcasing how to seamlessly and performantly run Llama 3, 3.1,…

PyTorch Foundation · July 30, 2024

Blog

PyTorch 2.4 Release Blog

We are excited to announce the release of PyTorch® 2.4 (release note)! PyTorch 2.4 adds…

PyTorch Foundation · July 24, 2024

Blog

Deep Dive on the Hopper TMA Unit for FP8 GEMMs

Abstract: The Hopper (H100) GPU architecture, billed as the “first truly asynchronous GPU”, includes a…

Adnan Hoque, Less Wright, Chih-Chieh Yang · July 22, 2024

Blog

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

Attention, as a core layer of the ubiquitous Transformer architecture, is a bottleneck for large…

Jay Shah and Ganesh Bikshandi (Colfax Research); Ying Zhang (Meta); Vijay Thakkar and Pradeep Ramani (NVIDIA); Tri Dao (TogetherAI and Princeton University) · July 11, 2024

Blog

Learn how to develop Android applications with ExecuTorch and Llama models

This blog is courtesy of the PyTorch team at Arm. More details can be found here.…

Arm · July 10, 2024

Blog

Accelerated PyTorch inference with torch.compile on AWS Graviton processors

Summary: Originally, PyTorch used an eager mode where each PyTorch operation that forms the model…

Sunita Nadampalli · July 9, 2024

Announcements Blog

Announcing Hacker Cup AI Track at NeurIPS 2024

The PyTorch team, in partnership with Meta Hacker Cup and Microsoft Research, is excited to…

PyTorch Foundation · July 3, 2024

Announcements

Powering the AI Revolution: The PyTorch Documentary

Now live: The official PyTorch Documentary! This film unveils the authentic narrative of PyTorch’s inception, attributing…

The PyTorch Foundation · June 25, 2024
