Speeding up ViTs using Block Sparsity (Blog). TLDR: We show promising results of up to a 1.46x speedup with <2% drop in accuracy on float32… FAIR at Meta: Mostafa Elhoushi; Sensors and Systems at Meta Reality Labs Research: Syed Shakib Sarwar, Aaryan Kothapalli, Mia Kasperek, Barbara De Salvo; PyTorch at Meta: Christian Puhrsch, Jesse Cai, Joe Isaacson; Quansight: Andrew James, Pearu Peterson, Nikita Vedeneev. May 14, 2024
Introducing depyf: mastering torch.compile with ease (Community). We are thrilled to introduce depyf, a new project to the PyTorch ecosystem designed to help… Kaichao You. May 11, 2024
Deep Learning Energy Measurement and Optimization (Community). This post is authored by Jae-Won Chung, a PhD student at the University of Michigan and… Jae-Won Chung. May 11, 2024
Enhancing Deep Learning Workflows: PyTorch Ecosystem Tools (Announcements, Community). Welcome to the thriving PyTorch ecosystem, where a wealth of tools and libraries await, purpose-built… PyTorch Foundation. May 11, 2024
A Hitchhiker’s Guide to Speculative Decoding (Blog). Speculative decoding is an optimization technique for inference that makes educated guesses about future tokens… Team PyTorch at IBM. May 2, 2024
Announcing PyTorch Docathon June, 2024 (Announcements). We are thrilled to announce the upcoming PyTorch Docathon in June! The Docathon, akin to… PyTorch Foundation. May 2, 2024
Accelerating Llama3 FP8 Inference with Triton Kernels (Blog). 1.0 Summary: We present an optimized Triton FP8 GEMM (General Matrix-Matrix Multiply) kernel TK-GEMM, which… Adnan Hoque, Less Wright, Chih Chieh Yang. May 1, 2024
ExecuTorch Alpha: Taking LLMs and AI to the Edge with Our Community and Partners (Blog). We are excited to announce the release of ExecuTorch alpha, focused on deploying large language models… PyTorch Foundation. April 30, 2024
PyTorch 2.3 Release Blog (Blog). We are excited to announce the release of PyTorch® 2.3 (release note)! PyTorch 2.3 offers… PyTorch Foundation. April 24, 2024
torchtune: Easily fine-tune LLMs using PyTorch (Announcements). We’re pleased to announce the alpha release of torchtune, a PyTorch-native library for easily fine-tuning… PyTorch Foundation. April 16, 2024
Accelerating MoE model inference with Locality-Aware Kernel Design (Blog). 1.0 Summary: We show that by implementing column-major scheduling to improve data locality, we can… Adnan Hoque, Less Wright, Antoni Virós Martin, Chih-Chieh Yang. April 4, 2024
Maximizing training throughput using PyTorch FSDP (Blog). In this blog, we demonstrate the scalability of FSDP with a pre-training exemplar, a 7B… Team PyTorch at IBM and Team PyTorch at Meta. March 13, 2024
Exploring scientific machine learning pipelines through the SimulAI toolkit (Community). SciML, short for Scientific Machine Learning, encompasses work that merges quantitative sciences with machine learning.… Joao Lucas de Sousa Almeida. February 15, 2024
PyTorch 2 paper and tutorial @ ASPLOS 2024 (Announcements). The PyTorch team is excited to share that our paper on PyTorch 2 has been… PyTorch Foundation. February 6, 2024
What’s New in PyTorch Documentation (Announcements). Greetings to the PyTorch community! Here is a quick update on PyTorch docs. In November… PyTorch Foundation. February 1, 2024
PyTorch 2.2: FlashAttention-v2 integration, AOTInductor (Announcements). We are excited to announce the release of PyTorch® 2.2 (release note)! PyTorch 2.2 offers… PyTorch Foundation. January 30, 2024
New Library Updates in PyTorch 2.2 (Announcements). Summary: We are bringing a number of improvements to the current PyTorch libraries, alongside the… PyTorch Foundation. January 30, 2024
Colossal-LLaMA-2: Low Cost and High-quality Domain-specific LLM Solution Using LLaMA and Colossal-AI (Community). The most prominent distinction between LLaMA-1 and LLaMA-2 lies in the incorporation of higher-quality corpora,… Yang You. January 29, 2024
3D rotations and spatial transformations made easy with RoMa (Community). Struggling with quaternions, rotation vectors, right-hand rules and all that stuff? Try RoMa: an easy-to-use,… Romain Brégier. January 25, 2024
Accelerating Generative AI with PyTorch IV: Seamless M4T, fast (Blog). This post is the fourth part of a multi-series blog focused on how to accelerate… Yejin Lee, Carole-Jean Wu, Christian Puhrsch, Joel Schlosser, Driss Guessous, Jeffrey Wan, Joe Isaacson, Can Balioglu, Juan Pino. January 23, 2024