Accelerating Generative AI Part III: Diffusion, Fast (Blog). This post is the third part of a multi-series blog focused on how to accelerate… Sayak Paul and Patrick von Platen (Hugging Face 🤗), January 3, 2024
Understanding GPU Memory 2: Finding and Removing Reference Cycles (Blog). This is part 2 of the Understanding GPU Memory blog series. Our first post Understanding GPU… Aaron Shi, Zachary DeVito, December 19, 2023
Training Production AI Models with PyTorch 2.0 (Blog). 1. Introduction: PyTorch 2.0 (abbreviated as PT2) can significantly improve the training and inference performance of… CK Luk, Daohang Shi, Yuzhen Huang, Jackie (Jiaqi) Xu, Jade Nie, Zhou Wang, Lu Fang, Flavio Sales Truzzi, Devashish Shankar, Dima Ivashchenko, Chunzhi Yang, Nicolas Macchioni, David Berard, Yu Guo, Xiaodong Wang, Bert Maher, Yanbo Liang, Edward Yang, Brian Hirsh, Michael Voznesensky, Animesh Jain, Michael Anderson, December 18, 2023
Empowering Models with Performance: The Art of Generalized Model Transformation Approach (Blog). Introduction: PyTorch 2.0 (PT2) offers a compiled execution mode which rewrites Python bytecode to extract sequences… Jackie (Jiaqi) Xu, Yanbo Liang, Jason Ansel, Chunzhi Yang, Jade Nie, Yuzhen Huang, CK Luk, Xiaodong Wang, Lu Fang, Menglu Yu, Jinwon Lee, Daohang Shi, Flavio Sales Truzzi, December 15, 2023
Understanding GPU Memory 1: Visualizing All Allocations over Time (Blog). During your time with PyTorch on GPUs, you may be familiar with this common error… Aaron Shi, Zachary DeVito, December 14, 2023
From PyTorch Conference 2023: From Dinosaurs to Seismic Imaging with Intel (Blog). Lightning Talk 1: Seismic Data to Subsurface Models with OpenFWI. Speaker: Benjamin Consolvo, AI Software… Ramya Ravi, Susan Kahler at Intel, December 12, 2023
PyPose: A Library for Robot Learning with Physics-based Optimization (Community). We are excited to share our new open-source library PyPose. It is a PyTorch-based robotics-oriented… PyPose, December 6, 2023
Accelerating Generative AI with PyTorch II: GPT, Fast (Blog). This post is the second part of a multi-series blog focused on how to accelerate… PyTorch Foundation, November 30, 2023
PyTorch 2.1 Contains New Performance Features for AI Developers (Blog). We are excited to see the release of PyTorch 2.1. In this blog, we discuss… Intel, November 29, 2023
Accelerating Generative AI with PyTorch: Segment Anything, Fast (Blog). This post is the first part of a multi-series blog focused on how to accelerate… PyTorch Foundation, November 16, 2023
How Activation Checkpointing enables scaling up training deep learning models (Community). Activation checkpointing is a technique used for reducing the memory footprint at the cost of… PyTorch Foundation, November 9, 2023
PyTorch compile to speed up inference on Llama 2 (Blog). In this blog, we discuss how to improve the inference latencies of the Llama 2… IBM Research: Antoni Viros i Martin, Brian Vaughan, Davis Wertheimer, Joshua Rosenkranz, Mudhakar Srivatsa, Nelson Mimura Gonzalez, Raghu Ganti, Supriyo Chakraborty, Zhuoran Liu; Meta: Geeta Chauhan, Hamid Shojanazeri, November 7, 2023
High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs (Blog). In a landscape where AI innovation is accelerating at an unprecedented pace, Meta’s Llama family of open… Jiewen Tan, Jon Bolin, Yeounoh Chung, Liyang Lu, Siyuan Liu, Wonjoo Lee, Manfei Bai, Meghan Cowan, Jack Cao, Milad Mohammadi, Shauheen Zahirazami, Alex Spiridonov, November 6, 2023
Accelerating Inference on x86-64 Machines with oneDNN Graph (Blog). Supported in PyTorch 2.0 as a beta feature, oneDNN Graph leverages aggressive fusion patterns to… Intel, November 2, 2023
AMD Extends Support for PyTorch Machine Learning Development on Select RDNA™ 3 GPUs with ROCm™ 5.7 (Blog). Researchers and developers working with Machine Learning (ML) models and algorithms using PyTorch can now… AMD, October 31, 2023
torch.compile, explained (Community). Have you ever felt overwhelmed by the complexities of torch.compile? Diving into its workings can… Kaichao You, October 26, 2023
Compiling NumPy code into C++ or CUDA via torch.compile (Blog). Quansight engineers have implemented support for tracing through NumPy code via torch.compile in PyTorch 2.1. This feature… Evgeni Burovski, Ralf Gommers and Mario Lezcano, October 17, 2023
Flash-Decoding for long-context inference (Blog). Motivation: Large language models (LLM) such as ChatGPT or Llama have received unprecedented attention lately.… Tri Dao, Daniel Haziza, Francisco Massa, Grigory Sizov, October 13, 2023
ML Model Server Resource Saving – Transition From High-Cost GPUs to Intel CPUs and oneAPI powered Software with performance (Blog). Reviewers: Yunsang Ju (Naver GplaceAI Leader), Min Jean Cho (Intel), Jing Xu (Intel), Mark Saroufim (Meta). Intro: Here, we will… Sangjune Park (Naver GplaceAI MLOps), Jooyoung Lee (Naver GplaceAI MLE), Junho Min (Naver GplaceAI MLE), October 11, 2023
Real-time Audio-visual Speech Recognition (Blog). Audio-Visual Speech Recognition (AV-ASR, or AVSR) is the task of transcribing text from audio and… PyTorch Foundation, October 10, 2023