PyTorch 2.3 Release Blog (Blog). We are excited to announce the release of PyTorch® 2.3 (release note)! PyTorch 2.3 offers… PyTorch Foundation, April 24, 2024
Accelerating MoE model inference with Locality-Aware Kernel Design (Blog). 1.0 Summary: We show that by implementing column-major scheduling to improve data locality, we can… Adnan Hoque, Less Wright, Antoni Virós Martin, Chih-Chieh Yang, April 4, 2024
Maximizing training throughput using PyTorch FSDP (Blog). In this blog, we demonstrate the scalability of FSDP with a pre-training exemplar, a 7B… Team PyTorch at IBM and Team PyTorch at Meta, March 13, 2024
Exploring scientific machine learning pipelines through the SimulAI toolkit (Community). SciML, short for Scientific Machine Learning, encompasses work that merges quantitative sciences with machine learning.… Joao Lucas de Sousa Almeida, February 15, 2024
Colossal-LLaMA-2: Low Cost and High-quality Domain-specific LLM Solution Using LLaMA and Colossal-AI (Community). The most prominent distinction between LLaMA-1 and LLaMA-2 lies in the incorporation of higher-quality corpora,… Yang You, January 29, 2024
3D rotations and spatial transformations made easy with RoMa (Community). Struggling with quaternions, rotation vectors, right-hand rules and all that stuff? Try RoMa: an easy-to-use,… Romain Brégier, January 25, 2024
Accelerating Generative AI with PyTorch IV: Seamless M4T, fast (Blog). This post is the fourth part of a multi-series blog focused on how to accelerate… Yejin Lee, Carole-Jean Wu, Christian Puhrsch, Joel Schlosser, Driss Guessous, Jeffrey Wan, Joe Isaacson, Can Balioglu, Juan Pino, January 23, 2024
Accelerate PyTorch Models Using Quantization Techniques with Intel Extension for PyTorch (Blog). Overview: PyTorch is a Python-based framework for developing deep learning models. It is one of… Intel, January 18, 2024
Accelerating Triton Dequantization Kernels for GPTQ (Blog). TL;DR: Leveraging a first principles approach, we showcase a step by step process undertaken to… Less Wright, Adnan Hoque (IBM), January 16, 2024
Finetune LLMs on your own consumer hardware using tools from PyTorch and Hugging Face ecosystem (Blog). We demonstrate how to finetune a 7B parameter model on a typical consumer GPU (NVIDIA… Younes Belkada, Marc Sun, Titus von Köller, Sourab Mangrulkar, Benjamin Bossan, Lysandre Debut, Steven Liu, January 10, 2024
Accelerate AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe, saving up to 75% on inference costs (Blog). Multi-model endpoints (MMEs) are a powerful feature of Amazon SageMaker designed to simplify the deployment and operation… James Wu, Ankith Gunapal, Li Ning, Subhash Talluri, and Saurabh Trikande, January 9, 2024
torchdistill — a modular, configuration-driven framework for reproducible deep learning and knowledge distillation experiments (Community). This article summarizes key features and concepts of torchdistill (v1.0.0). Refer to the official documentation… Yoshitomo Matsubara, January 4, 2024
Accelerating Generative AI Part III: Diffusion, Fast (Blog). This post is the third part of a multi-series blog focused on how to accelerate… Sayak Paul and Patrick von Platen (Hugging Face 🤗), January 3, 2024
Understanding GPU Memory 2: Finding and Removing Reference Cycles (Blog). This is part 2 of the Understanding GPU Memory blog series. Our first post Understanding GPU… Aaron Shi, Zachary DeVito, December 19, 2023
Training Production AI Models with PyTorch 2.0 (Blog). 1. Introduction: PyTorch 2.0 (abbreviated as PT2) can significantly improve the training and inference performance of… CK Luk, Daohang Shi, Yuzhen Huang, Jackie (Jiaqi) Xu, Jade Nie, Zhou Wang, Lu Fang, Flavio Sales Truzzi, Devashish Shankar, Dima Ivashchenko, Chunzhi Yang, Nicolas Macchioni, David Berard, Yu Guo, Xiaodong Wang, Bert Maher, Yanbo Liang, Edward Yang, Brian Hirsh, Michael Voznesensky, Animesh Jain, Michael Anderson, December 18, 2023
Empowering Models with Performance: The Art of Generalized Model Transformation Approach (Blog). Introduction: PyTorch 2.0 (PT2) offers a compiled execution mode which rewrites Python bytecode to extract sequences… Jackie (Jiaqi) Xu, Yanbo Liang, Jason Ansel, Chunzhi Yang, Jade Nie, Yuzhen Huang, CK Luk, Xiaodong Wang, Lu Fang, Menglu Yu, Jinwon Lee, Daohang Shi, Flavio Sales Truzzi, December 15, 2023
Understanding GPU Memory 1: Visualizing All Allocations over Time (Blog). During your time with PyTorch on GPUs, you may be familiar with this common error… Aaron Shi, Zachary DeVito, December 14, 2023
From PyTorch Conference 2023: From Dinosaurs to Seismic Imaging with Intel (Blog). Lightning Talk 1: Seismic Data to Subsurface Models with OpenFWI. Speaker: Benjamin Consolvo, AI Software… Ramya Ravi, Susan Kahler at Intel, December 12, 2023
PyPose: A Library for Robot Learning with Physics-based Optimization (Community). We are excited to share our new open-source library PyPose. It is a PyTorch-based robotics-oriented… PyPose, December 6, 2023
Accelerating Generative AI with PyTorch II: GPT, Fast (Blog). This post is the second part of a multi-series blog focused on how to accelerate… PyTorch Foundation, November 30, 2023