Accelerating Generative AI with PyTorch II: GPT, Fast (Blog). This post is the second part of a multi-series blog focused on how to accelerate… PyTorch Foundation. November 30, 2023
PyTorch 2.1 Contains New Performance Features for AI Developers (Blog). We are excited to see the release of PyTorch 2.1. In this blog, we discuss… Intel. November 29, 2023
Accelerating Generative AI with PyTorch: Segment Anything, Fast (Blog). This post is the first part of a multi-series blog focused on how to accelerate… PyTorch Foundation. November 16, 2023
How Activation Checkpointing enables scaling up training deep learning models (Community). Activation checkpointing is a technique used for reducing the memory footprint at the cost of… PyTorch Foundation. November 9, 2023
PyTorch compile to speed up inference on Llama 2 (Blog). In this blog, we discuss how to improve the inference latencies of the Llama 2… IBM Research: Antoni Viros i Martin, Brian Vaughan, Davis Wertheimer, Joshua Rosenkranz, Mudhakar Srivatsa, Nelson Mimura Gonzalez, Raghu Ganti, Supriyo Chakraborty, Zhuoran Liu; Meta: Geeta Chauhan, Hamid Shojanazeri. November 7, 2023
High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs (Blog). In a landscape where AI innovation is accelerating at an unprecedented pace, Meta’s Llama family of open… Jiewen Tan, Jon Bolin, Yeounoh Chung, Liyang Lu, Siyuan Liu, Wonjoo Lee, Manfei Bai, Meghan Cowan, Jack Cao, Milad Mohammadi, Shauheen Zahirazami, Alex Spiridonov. November 6, 2023
Accelerating Inference on x86-64 Machines with oneDNN Graph (Blog). Supported in PyTorch 2.0 as a beta feature, oneDNN Graph leverages aggressive fusion patterns to… Intel. November 2, 2023
AMD Extends Support for PyTorch Machine Learning Development on Select RDNA™ 3 GPUs with ROCm™ 5.7 (Blog). Researchers and developers working with Machine Learning (ML) models and algorithms using PyTorch can now… AMD. October 31, 2023
torch.compile, explained (Community). Have you ever felt overwhelmed by the complexities of torch.compile? Diving into its workings can… Kaichao You. October 26, 2023
Compiling NumPy code into C++ or CUDA via torch.compile (Blog). Quansight engineers have implemented support for tracing through NumPy code via torch.compile in PyTorch 2.1. This feature… Evgeni Burovski, Ralf Gommers and Mario Lezcano. October 17, 2023
Flash-Decoding for long-context inference (Blog). Motivation: Large language models (LLMs) such as ChatGPT or Llama have received unprecedented attention lately.… Tri Dao, Daniel Haziza, Francisco Massa, Grigory Sizov. October 13, 2023
ML Model Server Resource Saving – Transition From High-Cost GPUs to Intel CPUs and oneAPI powered Software with performance (Blog). Reviewers: Yunsang Ju (Naver GplaceAI Leader), Min Jean Cho (Intel), Jing Xu (Intel), Mark Saroufim (Meta). Intro: Here, we will… Sangjune Park (Naver GplaceAI MLOps), Jooyoung Lee (Naver GplaceAI MLE), Junho Min (Naver GplaceAI MLE). October 11, 2023
Real-time Audio-visual Speech Recognition (Blog). Audio-Visual Speech Recognition (AV-ASR, or AVSR) is the task of transcribing text from audio and… PyTorch Foundation. October 10, 2023
PyTorch 2.1: automatic dynamic shape compilation, distributed checkpointing (Blog). We are excited to announce the release of PyTorch® 2.1 (release note)! PyTorch 2.1 offers… PyTorch Foundation. October 4, 2023
New Library Updates in PyTorch 2.1 (Blog). Summary: We are bringing a number of improvements to the current PyTorch libraries, alongside the… PyTorch Foundation. October 4, 2023
High performance Llama 2 deployments with AWS Inferentia2 using TorchServe (Blog). Recently, Llama 2 was released and has attracted a lot of interest from the machine learning community. Amazon… Mike Zhang, Li Ning, Sergey Ivanov, Naman Nandan, Hamid Shojanazeri, Geeta Chauhan, Abhi Shivaditya, Michael Nguyen, Pinak Panigrahi. October 4, 2023
How to Build an Interactive Chat-Generation Model using DialoGPT and PyTorch (Blog). The focus on interactive chat-generation (or conversational response-generation) models has greatly increased in the past… Intel. October 3, 2023
Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond (Blog). Use 3D to visualize matrix multiplication expressions, attention heads with real weights, and more. Matrix… Basil Hosmer. September 25, 2023
Accelerated CPU Inference with PyTorch Inductor using torch.compile (Blog). Story at a Glance: Although the PyTorch* Inductor C++/OpenMP* backend has enabled users to take… Intel. September 13, 2023
Automated trace collection and analysis (Blog). In this blog, we share how we enabled the collection and analysis of PyTorch Profiler… Anupam Bhatnagar, Brian Coutinho. September 5, 2023