PyTorch 2.1: automatic dynamic shape compilation, distributed checkpointing
Blog | PyTorch Foundation | October 4, 2023
We are excited to announce the release of PyTorch® 2.1 (release note)! PyTorch 2.1 offers…

New Library Updates in PyTorch 2.1
Blog | PyTorch Foundation | October 4, 2023
Summary: We are bringing a number of improvements to the current PyTorch libraries, alongside the…

High performance Llama 2 deployments with AWS Inferentia2 using TorchServe
Blog | Mike Zhang, Li Ning, Sergey Ivanov, Naman Nandan, Hamid Shojanazeri, Geeta Chauhan, Abhi Shivaditya, Michael Nguyen, Pinak Panigrahi | October 4, 2023
Recently, Llama 2 was released and has attracted a lot of interest from the machine learning community. Amazon…

How to Build an Interactive Chat-Generation Model using DialoGPT and PyTorch
Blog | Intel | October 3, 2023
The focus on interactive chat-generation (or conversational response-generation) models has greatly increased in the past…

Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond
Blog | Basil Hosmer | September 25, 2023
Use 3D to visualize matrix multiplication expressions, attention heads with real weights, and more. Matrix…

Accelerated CPU Inference with PyTorch Inductor using torch.compile
Blog | Intel | September 13, 2023
Story at a Glance: Although the PyTorch* Inductor C++/OpenMP* backend has enabled users to take…

Automated trace collection and analysis
Blog | Anupam Bhatnagar, Brian Coutinho | September 5, 2023
In this blog, we share how we enabled the collection and analysis of PyTorch Profiler…

PyTorch/XLA SPMD: Scale Up Model Training and Serving with Automatic Parallelization
Blog | Yeounoh Chung, Jon Bolin, Milad Mohammadi, Jiewen Tan, Jack Cao, Joe Spisak, Alex Spiridonov, Shauheen Zahirazami, Steven Krawczyk, Wonjoo Lee, Mohit Khatwani, Wanchao Liang, Vaibhav Singh | August 31, 2023
Today, we are delighted to announce PyTorch/XLA SPMD: the integration of GSPMD into PyTorch with an easy…

Large Scale Training of Hugging Face Transformers on TPUs With PyTorch/XLA FSDP
Blog | Alex Wertheim, Milad Mohammadi, Jack Cao, Alex Spiridonov, Joe Spisak, Lysandre Debut, Sylvain Gugger, Sourab Mangrulkar | August 24, 2023
AI is transforming many industries through advanced capabilities such as understanding and generating language, answering…

INT8 Quantization for x86 CPU in PyTorch
Blog | Intel | August 7, 2023
Overview: INT8 quantization is a powerful technique for speeding up deep learning inference on x86…

Announcing CPP-based S3 IO DataPipes
Blog | John He, Khaled ElGalaind, Roshani Nagmote, Daiming Yang | July 25, 2023
Training large deep learning models requires large datasets. Amazon Simple Storage Service (Amazon S3) is a scalable…

How to Accelerate PyTorch Geometric on Intel® CPUs
Blog | Intel | July 10, 2023
Overview: The Intel PyTorch team has been collaborating with the PyTorch Geometric (PyG) community to…

Unveiling the Power of Semi-Supervised Learning: The Unified Semi-Supervised Learning Benchmark
Community | Jindong Wang | July 6, 2023
Machine Learning models thrive on high-quality, fully-annotated data. The traditional supervised learning approach typically requires…

Optimizing LibTorch-based inference engine memory usage and thread-pooling
Blog | Himalay Mohanlal Joriwal, Pierre-Yves Aquilanti, Vivek Govindan, Hamid Shojanazeri, Ankith Gunapal, Tristan Rice | June 29, 2023
Outline: In this blog post we show how to optimize LibTorch-based inference engine to maximize…

Introducing TorchOpt: A High-Performance Differentiable Optimization Library for PyTorch
Community | Benjamin Liu | June 29, 2023
Explore TorchOpt, a PyTorch-based library that revolutionizes differentiable optimization with its unified programming abstraction, high-performance…

The Path to Achieve Ultra-Low Inference Latency With LLaMA 65B on PyTorch/XLA
Blog | Milad Mohammadi, Jiewen Tan, Liyang Lu, Siyuan Liu, Yeounoh Chung, Wonjoo Lee, Manfei Bai, Steven Krawczyk, Shauheen Zahirazami, Alex Wertheim, Meghan Cowan, Jack Cao, Joe Spisak | June 28, 2023
Background & State of the Art: In the natural language processing (NLP) space, language models…

Optimized PyTorch 2.0 Inference with AWS Graviton processors
Blog | Sunita Nadampalli from AWS & Ankith Gunapal from Meta | June 22, 2023
New generations of CPUs offer significant performance improvement in machine learning (ML) inference due to…

🎉 PyTorch Docathon H1 2023 Wrap-up 🎉
Blog | PyTorch Foundation | June 16, 2023
Thank you to all who participated in our first ever PyTorch Docathon, the results have…

Out of the box acceleration and memory savings of 🤗 decoder models with PyTorch 2.0
Blog | Felix Marty, Younes Belkada, Hamid Shojanazeri, Driss Guessous | May 22, 2023
As part of PyTorch 2.0 release, an accelerated implementation of the attention mechanism as part…

Language Identification: Building an End-to-End AI Solution using PyTorch
Blog | Intel | May 12, 2023
Language Identification is the process of identifying the primary language from multiple audio input samples.…