August 18, 2021

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models

This blog post describes PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models (Transformers such as BERT [2] and ViT [3]), published at ICML 2021. It is the first peer-reviewed research paper to explore accelerating the hybrid of PyTorch DDP (torch.nn.parallel.DistributedDataParallel) [1] and Pipeline (torch.distributed.pipeline).

August 03, 2021

What’s New in PyTorch Profiler 1.9?

PyTorch Profiler v1.9 has been released! The goal of this new release is to provide you with new state-of-the-art tools to help diagnose and fix machine learning performance issues, whether you are working on one machine or many. The objective is to target the execution steps that are the most costly in time and/or memory, and to visualize the workload distribution between GPUs and CPUs.

June 27, 2021

Everything You Need To Know About Torchvision’s SSDlite Implementation

In the previous article, we discussed how the SSD algorithm works, covered its implementation details, and presented its training process. If you have not read that post, I encourage you to check it out before continuing.

June 23, 2021

The torch.linalg module: Accelerated Linear Algebra with Autograd in PyTorch

Linear algebra is essential to deep learning and scientific computing, and it has always been a core part of PyTorch. PyTorch 1.9 extends PyTorch’s support for linear algebra operations with the torch.linalg module. This module, documented here, has 26 operators: faster and easier-to-use versions of older PyTorch operators, every function from NumPy’s linear algebra module extended with accelerator and autograd support, and a few operators that are completely new. This makes the torch...

June 18, 2021

An Overview of the PyTorch Mobile Demo Apps

PyTorch Mobile provides a runtime environment to execute state-of-the-art machine learning models on mobile devices. Latency is reduced, privacy preserved, and models can run on mobile devices anytime, anywhere.

June 16, 2021

Everything You Need To Know About Torchvision’s SSD Implementation

In TorchVision v0.10, we released two new object detection models based on the SSD architecture. This two-part article covers the key implementation details of the algorithms, along with information on how the models were trained.

June 15, 2021

PyTorch 1.9 Release, including torch.linalg and Mobile Interpreter

We are excited to announce the release of PyTorch 1.9. The release is composed of more than 3,400 commits since 1.8, made by 398 contributors. The release notes are available here. Highlights include: major improvements to support scientific computing, including torch.linalg, torch.special, and Complex Autograd; major improvements in on-device binary size with Mobile Interpreter; native support for elastic-fault tolerance training through the upstreaming of TorchElastic into PyTorch Core...
