February 01, 2024
What's New in PyTorch Documentation
Greetings to the PyTorch community! Here is a quick update on PyTorch docs.
January 30, 2024
PyTorch 2.2: FlashAttention-v2 integration, AOTInductor
We are excited to announce the release of PyTorch® 2.2 (release note)! PyTorch 2.2 offers ~2x performance improvements to scaled_dot_product_attention via FlashAttention-v2 integration, as well as AOTInductor, a new ahead-of-time compilation and deployment tool built for non-Python server-side deployments.
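As an illustrative sketch (not code from the release note), the FlashAttention-v2 path is reached through the standard torch.nn.functional.scaled_dot_product_attention API; the shapes, dtype, and device below are assumptions for the example:

```python
import torch
import torch.nn.functional as F

# Sketch: scaled_dot_product_attention picks the fastest available backend
# (including the fused FlashAttention-v2 kernel on supported GPUs) when the
# inputs are half-precision CUDA tensors. Sizes here are arbitrary examples.
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Causal self-attention; no manual kernel selection is required.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 1024, 64])
```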
January 23, 2024
Accelerating Generative AI with PyTorch IV: Seamless M4T, fast
This post is the fourth part of a multi-part blog series focused on how to accelerate generative AI models with pure, native PyTorch. To skip to the code, check out our GitHub repositories (seamless_communication, fairseq2). We are excited to share a breadth of newly released PyTorch performance features alongside practical examples to see how far we can push PyTorch native performance. In part one, we showed how to accelerate Segment Anything over 8x using only pure, native PyTorch. In part two, we showed how...
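The techniques highlighted across this series center on native PyTorch features such as torch.compile. A minimal, generic sketch of that workflow is below; the model and shapes are placeholders, not the Seamless M4T code from the post:

```python
import torch

# Placeholder model standing in for the real architecture; torch.compile
# traces and fuses the forward pass into optimized kernels on first call.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 2048),
    torch.nn.GELU(),
    torch.nn.Linear(2048, 512),
).eval()

compiled = torch.compile(model)

x = torch.randn(16, 512)
with torch.no_grad():
    y = compiled(x)  # later calls reuse the compiled graph
print(y.shape)  # torch.Size([16, 512])
```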
January 18, 2024
Accelerate PyTorch Models Using Quantization Techniques with Intel Extension for PyTorch
Overview
January 10, 2024
Finetune LLMs on your own consumer hardware using tools from PyTorch and Hugging Face ecosystem
We demonstrate how to finetune a 7B parameter model on a typical consumer GPU (NVIDIA T4, 16 GB) with LoRA and tools from the PyTorch and Hugging Face ecosystem, with a complete, reproducible Google Colab notebook.
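As a hedged sketch of the general recipe (the checkpoint name, LoRA rank, and target modules below are example placeholders, not necessarily the settings used in the post), the Hugging Face peft and transformers libraries attach small trainable adapters to a quantized base model so it fits in 16 GB:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder 7B checkpoint

# Load the base model in 4-bit to keep it within a 16 GB GPU.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             quantization_config=bnb_config,
                                             device_map="auto")

# LoRA: train only low-rank adapter matrices instead of the full 7B weights.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction is trainable
```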