March 11, 2025
Scaling Recommendation Systems Training to Thousands of GPUs with 2D Sparse Parallelism
At Meta, recommendation systems are the cornerstone of delivering relevant and personalized ads to billions of users globally. Through technologies like PyTorch’s TorchRec, we’ve successfully developed solutions that enable model training across hundreds of GPUs. While these systems have served us well, recent research on scaling laws has revealed a compelling opportunity: we can achieve significantly better model performance by training dramatically larger neural networks.
March 06, 2025
Peak Performance, Minimized Memory: Optimizing torchtune’s performance with torch.compile & Liger Kernel
LinkedIn: Shivam Sahni, Byron Hsu, Yanning Chen; Meta: Ankith Gunapal, Evan Smothers
March 05, 2025
Current and New Activation Checkpointing Techniques in PyTorch
As models scale in depth, batch size, and sequence length, activation memory becomes an increasingly significant contributor to overall memory usage. To help address this, PyTorch provides utilities for activation checkpointing, which reduce the number of saved tensors by recomputing them when needed, trading additional compute for lower memory usage.
March 04, 2025
📣 Submit to Speak at PyTorch Conference + Save on Registration
Step into the Future of AI at PyTorch Conference 2025.
February 26, 2025
Accelerating Generative AI with PyTorch: Segment Anything 2 - Fast and furious inference with low latency and fast cold starts
This post is a follow-up to our first entry in the multi-series blog on accelerating generative AI models with pure, native PyTorch, with a focus on latency and elastic scalability. We use torch.compile and torch.export to create highly optimized, low-latency versions of SAM2 that can be quickly scaled up on new instances.
February 11, 2025
Unlocking the Latest Features in PyTorch 2.6 for Intel Platforms
PyTorch 2.6 has just been released with a set of exciting new features, including torch.compile compatibility with Python 3.13, new security and performance enhancements, and a change in the default value of the weights_only parameter for torch.load. PyTorch also announced the deprecation of its official Anaconda channel.
February 05, 2025
Enabling advanced GPU features in PyTorch - Warp Specialization
Meta: Hongtao Yu, Manman Ren, Bert Maher, Shane Nay; NVIDIA: Gustav Zhu, Shuhao Jiang