DiPaCo: Towards a New Paradigm of Distributed AI Training by Google DeepMind



Modern AI relies on large-scale models with up to trillions of parameters. This raises the question of how to make training and deploying such systems more efficient, performant, and cost-effective.

Researchers from Google DeepMind propose a new modular paradigm for distributed AI training called Distributed Paths Composition (DiPaCo). DiPaCo is an architecture and training algorithm that aims to enable better scaling.

The high-level idea is to distribute computation by path. In this context, a ‘path’ refers to a sequence of modules that define an input-output function. Paths are small relative to the entire model and require only a handful of tightly connected devices to train or evaluate.
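To make the notion of a path concrete, here is a minimal Python sketch. It is an illustration only, not DeepMind's implementation: the two-level module pool, the module names (`a0`, `a1`, `b0`, `b1`), and the use of simple scalar functions in place of neural network blocks are all assumptions chosen for readability.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Module = Callable[[float], float]

# Hypothetical module pool (assumption): two levels, two modules per level.
# In a real system each module would be a block of neural network layers.
LEVELS: List[Dict[str, Module]] = [
    {"a0": lambda x: x + 1.0, "a1": lambda x: x * 2.0},
    {"b0": lambda x: x - 3.0, "b1": lambda x: x * x},
]


def make_path(choice: Tuple[str, ...]) -> Module:
    """Compose one module per level into a single input-output function."""
    modules = [level[name] for level, name in zip(LEVELS, choice)]

    def path(x: float) -> float:
        for module in modules:
            x = module(x)
        return x

    return path


# Every combination of one module per level defines a distinct path.
PATHS: Dict[Tuple[str, ...], Module] = {
    choice: make_path(choice)
    for choice in product(*(level.keys() for level in LEVELS))
}

TOTAL_MODULES = sum(len(level) for level in LEVELS)
MODULES_PER_PATH = len(LEVELS)
```

In this toy setup the pool holds four modules, yet each path touches only two of them, and a module such as `a0` is shared by several paths. That is the sense in which a path is small relative to the entire model.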

In this lecture, Arthur Douillard, Senior Research Scientist at Google DeepMind, walks the @BuzzRobot community through the technical details of DiPaCo and how it could change the way large-scale AI systems are trained.

Timestamps:
0:00 Introduction
0:45 Google DeepMind’s vision of distributed training
2:57 Building blocks towards DiPaCo
11:33 Building blocks towards DiPaCo: DiLoCo – a low communication distributed training optimization
18:41 Introducing DiPaCo: Distributed Paths Composition
22:37 Q&A session

Social Links:
Newsletter: https://buzzrobot.substack.com/
X: https://x.com/sopharicks
Slack: https://buzzrobot.slack.com/join/shared_invite/zt-1zsh7k8pd-iMu_M8bUxIK3pOJgqJgCRQ#/shared-invite/email

Author

MQ
