Saar Eliad, M.Sc. seminar lecture
Thursday, 25.2.2021, 11:00
For the lecture password, please contact: email@example.com
Advisor: Prof. Assaf Schuster
We worked on a particular case of deep learning in which the model is too large to fit into the memory of a single commodity GPU during training. This situation arises in fine-tuning, an increasingly common technique that leverages transfer learning to dramatically expedite the training of huge, high-quality models. Critically, fine-tuning has the potential to make giant state-of-the-art models, pre-trained on high-end supercomputing-grade systems, readily available to users who lack access to such costly resources.
In this seminar, we will present FTPipe, a system that explores a new dimension of pipeline model parallelism, making multi-GPU execution of fine-tuning tasks for giant neural networks readily accessible. Our system goes beyond the topology limitations of previous pipeline-parallel approaches and efficiently trains a new family of models, including the current state of the art. FTPipe achieves up to 3x speedup and state-of-the-art accuracy when fine-tuning giant transformers with billions of parameters.