Understanding the massive undertaking of pre-training a large language model.
The self-supervised tasks used to train LLMs from scratch.
The process of assembling and cleaning massive text corpora.
The optimizers used to train models with billions of parameters.
High-level overview of how training is scaled across many GPUs.
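The self-supervised task mentioned above is, for most LLMs, next-token prediction: the model is trained to minimize the cross-entropy of each token given the tokens before it. A minimal sketch of that loss, with a toy vocabulary and hand-picked logits standing in for a real model (all names and values here are illustrative, not from the text):

```python
import math

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting token t+1 from position t.

    logits: per-position lists, where logits[t][v] scores vocab item v
            as the *next* token after position t.
    tokens: the token-id sequence; the targets are tokens[1:].
    """
    total = 0.0
    for t, target in enumerate(tokens[1:]):
        row = logits[t]
        m = max(row)  # stabilize the softmax before exponentiating
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]  # -log p(target | prefix)
    return total / (len(tokens) - 1)

# Toy example: vocab of 3 tokens, sequence [0, 1, 2].
logits = [
    [0.0, 5.0, 0.0],  # position 0 strongly predicts token 1
    [0.0, 0.0, 5.0],  # position 1 strongly predicts token 2
]
loss = next_token_loss(logits, [0, 1, 2])
```

Because the toy logits put nearly all probability on the correct next token, the loss comes out close to zero; a model that guessed uniformly over the 3-token vocabulary would instead incur a loss near log(3).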
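On the optimizer side, a standard choice for billion-parameter training is AdamW (Adam with decoupled weight decay). A single-parameter sketch of its update rule, with illustrative default hyperparameters (the text does not prescribe these values):

```python
import math

def adamw_step(p, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update for a single scalar parameter p with gradient g.

    m, v: running first and second moment estimates of the gradient.
    t:    1-based step count, used for bias correction.
    Returns the updated (p, m, v).
    """
    m = b1 * m + (1 - b1) * g          # momentum-style gradient average
    v = b2 * v + (1 - b2) * g * g      # running average of squared gradients
    m_hat = m / (1 - b1 ** t)          # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    # Decoupled weight decay: applied directly to p, not folded into g.
    p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * p)
    return p, m, v

# One step on a toy parameter.
p, m, v = adamw_step(p=1.0, g=1.0, m=0.0, v=0.0, t=1)
```

In real training frameworks this update is applied elementwise across all parameter tensors; the decoupling of weight decay from the adaptive gradient term is what distinguishes AdamW from plain Adam with L2 regularization.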