Training Compute-Optimal Large Language Models (Chinchilla)
Hoffmann, Borgeaud, Mensch, Buchatskaya, Cai, Rutherford, et al.
Notes
Showed that most large language models of the time were significantly undertrained: for a fixed compute budget, model size and training tokens should be scaled roughly equally, working out to on the order of 20 training tokens per parameter. Their 70B-parameter Chinchilla, trained on 1.4T tokens, outperformed the much larger Gopher (280B) on a wide range of benchmarks.
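A back-of-envelope sketch of that ratio (my own illustration, not code from the paper; the function name is made up, and it uses the common C ≈ 6·N·D approximation for training compute, which is from the broader scaling-law literature rather than a result derived here):

```python
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Roughly split a FLOP budget between parameters and tokens.

    Assumes training compute C ≈ 6 * N * D and the Chinchilla-style
    heuristic D ≈ r * N (r ≈ 20 tokens per parameter), so
    N = sqrt(C / (6 * r)) and D = r * N.
    """
    n = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    d = tokens_per_param * n
    return n, d

# Chinchilla's approximate budget: 6 * 70e9 params * 1.4e12 tokens ≈ 5.88e23 FLOPs
params, tokens = chinchilla_optimal(5.88e23)
print(f"params ≈ {params:.2e}, tokens ≈ {tokens:.2e}")
# → recovers roughly 7e10 parameters and 1.4e12 tokens
```

Plugging in the paper's own compute budget recovers the 70B / 1.4T split, which is the point of the heuristic: given more compute, grow data and model together rather than parameters alone.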