Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Shazeer, Mirhoseini, Maziarz, Davis, Le, Hinton, Dean
paper completed ai-ml

Notes

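Mixture-of-Experts (MoE) layer with learned sparse gating: a trainable gating network routes each input to only a few experts. Conditional computation at scale: parameter count grows with the number of experts while per-example compute stays roughly fixed.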
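
To make the gating idea concrete, here is a minimal pure-Python sketch of a top-k gated mixture of experts. It is an illustration under simplifying assumptions, not the paper's implementation: each expert is a single linear map, and the noise term and auxiliary load-balancing losses used in the paper are omitted. The names `top_k_gate`, `moe_layer`, `gate_W`, and `experts` are hypothetical and chosen for this example only.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def top_k_gate(logits, k):
    """Keep the k largest logits and softmax over them.

    Returns a dict {expert_index: gate_weight}; every other expert
    implicitly has gate weight 0.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))


def moe_layer(x, gate_W, experts, k):
    """Sparse mixture: only the k selected experts are evaluated."""
    gates = top_k_gate(matvec(gate_W, x), k)
    out = [0.0] * len(experts[0])  # output dim = number of rows per expert
    for i, g in gates.items():
        y = matvec(experts[i], x)  # unselected experts are never computed
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, gates


# Toy usage with random weights (hypothetical sizes).
random.seed(0)
d_in, d_out, n_experts, k = 4, 3, 8, 2
gate_W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(n_experts)]
experts = [
    [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
    for _ in range(n_experts)
]
x = [0.5, -1.0, 2.0, 0.1]
y, gates = moe_layer(x, gate_W, experts, k)
```

In this sketch the cost of a forward pass depends on `k`, not on `n_experts`, which is the sense in which computation is conditional: adding experts adds parameters without adding per-input work beyond the gating product.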