Mixture of Experts
Mixture of Experts (MoE) is a sparse neural-network architecture in which a gating network routes each input to only a small subset of expert sub-networks. Its modern form is the sparsely-gated MoE layer introduced by Shazeer and colleagues in 2017. Because only the selected experts run, the compute per input stays roughly constant even as the total parameter count grows with the number of experts, as seen in models such as Switch Transformer and Mixtral.
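The routing idea can be illustrated with a minimal NumPy sketch. It is not the implementation of any particular model: the function name `moe_forward`, the toy single-matrix `tanh` experts, the tensor shapes, and the choice to renormalise gate weights with a softmax over only the selected experts are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_forward(x, gate_w, expert_ws, k=2):
    """Toy top-k MoE layer (hypothetical helper, for illustration only).

    x:         (tokens, d)        input vectors
    gate_w:    (d, n_experts)     gating (router) weights
    expert_ws: (n_experts, d, d)  one weight matrix per toy expert
    """
    logits = x @ gate_w                                # (tokens, n_experts)
    top_idx = np.argsort(logits, axis=-1)[:, -k:]      # k highest-scoring experts
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    weights = softmax(top_logits, axis=-1)             # assumption: renormalise over chosen experts

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(k):
            e = top_idx[t, j]
            # Only the selected experts are evaluated for this token.
            out[t] += weights[t, j] * np.tanh(x[t] @ expert_ws[e])
    return out, top_idx


rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 5
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))

y, chosen = moe_forward(x, gate_w, expert_ws, k=2)
```

In this sketch each token triggers exactly `k` expert evaluations regardless of `n_experts`, so adding experts increases the parameter count without increasing the per-token expert compute (the small gating matrix multiply still scales with the number of experts).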