Sparse Mixture of Experts Model
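A sparse Mixture of Experts (MoE) model contains a router component that sends each token to a small subset of expert sub-networks instead of through every parameter, so only part of the model is active for any given token. Mixtral 8x7B is an example: each of its layers holds 8 experts, and the router selects 2 of them per token.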
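The routing idea can be illustrated with a minimal sketch in plain Python. It is not Mixtral's actual implementation: the function names (`route_top_k`, `moe_forward`), the toy "experts" that only scale their input, and the random router weights are all assumptions made for illustration. The choice of 8 experts with 2 selected per token mirrors the Mixtral 8x7B configuration mentioned above.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(token, router, experts, k=2):
    """Sparse MoE step for one token: only the k selected experts run."""
    # One router logit per expert: dot product of a router row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router]
    output = [0.0] * len(token)
    for idx, weight in route_top_k(logits, k):
        expert_out = experts[idx](token)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output


# Toy setup (all hypothetical): 8 experts, each just scales its input.
random.seed(0)
DIM, NUM_EXPERTS = 4, 8
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
print(route_top_k([0.1, 2.0, -1.0, 1.5], k=2))
print(moe_forward(token, router, experts))
```

Because only the selected experts are called inside `moe_forward`, the cost per token grows with `k` rather than with the total number of experts, which is the point of the sparse design.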