Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
Xuan-Phi Nguyen, Shrey Pandit, Austin Xu, Caiming Xiong, and Shafiq Joty. In International Conference on Machine Learning (ICML-26) 2026.