---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---

# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

This repository contains model checkpoints from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).

For more details, including code and evaluation procedures, please refer to the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity)

## How to cite

If you find our work helpful, please cite the paper:

```bibtex
@inproceedings{nakamura2026optimal,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=XFw2EPRUUR}
}
```