REAP the Experts: Why Pruning Prevails for One-Shot MoE compression
Paper • 2510.13999 • Published • 19
Support this work: donate.sybilsolutions.ai
REAP surfaces: GLM | MiniMax | Qwen | Gemma | Paper | Code | PR17 | Cerebras Collection
0xSero/Qwen3.5-264B-REAP
Qwen/Qwen3.5-397B-A17B
pruned
34%
reap
0xSero
Sybil Solutions
REAP PR17
0xSero
/home/ubuntu/qwen397-full/observer-calibv1/qwen397-pr17-calibv1-23k-16k-observer-state.raw.pt
/home/ubuntu/qwen397-full/observer-calibv1/qwen397-pr17-calibv1-23k-16k-detail-state.raw.pt
No benchmark summary was found.
No custom stress summary was found.
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("0xSero/Qwen3.5-264B-REAP", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("0xSero/Qwen3.5-264B-REAP", trust_remote_code=True)
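The loading lines above stop once the model and tokenizer are in memory. As a minimal sketch of what a first generation call could look like, the helper below wraps the same two `from_pretrained` calls and adds a plain-text prompt. The helper name `generate_text`, the `torch_dtype="auto"` and `device_map="auto"` arguments, and the prompt are assumptions for illustration and are not part of the original card; `device_map="auto"` additionally assumes the `accelerate` package is installed and that enough accelerator memory is available for a model of this size.

```python
MODEL_ID = "0xSero/Qwen3.5-264B-REAP"  # repository name taken from the snippet above


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return the model's continuation of `prompt`."""
    # Imported lazily so that defining this helper does not pull in heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype="auto",  # assumption: keep the dtype stored in the checkpoint
        device_map="auto",   # assumption: `accelerate` is installed to place the weights
    )

    # Tokenize the prompt and move the tensors to the model's first device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so that only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical prompt, for illustration only.
    print(generate_text("Explain mixture-of-experts pruning in one sentence."))
```

Wrapping the calls in a function keeps the expensive download and load out of import time; the script only touches the checkpoint when run directly.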
Thank you to our kind sponsors; this work wouldn't be possible without them: