LIMO: Less is More for Reasoning
Paper: [LIMO: Less is More for Reasoning](https://arxiv.org/abs/2502.03387)
This model is Qwen3-4B fine-tuned on the original (unmodified) LIMO dataset. It serves as the baseline for comparison against the DSE-cleaned variants.
| Benchmark | Base Qwen3-4B | Original LIMO SFT | DSE LIMO SFT |
|---|---|---|---|
| MATH-500 | 56.6% | 69.6% | 72.8% |
| AIME 2025 | 13.3% | 40.0% | 43.3% |
| AIME 2026 | 36.7% | 46.7% | 53.3% |
| GPQA Diamond | 43.4% | 55.6% | 49.0% |
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("Ciaranshu/decor-qwen3-4b-original")
tokenizer = AutoTokenizer.from_pretrained("Ciaranshu/decor-qwen3-4b-original")
```
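For inference, the model expects prompts in its chat format. The sketch below illustrates the ChatML-style turn structure (`<|im_start|>` / `<|im_end|>` markers) that Qwen-family tokenizers typically use; this hand-rolled version is for illustration only, and in practice `tokenizer.apply_chat_template` builds the prompt for you. The question string is a made-up example:

```python
# Illustration of a ChatML-style prompt, as used by Qwen-family models.
# In real code, prefer tokenizer.apply_chat_template(messages, tokenize=False,
# add_generation_prompt=True), which applies the model's actual template.
def build_prompt(question: str) -> str:
    return (
        "<|im_start|>user\n"
        f"{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_prompt("What is the sum of the first 10 positive integers?")
# The prompt ends with an open assistant turn, ready to be completed, e.g.:
# model.generate(**tokenizer(prompt, return_tensors="pt"), max_new_tokens=512)
print(prompt)
```

Note the trailing open assistant turn: generation continues from there, so the model's reply is everything it emits before the next `<|im_end|>` marker.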