Datasets used in SmolLM3 pretraining
Note Stage 1 datasets: 85% Web, 12% Code, 3% Math
Note Stage 2 new datasets
Note Stage 3 (decay) new datasets