Updated open-sci-ref baselines. Re-training without dropout. Re-training on DCLM, FineWeb-Edu, Nemotron, HPLT-2, and Pile. Further reference datasets included.
AI & ML interests
Researching and building foundation models with improved generalization and reasoning. A LAION & friends spin-off for open-sourcing foundation models with strong generalization and reasoning, including the datasets necessary for their creation, to serve as common, open, reproducible ground for further research experiments.
models 116
open-sci/open-sci-ref-v0.02-1.7b-nemotron-hq-300B-4096-long_sft_16k
Feature Extraction • 2B • Updated • 15
open-sci/sft__ot30k_SmolLM2-1.7B-Instruct-16k
Text Generation • 2B • Updated • 299
open-sci/sft__ot30k_SmolLM2-1.7B-16k-SFT-Tulu3-decontaminated
Text Generation • 2B • Updated • 302
open-sci/sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated
Text Generation • 2B • Updated • 295
open-sci/sft__ot30k_Qwen3-1.7B-Base-DPO-Tulu3-decontaminated
Text Generation • 2B • Updated • 296
open-sci/sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated
Text Generation • 2B • Updated • 292
open-sci/sft__ot30k_Qwen2.5-1.5B-DPO-Tulu3-decontaminated
Text Generation • 2B • Updated • 292
open-sci/sft__ot30k_open-sci-ref-v0.02-1.7b-nemotron-hq-300B-16k-SFT-Tulu3-decontaminated
Feature Extraction • 2B • Updated • 12
open-sci/sft__ot30k_open-sci-ref-v0.02-1.7b-nemotron-hq-300B-16k-DPO-Tulu3-decontaminated
Feature Extraction • 2B • Updated • 7
open-sci/sft__ot30k_open-sci-ref-v0.02-1.7b-fineweb-edu-1.4t-300B-4096-longsft_16k-SFT-Tulu3
Feature Extraction • 2B • Updated • 11