Dataset
updated
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
Paper
• 2504.17565
• Published • 2
PrimeIntellect/synthetic-code-understanding
Viewer
• Updated • 60.6k • 51
• 20
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data
Paper
• 2507.07095
• Published • 56
VeriGUI: Verifiable Long-Chain GUI Dataset
Paper
• 2508.04026
• Published • 164
jupyter-agent/jupyter-agent-dataset
Viewer
• Updated • 95.8k • 993
• 157
PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
Paper
• 2509.11362
• Published • 5
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
Paper
• 2509.15221
• Published • 111
MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook
Paper
• 2509.14142
• Published • 10
MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning
Paper
• 2507.21924
• Published • 1
Hierarchical Dataset Selection for High-Quality Data Sharing
Paper
• 2512.10952
• Published • 2
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition
Paper
• 2512.13884
• Published • 15
RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models
Paper
• 2601.03699
• Published • 8
Extreme Multi-Label Skill Extraction Training using Large Language Models
Paper
• 2307.10778
• Published
DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
Paper
• 2602.16742
• Published • 12
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale
Paper
• 2602.23866
• Published • 87