- LLM Pruning and Distillation in Practice: The Minitron Approach
  Paper • 2408.11796 • Published • 58
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
  Paper • 2408.09174 • Published • 52
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 45
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
  Paper • 2408.11878 • Published • 63

Collections
Collections including paper arxiv:2502.09056

- Evaluation of OpenAI o1: Opportunities and Challenges of AGI
  Paper • 2409.18486 • Published
- Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging -- An Open Recipe
  Paper • 2502.09056 • Published • 31
- Competitive Programming with Large Reasoning Models
  Paper • 2502.06807 • Published • 69

- typhoon-ai/llama3.2-typhoon2-t1-3b-research-preview
  Text Generation • 3B • Updated • 84 • 6
- typhoon-ai/llama3.1-typhoon2-deepseek-r1-70b-preview
  Text Generation • 71B • Updated • 42 • 13
- typhoon-ai/llama3.2-typhoon2-t1-3b-research-preview-mlx-4bit
  Text Generation • 0.5B • Updated • 22 • 1
- Typhoon T1: An Open Thai Reasoning Model
  Paper • 2502.09042 • Published • 16

- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 31
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 124
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4

- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 46
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 36
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 47