Collections
Discover the best community collections!
Collections including paper arxiv:2512.05150
-
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper • 2512.02472 • Published • 55 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 146 -
Video Reasoning without Training
Paper • 2510.17045 • Published • 8 -
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 273
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 104 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 32 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 13 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 119
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55 -
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
Paper • 2505.16990 • Published • 22 -
D-AR: Diffusion via Autoregressive Models
Paper • 2505.23660 • Published • 34
-
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Paper • 2407.08083 • Published • 32 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63 -
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper • 2408.15237 • Published • 42 -
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Paper • 2409.11355 • Published • 30
-
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Paper • 2512.13586 • Published • 93 -
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Paper • 2512.05150 • Published • 76 -
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
Paper • 2512.16093 • Published • 95 -
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
Paper • 2602.02493 • Published • 42
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 50 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 38 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Paper • 2410.02740 • Published • 54 -
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Paper • 2410.01215 • Published • 39 -
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
Paper • 2409.17146 • Published • 121 -
EuroLLM: Multilingual Language Models for Europe
Paper • 2409.16235 • Published • 29
-
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Paper • 2401.09048 • Published • 10 -
Improving fine-grained understanding in image-text pre-training
Paper • 2401.09865 • Published • 18 -
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper • 2401.10891 • Published • 62 -
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Paper • 2401.13627 • Published • 78
-
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Paper • 2512.13586 • Published • 93 -
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Paper • 2512.05150 • Published • 76 -
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
Paper • 2512.16093 • Published • 95 -
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
Paper • 2602.02493 • Published • 42
-
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper • 2512.02472 • Published • 55 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 146 -
Video Reasoning without Training
Paper • 2510.17045 • Published • 8 -
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 273
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 104 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 32 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 13 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 119
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 50 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 38 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55 -
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
Paper • 2505.16990 • Published • 22 -
D-AR: Diffusion via Autoregressive Models
Paper • 2505.23660 • Published • 34
-
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Paper • 2410.02740 • Published • 54 -
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Paper • 2410.01215 • Published • 39 -
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
Paper • 2409.17146 • Published • 121 -
EuroLLM: Multilingual Language Models for Europe
Paper • 2409.16235 • Published • 29
-
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Paper • 2407.08083 • Published • 32 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63 -
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper • 2408.15237 • Published • 42 -
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Paper • 2409.11355 • Published • 30
-
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Paper • 2401.09048 • Published • 10 -
Improving fine-grained understanding in image-text pre-training
Paper • 2401.09865 • Published • 18 -
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper • 2401.10891 • Published • 62 -
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Paper • 2401.13627 • Published • 78