Collections
Discover the best community collections!
Collections including paper arxiv:2508.14444
-
Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published • 108 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 152 -
Autoregressive Diffusion Models
Paper • 2110.02037 • Published -
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 8
-
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models
Paper • 2503.11224 • Published • 28 -
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
Paper • 2503.11579 • Published • 21 -
Text-conditioned State Space Model For Domain-generalized Change Detection Visual Question Answering
Paper • 2508.08974 • Published -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 43
-
A Benchmark for Learning to Translate a New Language from One Grammar Book
Paper • 2309.16575 • Published • 1 -
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Paper • 2409.19151 • Published -
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper • 2305.07759 • Published • 39 -
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 95
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 206 -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 43 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 67 -
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper • 2506.13585 • Published • 273
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 1.48k • 551 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 14.6k • 444 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 18
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 44 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 54 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
A Benchmark for Learning to Translate a New Language from One Grammar Book
Paper • 2309.16575 • Published • 1 -
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Paper • 2409.19151 • Published -
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper • 2305.07759 • Published • 39 -
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 95
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 206 -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 43 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 67 -
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper • 2506.13585 • Published • 273
-
Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published • 108 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 152 -
Autoregressive Diffusion Models
Paper • 2110.02037 • Published -
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 8
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 1.48k • 551 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 14.6k • 444 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 18
-
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models
Paper • 2503.11224 • Published • 28 -
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
Paper • 2503.11579 • Published • 21 -
Text-conditioned State Space Model For Domain-generalized Change Detection Visual Question Answering
Paper • 2508.08974 • Published -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 43
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 44 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 54 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12