Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.05470

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Paper • 2601.00423 • Published Jan 1 • 11
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 228
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Paper • 2601.18150 • Published about 1 month ago • 8
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment

Paper • 2601.20218 • Published 28 days ago • 15

Reinforcement Learning

TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

Paper • 2508.04324 • Published Aug 6, 2025 • 11
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88
FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 116

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9, 2025 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6, 2025 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27

On Path to Multimodal Generalist: General-Level and General-Bench

Paper • 2505.04620 • Published May 7, 2025 • 82
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published Apr 8, 2025 • 77
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning

Paper • 2504.14509 • Published Apr 20, 2025 • 53
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Paper • 2508.04825 • Published Aug 6, 2025 • 60

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 125
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 662
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 230
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25, 2025 • 348

CaRL: Learning Scalable Planning Policies with Simple Rewards

Paper • 2504.17838 • Published Apr 24, 2025 • 4
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76
Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 334
AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published May 19, 2025 • 83

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30, 2025 • 54
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Paper • 2503.18446 • Published Mar 24, 2025 • 12
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models

Paper • 2503.20240 • Published Mar 26, 2025 • 22
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Paper • 2503.20672 • Published Mar 26, 2025 • 14
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published Mar 26, 2025 • 4

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Paper • 2601.00423 • Published Jan 1 • 11
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 228
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Paper • 2601.18150 • Published about 1 month ago • 8
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment

Paper • 2601.20218 • Published 28 days ago • 15

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 125
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 662
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 230
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25, 2025 • 348

Reinforcement Learning

TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

Paper • 2508.04324 • Published Aug 6, 2025 • 11
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88
FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 116

CaRL: Learning Scalable Planning Policies with Simple Rewards

Paper • 2504.17838 • Published Apr 24, 2025 • 4
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9, 2025 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6, 2025 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76
Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 334
AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published May 19, 2025 • 83

On Path to Multimodal Generalist: General-Level and General-Bench

Paper • 2505.04620 • Published May 7, 2025 • 82
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30, 2025 • 54
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published Apr 8, 2025 • 77
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning

Paper • 2504.14509 • Published Apr 20, 2025 • 53
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Paper • 2508.04825 • Published Aug 6, 2025 • 60

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Paper • 2503.18446 • Published Mar 24, 2025 • 12
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models

Paper • 2503.20240 • Published Mar 26, 2025 • 22
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Paper • 2503.20672 • Published Mar 26, 2025 • 14
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published Mar 26, 2025 • 4

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs