TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published 2 days ago • 64
Do LLMs "Feel"? Emotion Circuits Discovery and Control • Paper • 2510.11328 • Published Oct 13, 2025 • 6
DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data • Paper • 2604.01666 • Published 6 days ago • 8
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models • Paper • 2603.25750 • Published 19 days ago • 35
Representation Alignment for Just Image Transformers is not Easier than You Think • Paper • 2603.14366 • Published 23 days ago • 13
InCoder-32B: Code Foundation Model for Industrial Scenarios • Paper • 2603.16790 • Published 21 days ago • 307
Grounding World Simulation Models in a Real-World Metropolis • Paper • 2603.15583 • Published 22 days ago • 153
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale • Paper • 2601.08225 • Published Jan 13 • 53
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization • Paper • 2601.05242 • Published Jan 8 • 230