Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 4 days ago • 56
VidEoMT: Your ViT is Secretly Also a Video Segmentation Model Paper • 2602.17807 • Published 25 days ago • 6
Causal-JEPA: Learning World Models through Object-Level Latent Interventions Paper • 2602.11389 • Published Feb 11 • 6
SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization Paper • 2602.04811 • Published Feb 4 • 2
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders Paper • 2601.17950 • Published Jan 25 • 4
UM-Text: A Unified Multimodal Model for Image Understanding Paper • 2601.08321 • Published Jan 13 • 11
TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration Paper • 2601.04544 • Published Jan 8 • 6
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation Paper • 2601.03955 • Published Jan 7 • 3
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published Dec 31, 2025 • 8
Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow Paper • 2512.24766 • Published Dec 31, 2025 • 9
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion Paper • 2512.19535 • Published Dec 22, 2025 • 12
What matters for Representation Alignment: Global Information or Spatial Structure? Paper • 2512.10794 • Published Dec 11, 2025 • 9
ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24, 2025 • 22
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 39