TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Paper • 2606.04743 • Published 3 days ago • 35
LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training Paper • 2605.29888 • Published 9 days ago • 34
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 9 days ago • 76
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents Paper • 2605.28775 • Published 10 days ago • 38
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 10 days ago • 87
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs Paper • 2605.20258 • Published 19 days ago • 30
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR Paper • 2605.15726 • Published 22 days ago • 34
On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length Paper • 2605.02572 • Published May 4 • 2
Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks Paper • 2604.20987 • Published Apr 22 • 21
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published Apr 22 • 16
SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks Paper • 2604.20087 • Published Apr 22 • 15
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents Paper • 2604.14004 • Published Apr 15 • 30
LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation Paper • 2511.03001 • Published Nov 4, 2025 • 49
Safe and Scalable Web Agent Learning via Recreated Websites Paper • 2603.10505 • Published Mar 11 • 27