Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR Paper • 2605.19282 • Published May 19 • 9
Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents Paper • 2606.06036 • Published 22 days ago • 73
Inference-Time Attribute Distribution Alignment for Unconditional Diffusion Paper • 2605.07456 • Published May 8 • 2
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments Paper • 2606.13681 • Published 15 days ago • 140
One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA Paper • 2606.10572 • Published 17 days ago • 16
Beyond Imitation: Reinforcement Learning for Active Latent Planning Paper • 2601.21598 • Published Jan 29 • 10
SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization Paper • 2511.06411 • Published Nov 9, 2025 • 18