Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients • Paper • 2606.18216 • Published 9 days ago • 60
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning • Paper • 2605.28774 • Published 29 days ago • 93
TactAlign: Human-to-Robot Policy Transfer via Tactile Alignment • Paper • 2602.13579 • Published Feb 14 • 11
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models • Paper • 2605.30161 • Published 28 days ago • 60