CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare Paper • 2603.24157 • Published 1 day ago • 8
Toward Physically Consistent Driving Video World Models under Challenging Trajectories Paper • 2603.24506 • Published about 19 hours ago • 3
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning Paper • 2603.21289 • Published 4 days ago • 12
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published about 21 hours ago • 13
Repurposing Geometric Foundation Models for Multi-view Diffusion Paper • 2603.22275 • Published 3 days ago • 40
Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models Paper • 2603.21854 • Published 3 days ago • 2
Session Risk Memory (SRM): Temporal Authorization for Deterministic Pre-Execution Safety Gates Paper • 2603.22350 • Published 4 days ago • 1
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing Paper • 2603.12254 • Published 14 days ago • 18
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought Paper • 2603.22847 • Published 2 days ago • 20
FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use Paper • 2603.08262 • Published 17 days ago • 42
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 3 days ago • 101
SegviGen: Repurposing 3D Generative Model for Part Segmentation Paper • 2603.16869 • Published 9 days ago • 18
CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management Paper • 2603.19571 • Published 6 days ago • 2
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 9 days ago • 102
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published 9 days ago • 95
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation Paper • 2603.19039 • Published 7 days ago • 46
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science Paper • 2603.19005 • Published 7 days ago • 5