MoVerse: Real-Time Video World Modeling with Panoramic Gaussian Scaffold Paper • 2606.13376 • Published 4 days ago • 11
NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation Paper • 2606.03159 • Published 13 days ago • 23
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Paper • 2605.30501 • Published 18 days ago • 29
NITP: Next Implicit Token Prediction for LLM Pre-training Paper • 2605.24956 • Published 22 days ago • 35
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 15 days ago • 30
Joint Agent Memory and Exploration Learning via Novelty Signals Paper • 2606.01528 • Published 14 days ago • 15
StreamChar: Long-Horizon Streaming Character Audio-Video Generation with Decoupled Orchestration Paper • 2605.25659 • Published 21 days ago • 16
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories Paper • 2606.01311 • Published 15 days ago • 36
Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism Paper • 2606.00408 • Published 17 days ago • 63
RoboStressBench: Benchmarking VLM Robustness to Physical Visual Stress in Embodied Scenes Paper • 2606.00828 • Published 16 days ago • 10
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Paper • 2605.30852 • Published 17 days ago • 10
Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents Paper • 2605.30723 • Published 17 days ago • 16
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents Paper • 2606.02031 • Published 14 days ago • 20
When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs Paper • 2605.24202 • Published 24 days ago • 17
X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 14 days ago • 35
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization Paper • 2606.02564 • Published 14 days ago • 29
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 18 days ago • 192
Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models Paper • 2605.28132 • Published 19 days ago • 25