Representation Forcing for Bottleneck-Free Unified Multimodal Models Paper • 2605.31604 • Published 9 days ago • 57
SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer Paper • 2605.30409 • Published 10 days ago • 36
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 10 days ago • 57
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published 11 days ago • 73
Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration Paper • 2605.17423 • Published 21 days ago • 33
SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills Paper • 2605.24117 • Published 16 days ago • 21
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 18 days ago • 108
GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation Paper • 2605.21605 • Published 18 days ago • 13
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 20 days ago • 78
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 20 days ago • 112
NEWTON: Agentic Planning for Physically Grounded Video Generation Paper • 2605.18396 • Published 20 days ago • 22
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization Paper • 2605.15824 • Published 23 days ago • 64
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published 24 days ago • 93