DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning Paper • 2603.12257 • Published 24 days ago • 31
MIND: Benchmarking Memory Consistency and Action Control in World Models Paper • 2602.08025 • Published Feb 8 • 13
Olaf-World: Orienting Latent Actions for Video World Modeling Paper • 2602.10104 • Published Feb 10 • 27
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 202
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands Paper • 2512.24965 • Published Dec 31, 2025 • 43
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published Dec 26, 2025 • 61
Pretraining Frame Preservation in Autoregressive Video Memory Compression Paper • 2512.23851 • Published Dec 29, 2025 • 25
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time Paper • 2512.25075 • Published Dec 31, 2025 • 15
Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published Dec 15, 2025 • 65
X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale Paper • 2512.04537 • Published Dec 4, 2025 • 7
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published Dec 10, 2025 • 50
Computer-Use Agents as Judges for Generative User Interface Paper • 2511.15567 • Published Nov 19, 2025 • 54
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published Nov 14, 2025 • 47
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 103
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3, 2025 • 38
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6, 2025 • 120
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation Paper • 2509.22653 • Published Sep 26, 2025 • 25