Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation Paper • 2508.20470 • Published Aug 28, 2025 • 75
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8, 2025 • 138
Can Large Reasoning Models do Analogical Reasoning under Perceptual Uncertainty? Paper • 2503.11207 • Published Mar 14, 2025 • 6
ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy Paper • 2503.06542 • Published Mar 9, 2025 • 7
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers Paper • 2503.11579 • Published Mar 14, 2025 • 21
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published Mar 16, 2025 • 35
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity Paper • 2503.07677 • Published Mar 10, 2025 • 86