REASONEDIT: Towards Reasoning-Enhanced Image Editing Models • Paper • 2511.22625 • Published 11 days ago • 45
Motion2Motion: Cross-topology Motion Transfer with Sparse Correspondence • Paper • 2508.13139 • Published Aug 18 • 4
UniVerse-1: Unified Audio-Video Generation via Stitching of Experts • Paper • 2509.06155 • Published Sep 7 • 13
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis • Paper • 2211.14506 • Published Nov 26, 2022 • 1
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence • Paper • 2509.12203 • Published Sep 15 • 19
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing • Paper • 2510.17803 • Published Oct 20 • 13
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors • Paper • 2212.04248 • Published Dec 7, 2022 • 1
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation • Paper • 2507.09862 • Published Jul 14 • 49
Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer • Paper • 2508.09131 • Published Aug 12 • 16