Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels Paper • 2508.17437 • Published Aug 20 • 37
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation Paper • 2509.00428 • Published Aug 30 • 17
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions Paper • 2509.06951 • Published Sep 8 • 31