The Underappreciated Power of Vision Models for Graph Structural Understanding Paper • 2510.24788 • Published Oct 27 • 35
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16 • 103
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE Paper • 2510.13344 • Published Oct 15 • 62