Max Ku

vinesmsuic

https://kuwingfung.github.io/

AI & ML interests

Computer Vision, World Models

Recent Activity

upvoted a paper about 18 hours ago

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

liked a model 27 days ago

nyu-visionx/solaris

liked a Space 29 days ago

multimodalart/qwen-image-multiple-angles-3d-camera

upvoted a paper about 18 hours ago

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Paper • 2603.20278 • Published 8 days ago • 65

upvoted a paper about 1 month ago

VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction

Paper • 2602.13294 • Published Feb 9 • 13

upvoted a paper about 2 months ago

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36

upvoted a paper 4 months ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published Dec 1, 2025 • 74

upvoted a paper 5 months ago

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Paper • 2510.10666 • Published Oct 12, 2025 • 28

upvoted 3 papers 12 months ago

upvoted a collection 12 months ago

TheoremExplain

Collection

2 items • Updated Feb 27, 2025 • 4

upvoted 7 papers about 1 year ago

Position: Interactive Generative Video as Next-Generation Game Engine

Paper • 2503.17359 • Published Mar 21, 2025 • 61

Cube: A Roblox View of 3D Intelligence

Paper • 2503.15475 • Published Mar 19, 2025 • 31

Long-Video Audio Synthesis with Multi-Agent Collaboration

Paper • 2503.10719 • Published Mar 13, 2025 • 9

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published Mar 14, 2025 • 148

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

Mobius: Text to Seamless Looping Video Generation via Latent Shift

Paper • 2502.20307 • Published Feb 27, 2025 • 18

TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Paper • 2502.19400 • Published Feb 26, 2025 • 47

upvoted 4 papers over 1 year ago

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published Dec 1, 2024 • 29

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published Nov 11, 2024 • 50

CCEdit: Creative and Controllable Video Editing via Diffusion Models

Paper • 2309.16496 • Published Sep 28, 2023 • 9

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 128
