Yuxuan Wang

ColorfulAI

https://patrick-tssn.github.io/

patrick-tssn

AI & ML interests

Multimodal Learning

Recent Activity

authored a paper 11 days ago

Qwen3-Omni Technical Report

authored a paper 11 days ago

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

authored a paper 11 days ago

V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs

Collections 2

Papers 21

arxiv:2510.12720

arxiv:2510.10689

arxiv:2509.25773

arxiv:2509.17765

Models 10

ColorfulAI/M4-Audio-LongVA-7B-Qwen2

Video-Text-to-Text • 9B • Updated Apr 3 • 63

ColorfulAI/M4-LongVA-7B-Qwen2

Video-Text-to-Text • 8B • Updated Apr 3 • 26

ColorfulAI/OpenOmni-8B-Llama3-Omni

9B • Updated Apr 2 • 2 • 1

ColorfulAI/OpenOmni-7B-Qwen2-Omni

9B • Updated Apr 2 • 8 • 1

ColorfulAI/LongVA-7B-Qwen2-Audio

9B • Updated Apr 1 • 3

ColorfulAI/LongVA-7B-Qwen2-VoiceAssistant

9B • Updated Apr 1 • 4

ColorfulAI/Llama-3.1-8B-S2S-Omni

9B • Updated Apr 1 • 5

ColorfulAI/videollamb-llava-1.5-7b

Video-Text-to-Text • 7B • Updated Sep 9, 2024 • 21 • 4

ColorfulAI/videollamb-mem-llava-1.5-7b

7B • Updated Aug 12, 2024 • 3

ColorfulAI/LSTP-Chat

Image-Text-to-Text • Updated Aug 2, 2024 • 4

Datasets 7

ColorfulAI/MoviePuzzle

Viewer • Updated May 14 • 1 • 13

ColorfulAI/M4-IT

Updated Apr 3 • 146 • 1

ColorfulAI/VoiceAssistant_units

Viewer • Updated Apr 2 • 428k • 24

ColorfulAI/LLaVA-NeXT-Speech

Updated Apr 1 • 743

ColorfulAI/NeedleInAVideoHaystack

Viewer • Updated Jan 22 • 21 • 53

ColorfulAI/EgoPlan_test

Viewer • Updated Sep 15, 2024 • 923 • 391

ColorfulAI/VideoLLaMB-IT

Viewer • Updated Aug 12, 2024 • 1.03M • 17