20

Tran Tuan Huy

HuyTT

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 18 days ago

Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?

Paper • 2602.23339 • Published 22 days ago • 6

MediX-R1: Open Ended Medical Reinforcement Learning

Paper • 2602.23363 • Published 22 days ago • 22

upvoted 2 papers 19 days ago

Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Paper • 2602.22766 • Published 23 days ago • 42

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Paper • 2602.22859 • Published 22 days ago • 150

upvoted a paper 22 days ago

Query-focused and Memory-aware Reranker for Long Context Processing

Paper • 2602.12192 • Published Feb 12 • 57

upvoted 2 papers 26 days ago

RynnBrain: Open Embodied Foundation Models

Paper • 2602.14979 • Published Feb 13 • 43

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 262

upvoted 2 papers about 1 month ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 42

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 259

upvoted 4 papers about 2 months ago

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published Jan 20 • 37

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 202

UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation

Paper • 2601.11522 • Published Jan 16 • 18

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published Jan 13 • 158

upvoted 7 papers 2 months ago

LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Paper • 2601.10129 • Published Jan 15 • 12

LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published Jan 15 • 30

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published Jan 15 • 30

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 149

Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Paper • 2601.10477 • Published Jan 15 • 156

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195