Wenhan Ma

CuteNPC

https://github.com/CuteNPC

AI & ML interests

Large Language Model

Recent Activity

upvoted a paper 10 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

liked a model 25 days ago

Lansechen/deepseek-v2-lite-16b-chat-R1-Distill-bs17k-batch32

authored a paper about 2 months ago

Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Organizations

None yet

authored a paper about 2 months ago

Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Paper • 2510.11370 • Published Oct 13 • 3

authored a paper 6 months ago

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4 • 80

authored a paper 7 months ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12 • 82