
Zizhuo Zhang PRO

resistz
  • resistzzz

AI & ML interests

None yet

Recent Activity

updated a model 10 days ago
resistz/GT-GRPO_Llama-3.2-3B-Instruct_NQ-HotpotQA
published a model 10 days ago
resistz/GT-GRPO_Llama-3.2-3B-Instruct_NQ-HotpotQA
updated a model 11 days ago
TMLR-Group-HF/Co-rewarding-III-Llama-3.2-3B-Instruct-DAPO14k

Organizations

TMLR Group

upvoted a paper 5 months ago

Co-Reward: Self-supervised Reinforcement Learning for Large Language Model Reasoning via Contrastive Agreement

Paper • 2508.00410 • Published Aug 1, 2025 • 1
upvoted a collection 5 months ago

Co-rewarding

Collection
Co-rewarding is a novel self-supervised RL framework that improves training stability by seeking complementary supervision from other views. • 75 items • Updated 11 days ago • 1