Embarrassingly Simple Self-Distillation Improves Code Generation • Paper • 2604.01193 • Published 2 days ago
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge • Paper • 2601.08808 • Published Jan 13
GAD-Models Collection: Model checkpoints of Black-Box On-Policy Distillation of Large Language Models • 5 items • Updated Nov 17, 2025
Black-Box On-Policy Distillation of Large Language Models • Paper • 2511.10643 • Published Nov 13, 2025
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? • Paper • 2504.13837 • Published Apr 18, 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models • Paper • 2503.10437 • Published Mar 13, 2025
DiffCLIP Collection: Official models for DiffCLIP: Differential Attention Meets CLIP • 4 items • Updated Mar 9, 2025
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • Paper • 2502.05171 • Published Feb 7, 2025