arxiv:2402.17139
Sherry Yang
sherryy
AI & ML interests
None yet
Organizations
None yet
models 10
sherryy/Qwen2-0.5B-GRPO-test • Updated
sherryy/best5-next10-nopizza-nonomad_sft_90 • Text Generation • 8B • Updated • 6
sherryy/pizza_rwr_2k-1k • Text Generation • 8B • Updated • 7
sherryy/pizza_rwr_k10_iter1 • Text Generation • 8B • Updated • 7
sherryy/pizza_rwr_iter1 • Text Generation • 8B • Updated • 6
sherryy/pizza_rwr_k10 • Text Generation • 8B • Updated • 8
sherryy/pizza_rwr • Text Generation • 8B • Updated • 8
sherryy/pizza_sft_90 • Text Generation • 8B • Updated • 7
sherryy/pizza_sft • Text Generation • 8B • Updated • 9
sherryy/math-baseline • Text Generation • 8B • Updated • 9
datasets 14
sherryy/best5-next10-nopizza-nonomad_sft_90 • Viewer • Updated • 78.6k • 31
sherryy/pizza_rwr_k10_iter1 • Viewer • Updated • 24.4k • 14
sherryy/pizza_rwr_iter1 • Viewer • Updated • 42.4k • 8
sherryy/pizza_rwr • Viewer • Updated • 83k • 31
sherryy/tree_dataset • Viewer • Updated • 11.1k • 24
sherryy/pizza_sft • Viewer • Updated • 37.8k • 38
sherryy/pizza_dpo • Viewer • Updated • 5.61k • 17
sherryy/math12k • Viewer • Updated • 12.5k • 15
sherryy/random-acts-of-pizza • Viewer • Updated • 59.5k • 99
sherryy/test_data • Updated • 3