BLEUBERI

This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".

BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Paper • 2505.11080 • Published May 16 • 5

yapeichang/BLEUBERI-Tulu3-50k
Viewer • Updated Jun 9 • 50k • 36 • 1

yapeichang/Qwen2.5-7B-BLEUBERI
Text Generation • Updated Jun 17 • 22 • 1

yapeichang/Qwen2.5-7B-RM8B
Text Generation • Updated Jun 5 • 18