tttx/models-p10-limit-data-step1

alignment-handbook

Generated from Trainer


aadityap committed on Feb 10

Commit 8dbc7b9 · verified · 1 Parent(s): a0b94ea

End of training

Files changed (1)

README.md +4 -1


@@ -3,9 +3,12 @@ library_name: peft
 license: mit
 base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
 tags:
+- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
+datasets:
+- tttx/p10-limit-data-step1-master
 model-index:
 - name: models-p10-limit-data-step1
   results: []
@@ -16,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 # models-p10-limit-data-step1
-This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) on an unknown dataset.
+This model is a fine-tuned version of [tttx/15k_sft_5ep_020925_s1_hp](https://huggingface.co/tttx/15k_sft_5ep_020925_s1_hp) on the tttx/p10-limit-data-step1-master dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.1200