BartBanaFinal

This model is a fine-tuned version of IAmSkyDra/BARTBana_v4 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9377
  • Sacrebleu: 7.6359
  • Chrf++: 20.8294
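
For reference, a minimal sketch of how SacreBLEU and chrF++ scores like these can be computed with the sacrebleu library (the hypothesis and reference strings below are placeholders, not outputs of this model):

```python
import sacrebleu

# Placeholder data: replace with real model outputs and reference translations.
hypotheses = ["the model output sentence"]
references = [["the reference translation"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # word_order=2 gives chrF++

print(f"SacreBLEU: {bleu.score:.4f}")
print(f"chrF++: {chrf.score:.4f}")
```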

Model description

More information needed

Intended uses & limitations

More information needed
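
Pending fuller documentation, the checkpoint can be loaded like any Hugging Face seq2seq model. A minimal generation sketch (the input string is a placeholder, and the generation settings are assumptions, not documented defaults):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "IAmSkyDra/BartBanaFinal"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the intended source language/task is not yet documented.
inputs = tokenizer("Input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```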

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 74268
  • mixed_precision_training: Native AMP
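
Expressed as a transformers configuration, the settings above correspond roughly to the following Seq2SeqTrainingArguments sketch (the output directory is an assumption, and model/dataset setup is omitted):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="BartBanaFinal",        # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    gradient_accumulation_steps=2,     # effective train batch size: 512
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=74268,
    fp16=True,                         # Native AMP mixed precision
)
```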

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Sacrebleu | Chrf++  |
|:-------------:|:------:|:-----:|:---------------:|:---------:|:-------:|
| No log        | 0      | 0     | 43.2284         | 1.4192    | 12.4337 |
| 0.0991        | 0.0269 | 2000  | 1.2912          | 1.1187    | 11.2880 |
| 0.0809        | 0.0539 | 4000  | 1.3142          | 1.3828    | 11.5611 |
| 0.0753        | 0.0808 | 6000  | 1.1897          | 2.0364    | 13.3735 |
| 0.0766        | 0.1077 | 8000  | 1.1585          | 2.6931    | 14.2949 |
| 0.0716        | 0.1346 | 10000 | 1.1603          | 3.1694    | 15.6571 |
| 0.0683        | 0.1616 | 12000 | 1.1283          | 3.6586    | 16.1006 |
| 0.0749        | 0.1885 | 14000 | 1.1128          | 3.9157    | 15.8517 |
| 0.0667        | 0.2154 | 16000 | 1.1299          | 4.4107    | 16.5723 |
| 0.0696        | 0.2424 | 18000 | 1.1231          | 4.8033    | 16.8438 |
| 0.0646        | 0.2693 | 20000 | 1.0649          | 5.0907    | 17.1505 |
| 0.0561        | 0.2962 | 22000 | 1.1106          | 5.3579    | 17.5405 |
| 0.0535        | 0.3232 | 24000 | 1.1279          | 5.6159    | 17.4801 |
| 0.0596        | 0.3501 | 26000 | 1.0895          | 6.4406    | 18.6337 |
| 0.0586        | 0.3770 | 28000 | 1.0839          | 6.5442    | 18.9215 |
| 0.0638        | 0.4039 | 30000 | 1.0565          | 6.4225    | 18.5386 |
| 0.0629        | 0.4309 | 32000 | 1.1079          | 6.4306    | 18.5905 |
| 0.0632        | 0.4578 | 34000 | 1.0776          | 6.9070    | 19.1286 |
| 0.0611        | 0.4847 | 36000 | 0.9987          | 7.2865    | 19.6978 |
| 0.0624        | 0.5117 | 38000 | 1.0437          | 6.9193    | 19.0454 |
| 0.0664        | 0.5386 | 40000 | 1.0551          | 6.8607    | 19.9180 |
| 0.0683        | 0.5655 | 42000 | 1.0556          | 6.8791    | 19.4370 |
| 0.0696        | 0.5924 | 44000 | 0.9795          | 6.8530    | 19.4825 |
| 0.0731        | 0.6194 | 46000 | 0.9630          | 6.9622    | 19.5364 |
| 0.0758        | 0.6463 | 48000 | 0.9629          | 7.4617    | 20.5333 |
| 0.0797        | 0.6732 | 50000 | 0.9573          | 7.0331    | 20.3342 |
| 0.0735        | 0.7002 | 52000 | 0.9952          | 6.8602    | 20.2803 |
| 0.0945        | 0.7271 | 54000 | 0.9377          | 7.6359    | 20.8294 |
| 0.0909        | 0.7540 | 56000 | 0.9200          | 7.3479    | 20.3585 |
| 0.0937        | 0.7810 | 58000 | 0.8964          | 7.5754    | 20.9843 |
| 0.102         | 0.8079 | 60000 | 0.9248          | 7.4648    | 20.8126 |
| 0.1097        | 0.8348 | 62000 | 0.9134          | 7.4953    | 21.2279 |
| 0.116         | 0.8617 | 64000 | 0.9139          | 7.1967    | 20.8760 |
| 0.1227        | 0.8887 | 66000 | 0.9069          | 7.4669    | 21.0805 |
| 0.1426        | 0.9156 | 68000 | 0.8971          | 7.3416    | 20.9100 |
| 0.1782        | 0.9425 | 70000 | 0.8868          | 7.3209    | 20.8953 |

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.9.0+cu128
  • Datasets 4.4.1
  • Tokenizers 0.22.1