kapilw25/llama3-8b-pku-GRPO-Instruct-SFT-Instruct

Tags: Transformers, Safetensors, alignment, safety, grpo, llama-3
Total size: 186 MB
Contributors: 1
History: 5 commits
Latest commit: f43df80 (verified) by kapilw25, "Add trainer state (metric: 1.6000)", 3 months ago
  • .gitattributes (1.57 kB): Upload tokenizer, 3 months ago
  • README.md (2.59 kB): Add model card with training hyperparameters, 3 months ago
  • adapter_config.json (934 Bytes): CITA PBT BF16 Training (LoRA Adapter), 3 months ago
  • adapter_model.safetensors (168 MB): CITA PBT BF16 Training (LoRA Adapter), 3 months ago
  • special_tokens_map.json (335 Bytes): Upload tokenizer, 3 months ago
  • tokenizer.json (17.2 MB): Upload tokenizer, 3 months ago
  • tokenizer_config.json (50.6 kB): Upload tokenizer, 3 months ago
  • trainer_state.json (521 kB): Add trainer state (metric: 1.6000), 3 months ago