kapilw25/llama3-8b-pku-GRPO-Instruct-SFT-Instruct

186 MB

1 contributor

History: 5 commits

Latest commit by kapilw25: Add trainer state (metric: 1.6000), f43df80, verified, 3 months ago

File                        Size            Last commit                                    Updated
.gitattributes              1.57 kB         Upload tokenizer                               3 months ago
README.md                   2.59 kB         Add model card with training hyperparameters   3 months ago
adapter_config.json         934 Bytes       CITA PBT BF16 Training (LoRA Adapter)          3 months ago
adapter_model.safetensors   168 MB (xet)    CITA PBT BF16 Training (LoRA Adapter)          3 months ago
special_tokens_map.json     335 Bytes       Upload tokenizer                               3 months ago
tokenizer.json              17.2 MB (xet)   Upload tokenizer                               3 months ago
tokenizer_config.json       50.6 kB         Upload tokenizer                               3 months ago
trainer_state.json          521 kB          Add trainer state (metric: 1.6000)             3 months ago