kapilw25/llama3-8b-pku-GRPO-Instruct-SFT-Instruct
Tags: Transformers, Safetensors, PKU-Alignment/PKU-SafeRLHF, alignment, safety, grpo, llama-3
License: llama3.1
Branch: main
Size: 186 MB, 1 contributor
History: 5 commits
Latest commit: kapilw25, "Add trainer state (metric: 1.6000)" (f43df80, verified), 3 months ago
File                        Size           Last commit                                    Updated
.gitattributes              1.57 kB        Upload tokenizer                               3 months ago
README.md                   2.59 kB        Add model card with training hyperparameters   3 months ago
adapter_config.json         934 Bytes      CITA PBT BF16 Training (LoRA Adapter)          3 months ago
adapter_model.safetensors   168 MB (xet)   CITA PBT BF16 Training (LoRA Adapter)          3 months ago
special_tokens_map.json     335 Bytes      Upload tokenizer                               3 months ago
tokenizer.json              17.2 MB (xet)  Upload tokenizer                               3 months ago
tokenizer_config.json       50.6 kB        Upload tokenizer                               3 months ago
trainer_state.json          521 kB         Add trainer state (metric: 1.6000)             3 months ago