Badllama-3-8B

This repo holds the weights for Palisade Research's demonstration that the safety guardrails on open-weight models can be stripped in minutes of GPU time. See the Badllama 3 paper for additional background.

Access

Email the authors to request research access. We do not review access requests made on HuggingFace.

Branches

main mirrors ro_authors_awq and holds our best-performing model built with refusal orthogonalization
qlora holds the QLoRA-tuned model
reft holds the ReFT-tuned model