# Llama-Guard-3-1B-GGUF-INT8 (Optimized for Android)
This repository provides a quantized INT8 (Q8_0) version of Meta's Llama-Guard-3-1B, converted to the GGUF format for efficient on-device inference.
*Built with Llama*
Llama Guard 3-1B is a Llama-3.2-1B model fine-tuned for content safety classification. This GGUF version is a derivative work optimized for mobile and edge deployments where both accuracy and memory efficiency are critical.
## Why INT8 (Q8_0) instead of INT4?
While Meta provides an official 4-bit version, smaller models (1B) can experience a significant increase in False Positive Rates (FPR) when heavily quantized. For example, in German, the INT4 version's FPR can rise to ~14.5%, leading to excessive over-blocking.
This INT8 version retains approximately 99% of the full-precision model's classification performance, making it the sweet spot for developers who need a reliable safety layer on Android without frustrating users with false alarms.
## Model Details
- Base Model: meta-llama/Llama-Guard-3-1B
- Quantization: INT8 (Q8_0) via llama.cpp
- File Size: ~1.6 GB
- Intended Use: Input/Output moderation for LLM-based applications.
## Hazard Taxonomy & Policy
The model is trained to detect 13 categories of hazards based on the MLCommons taxonomy:
| Category | Description |
|---|---|
| S1 | Violent Crimes |
| S2 | Non-Violent Crimes |
| S3 | Sex-Related Crimes |
| S4 | Child Sexual Exploitation |
| S5 | Defamation |
| S6 | Specialized Advice (Medical/Legal) |
| S7 | Privacy Violations |
| S8 | Intellectual Property |
| S9 | Indiscriminate Weapons |
| S10 | Hate Speech |
| S11 | Suicide & Self-Harm |
| S12 | Sexual Content (Erotica) |
| S13 | Elections (Misinformation) |
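Llama Guard's raw completion is either `safe`, or `unsafe` followed on the next line by a comma-separated list of the violated category codes (e.g. `S1,S10`). A minimal sketch for decoding that output into the labels from the table above (the function name is illustrative, not part of any library):

```python
# Map MLCommons hazard codes to the labels in the table above.
HAZARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy Violations",
    "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate Speech",
    "S11": "Suicide & Self-Harm",
    "S12": "Sexual Content",
    "S13": "Elections",
}

def parse_guard_output(text: str) -> tuple[bool, list[str]]:
    """Decode a Llama Guard completion into (is_safe, violated_categories)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or lines[0].lower() == "safe":
        return True, []
    # Second line carries the category codes when the verdict is "unsafe".
    codes = lines[1].split(",") if len(lines) > 1 else []
    return False, [HAZARD_CATEGORIES.get(c.strip(), c.strip()) for c in codes]
```

In an app, this is typically the point where you decide whether to block, redact, or log the flagged turn.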
## Usage
### Prompt Template
To ensure correct classification, you must use the official Llama-Guard chat template; note that the full template also embeds the task instruction and the category list inside the user turn:
```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

[YOUR_INPUT_HERE]<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
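The skeleton above can be filled programmatically before handing the prompt to your inference runtime. A minimal sketch, assuming single-turn input moderation (the constant and function names are illustrative):

```python
# Skeleton of the Llama-Guard chat template shown above.
GUARD_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "{user_input}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

def build_guard_prompt(user_input: str) -> str:
    """Wrap the text to be classified in the Llama-Guard special tokens."""
    return GUARD_TEMPLATE.format(user_input=user_input)
```

The resulting string is what you pass as the raw prompt (with special-token parsing enabled) to llama.cpp or a binding such as llama-android or llama.rn.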
## License & Terms
This model is licensed under the Llama 3.2 Community License Agreement.
### Attribution
"Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."
### Acceptable Use Policy
You must comply with Meta's Acceptable Use Policy. By using this model, you agree not to use it for illegal activities, harassment, or any purposes prohibited by Meta's guidelines.
## Citation
If you use this model, please cite the original model card:
```bibtex
@misc{metallamaguard3,
  author = {Llama Team, AI @ Meta},
  title = {The Llama 3 Family of Models},
  howpublished = {\url{https://github.com/meta-llama/PurpleLlama/blob/main/Llama-Guard3/1B/MODEL_CARD.md}},
  year = {2024}
}
```
Quantization performed by: Thorge Mrowinski
Format: GGUF (Compatible with llama-android, llama.rn, and llama.cpp)