Llama-Guard-3-1B-GGUF-INT8 (Optimized for Android)

This repository provides a quantized INT8 (Q8_0) version of Meta's Llama-Guard-3-1B, converted to the GGUF format for efficient on-device inference.

Built with Llama

Llama Guard 3-1B is a Llama-3.2-1B model fine-tuned for content safety classification. This GGUF version is a derivative work optimized for mobile and edge deployments where accuracy and memory efficiency are critical.

Why INT8 (Q8_0) instead of INT4?

While Meta provides an official 4-bit version, small models (1B) can suffer a significant increase in false positive rate (FPR) when heavily quantized. For example, the INT4 version's FPR on German content can rise to ~14.5%, leading to excessive over-blocking.

This INT8 version retains approximately 99% of the original model's precision, making it the sweet spot for developers who need a reliable safety layer on Android without frustrating users with false alarms.

Model Details

  • Base Model: meta-llama/Llama-Guard-3-1B
  • Quantization: INT8 (Q8_0) via llama.cpp
  • File Size: ~1.6 GB
  • Intended Use: Input/Output moderation for LLM-based applications.

Hazard Taxonomy & Policy

The model is trained to detect 13 categories of hazards based on the MLCommons taxonomy:

| Category | Description |
|----------|-------------|
| S1 | Violent Crimes |
| S2 | Non-Violent Crimes |
| S3 | Sex-Related Crimes |
| S4 | Child Sexual Exploitation |
| S5 | Defamation |
| S6 | Specialized Advice (Medical/Legal) |
| S7 | Privacy Violations |
| S8 | Intellectual Property |
| S9 | Indiscriminate Weapons |
| S10 | Hate Speech |
| S11 | Suicide & Self-Harm |
| S12 | Sexual Content (Erotica) |
| S13 | Elections (Misinformation) |
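When the model flags content, it emits the verdict followed by a comma-separated list of the category codes above (e.g. `unsafe` then `S10`). A minimal sketch of parsing that output into a verdict and human-readable categories; the helper name and exact parsing logic are our illustration, not part of the model's API:

```python
# Hypothetical helper: map Llama Guard's raw completion ("safe" or
# "unsafe\nS1,S10") to a verdict plus named hazard categories.
# The category names mirror the taxonomy table above.
HAZARDS = {
    "S1": "Violent Crimes", "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes", "S4": "Child Sexual Exploitation",
    "S5": "Defamation", "S6": "Specialized Advice (Medical/Legal)",
    "S7": "Privacy Violations", "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons", "S10": "Hate Speech",
    "S11": "Suicide & Self-Harm", "S12": "Sexual Content (Erotica)",
    "S13": "Elections (Misinformation)",
}

def parse_guard_output(text: str):
    """Return ('safe', []) or ('unsafe', [(code, name), ...])."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or lines[0].lower() != "unsafe":
        return ("safe", [])
    codes = lines[1].split(",") if len(lines) > 1 else []
    return ("unsafe", [(c.strip(), HAZARDS.get(c.strip(), "Unknown"))
                       for c in codes])
```

This keeps the safety decision explicit in application code: anything other than a leading `unsafe` line is treated as safe.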

Usage

Prompt Template

To ensure correct classification, you must use the official Llama-Guard template:

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

[YOUR_INPUT_HERE]<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
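A minimal sketch of filling this template for a single user message before passing it to your inference runtime; the function name is our own, and `[YOUR_INPUT_HERE]` is replaced verbatim with the text to classify:

```python
def build_guard_prompt(user_message: str) -> str:
    # Fill the official template shown above. The special tokens must
    # appear exactly as written; only the user message is substituted.
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>"
    )
```

Feed the resulting string to the model with special-token parsing enabled, and read the completion as the safety verdict.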

License & Terms

This model is licensed under the Llama 3.2 Community License Agreement.

Attribution

"Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."

Acceptable Use Policy

You must comply with Meta's Acceptable Use Policy. By using this model, you agree not to use it for illegal activities, harassment, or any purposes prohibited by Meta's guidelines.

Citation

If you use this model, please cite the original Llama 3 family:

@misc{metallamaguard3,
  author = {Llama Team, AI @ Meta},
  title = {The Llama 3 Family of Models},
  howpublished = {\url{https://github.com/meta-llama/PurpleLlama/blob/main/Llama-Guard3/1B/MODEL_CARD.md}},
  year = {2024}
}

Quantization performed by: Thorge Mrowinski

Format: GGUF (Compatible with llama-android, llama.rn, and llama.cpp)
