Llama-Guard-3-1B-GGUF-INT8 (Optimized for Android)

This repository provides a quantized INT8 (Q8_0) version of Meta's Llama-Guard-3-1B, converted to the GGUF format for efficient on-device inference.

Built with Llama

Llama Guard 3-1B is a Llama-3.2-1B model fine-tuned for content safety classification. This GGUF version is a derivative work optimized for mobile and edge deployments where accuracy and memory efficiency are critical.

Why INT8 (Q8_0) instead of INT4?

While Meta provides an official 4-bit version, small models (1B) can suffer a significant increase in false positive rate (FPR) when heavily quantized. For example, the INT4 version's FPR on German content can rise to ~14.5%, leading to excessive over-blocking.

This INT8 version retains approximately 99% of the original model's precision, making it the sweet spot for developers who need a reliable safety layer on Android without frustrating users with false alarms.

Model Details

  • Base Model: meta-llama/Llama-Guard-3-1B
  • Quantization: INT8 (Q8_0) via llama.cpp
  • File Size: ~1.6 GB
  • Intended Use: Input/Output moderation for LLM-based applications.

Hazard Taxonomy & Policy

The model is trained to detect 13 categories of hazards based on the MLCommons taxonomy:

| Category | Description |
|----------|-------------|
| S1 | Violent Crimes |
| S2 | Non-Violent Crimes |
| S3 | Sex-Related Crimes |
| S4 | Child Sexual Exploitation |
| S5 | Defamation |
| S6 | Specialized Advice (Medical/Legal) |
| S7 | Privacy Violations |
| S8 | Intellectual Property |
| S9 | Indiscriminate Weapons |
| S10 | Hate Speech |
| S11 | Suicide & Self-Harm |
| S12 | Sexual Content (Erotica) |
| S13 | Elections (Misinformation) |
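When the model flags content, it emits the verdict followed by a comma-separated list of the category codes above (e.g. `unsafe` then `S10`). A minimal sketch of parsing that output into a verdict and human-readable categories; the helper name and exact parsing logic are our illustration, not part of the model's API:

```python
# Hypothetical helper: map Llama Guard's raw completion ("safe" or
# "unsafe\nS1,S10") to a verdict plus named hazard categories.
# The category names mirror the taxonomy table above.
HAZARDS = {
    "S1": "Violent Crimes", "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes", "S4": "Child Sexual Exploitation",
    "S5": "Defamation", "S6": "Specialized Advice (Medical/Legal)",
    "S7": "Privacy Violations", "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons", "S10": "Hate Speech",
    "S11": "Suicide & Self-Harm", "S12": "Sexual Content (Erotica)",
    "S13": "Elections (Misinformation)",
}

def parse_guard_output(text: str):
    """Return ('safe', []) or ('unsafe', [(code, name), ...])."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or lines[0].lower() != "unsafe":
        return ("safe", [])
    codes = lines[1].split(",") if len(lines) > 1 else []
    return ("unsafe", [(c.strip(), HAZARDS.get(c.strip(), "Unknown"))
                       for c in codes])
```

This keeps the safety decision explicit in application code: anything other than a leading `unsafe` line is treated as safe.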

Usage

Prompt Template

To ensure correct classification, you must use the official Llama-Guard template:

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

[YOUR_INPUT_HERE]<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
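A minimal sketch of filling this template for a single user message before passing it to your inference runtime; the function name is our own, and `[YOUR_INPUT_HERE]` is replaced verbatim with the text to classify:

```python
def build_guard_prompt(user_message: str) -> str:
    # Fill the official template shown above. The special tokens must
    # appear exactly as written; only the user message is substituted.
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>"
    )
```

Feed the resulting string to the model with special-token parsing enabled, and read the completion as the safety verdict.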

License & Terms

This model is licensed under the Llama 3.2 Community License Agreement.

Attribution

"Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."

Acceptable Use Policy

You must comply with Meta's Acceptable Use Policy. By using this model, you agree not to use it for illegal activities, harassment, or any purposes prohibited by Meta's guidelines.

Citation

If you use this model, please cite the original Llama 3 family:

@misc{metallamaguard3,
  author = {Llama Team, AI @ Meta},
  title = {The Llama 3 Family of Models},
  howpublished = {\url{https://github.com/meta-llama/PurpleLlama/blob/main/Llama-Guard3/1B/MODEL_CARD.md}},
  year = {2024}
}

Quantization performed by: Thorge Mrowinski

Format: GGUF (Compatible with llama-android, llama.rn, and llama.cpp)
