|
|
--- |
|
|
license: cc0-1.0 |
|
|
base_model: mlx-community/Qwen2.5-Coder-7B-Instruct-4bit |
|
|
tags: |
|
|
- gguf |
|
|
- cybersecurity |
|
|
- nist |
|
|
- security-controls |
|
|
- compliance |
|
|
- fine-tuned |
|
|
- llama-cpp |
|
|
language: |
|
|
- en |
|
|
quantized_by: ethanolivertroy |
|
|
--- |
|
|
|
|
|
# HackIDLE-NIST-Coder v1.1 (GGUF) |
|
|
|
|
|
**The most comprehensive NIST cybersecurity model** in GGUF format - Compatible with llama.cpp, Ollama, LM Studio, and text-generation-webui. |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
Fine-tuned on 530,912 examples from 596 NIST publications. Version 1.1 includes: |
|
|
|
|
|
- **+7,206 training examples** (530,912 total) |
|
|
- **+28 new documents** (596 NIST publications) |
|
|
- **CSWP series**: CSF 2.0, Zero Trust Architecture, Post-Quantum Cryptography |
|
|
- **Improved quality**: Fixed 6,150 malformed DOI links, 0 broken link markers |
|
|
|
|
|
## Available Quantizations |
|
|
|
|
|
| Quantization | Size | Use Case | Description | |
|
|
|--------------|------|----------|-------------| |
|
|
| **F16** | ~14 GB | Reference Quality | Full precision, best quality | |
|
|
| **Q8_0** | ~7.5 GB | High Quality | Minimal quality loss | |
|
|
| **Q5_K_M** | ~5.1 GB | Balanced | Good quality/size trade-off | |
|
|
| **Q4_K_M** | ~4.4 GB | Recommended | Best speed/quality balance | |
|
|
|
|
|
**Recommended**: Start with **Q4_K_M** for best overall performance. |
|
|
|
|
|
## Training Data (v1.1) |
|
|
|
|
|
**Dataset**: [ethanolivertroy/nist-cybersecurity-training](https://huggingface.co/datasets/ethanolivertroy/nist-cybersecurity-training) |
|
|
|
|
|
**Coverage:** |
|
|
- **FIPS**: Cryptographic standards |
|
|
- **SP 800**: Security guidelines and controls |
|
|
- **SP 1800**: Practice guides |
|
|
- **IR**: Technical reports |
|
|
- **CSWP**: White Papers (CSF 2.0, Zero Trust, PQC, IoT, Privacy) β¨ NEW |
|
|
|
|
|
**Stats**: 530,912 examples β’ 596 documents β’ 61,480 working references |
|
|
|
|
|
## Installation |
|
|
|
|
|
### Ollama |
|
|
|
|
|
```bash |
|
|
# Pull from Ollama registry |
|
|
ollama pull etgohome/hackidle-nist-coder:v1.1 |
|
|
|
|
|
# Or create from GGUF |
|
|
ollama create hackidle-nist-coder -f Modelfile |
|
|
``` |
|
|
|
|
|
### LM Studio |
|
|
|
|
|
1. Open LM Studio |
|
|
2. Search for "hackidle-nist-coder" |
|
|
3. Download Q4_K_M or Q5_K_M quantization |
|
|
4. Load and chat |
|
|
|
|
|
### llama.cpp |
|
|
|
|
|
```bash |
|
|
# Clone llama.cpp |
|
|
git clone https://github.com/ggerganov/llama.cpp |
|
|
cd llama.cpp && make |
|
|
|
|
|
# Download model (Q4_K_M recommended) |
|
|
wget https://huggingface.co/ethanolivertroy/HackIDLE-NIST-Coder-v1.1-GGUF/resolve/main/hackidle-nist-coder-v1.1-q4_k_m.gguf |
|
|
|
|
|
# Run inference |
|
|
./llama-cli -m hackidle-nist-coder-v1.1-q4_k_m.gguf -p "What is Zero Trust Architecture?" |
|
|
``` |
|
|
|
|
|
### text-generation-webui |
|
|
|
|
|
1. Place GGUF file in `models/` directory |
|
|
2. Select model in UI |
|
|
3. Load and chat |
|
|
|
|
|
## Usage Examples |
|
|
|
|
|
### Ollama |
|
|
|
|
|
```bash |
|
|
ollama run etgohome/hackidle-nist-coder:v1.1 "Explain the CSF 2.0 GOVERN function" |
|
|
``` |
|
|
|
|
|
### Python (llama-cpp-python) |
|
|
|
|
|
```python |
|
|
from llama_cpp import Llama |
|
|
|
|
|
llm = Llama( |
|
|
model_path="hackidle-nist-coder-v1.1-q4_k_m.gguf", |
|
|
n_ctx=4096, |
|
|
n_threads=8 |
|
|
) |
|
|
|
|
|
response = llm("What are the core principles of Zero Trust Architecture in SP 800-207?", |
|
|
max_tokens=500) |
|
|
print(response['choices'][0]['text']) |
|
|
``` |
|
|
|
|
|
## Model Capabilities |
|
|
|
|
|
Trained on comprehensive NIST content: |
|
|
|
|
|
β
**Security Controls** (SP 800-53) |
|
|
β
**CSF 2.0** with GOVERN function |
|
|
β
**Zero Trust Architecture** (SP 800-207) |
|
|
β
**Risk Management Framework** (RMF) |
|
|
β
**Cloud Security** (SP 800-145, 800-146) |
|
|
β
**FIPS Cryptography** standards |
|
|
β
**Post-Quantum Cryptography** migration |
|
|
β
**Privacy Engineering** |
|
|
β
**Supply Chain Risk Management** |
|
|
β
**IoT Cybersecurity** |
|
|
|
|
|
## What's New in v1.1 |
|
|
|
|
|
**Added Content:** |
|
|
- CSF 2.0 (Cybersecurity Framework 2.0) |
|
|
- Zero Trust Architecture planning guidance |
|
|
- Post-Quantum Cryptography recommendations |
|
|
- IoT security and labeling |
|
|
- Privacy Framework v1.0 |
|
|
- Supply chain risk management case studies |
|
|
|
|
|
**Quality Improvements:** |
|
|
- Fixed 6,150 malformed DOI links |
|
|
- Removed 202 broken link markers |
|
|
- Validated 124,946 total links |
|
|
- Clean training data |
|
|
|
|
|
## System Requirements |
|
|
|
|
|
| Quantization | RAM Required | CPU/GPU | |
|
|
|--------------|-------------|---------| |
|
|
| Q4_K_M | 6 GB | CPU or GPU | |
|
|
| Q5_K_M | 7 GB | CPU or GPU | |
|
|
| Q8_0 | 10 GB | CPU or GPU | |
|
|
| F16 | 16 GB | GPU recommended | |
|
|
|
|
|
## Other Formats |
|
|
|
|
|
- **MLX**: [ethanolivertroy/HackIDLE-NIST-Coder-v1.1-MLX-4bit](https://huggingface.co/ethanolivertroy/HackIDLE-NIST-Coder-v1.1-MLX-4bit) (Apple Silicon) |
|
|
- **Ollama**: [etgohome/hackidle-nist-coder](https://ollama.com/etgohome/hackidle-nist-coder) |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Training data current as of October 2025 |
|
|
- May not reflect NIST publications released after training |
|
|
- 54.2% of references are broken links (cataloged for recovery) |
|
|
- Optimized for NIST-specific cybersecurity questions |
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex |
|
|
@misc{hackidle-nist-coder-v1.1-gguf, |
|
|
title={HackIDLE-NIST-Coder: NIST Cybersecurity Expert Model}, |
|
|
author={Troy, Ethan Oliver}, |
|
|
year={2025}, |
|
|
version={1.1}, |
|
|
format={GGUF}, |
|
|
url={https://huggingface.co/ethanolivertroy/HackIDLE-NIST-Coder-v1.1-GGUF} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
CC0 1.0 Universal (Public Domain) - All NIST publications are in the public domain. |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- NIST Computer Security Resource Center |
|
|
- Qwen2.5-Coder base model (Alibaba Cloud) |
|
|
- llama.cpp quantization (Georgi Gerganov) |
|
|
- MLX framework (Apple) |
|
|
|
|
|
--- |
|
|
|
|
|
**Version**: 1.1 |
|
|
**Release Date**: October 2025 |
|
|
**Training Dataset**: [nist-cybersecurity-training v1.1](https://huggingface.co/datasets/ethanolivertroy/nist-cybersecurity-training) |
|
|
**Format**: GGUF (compatible with llama.cpp ecosystem) |
|
|
|