ARGUS - Aviation Cybersecurity Expert LLM
ARGUS is a fine-tuned Qwen2.5-14B-Instruct model specialized in aviation cybersecurity. It covers international regulations (ICAO, EASA, FAA), Turkish civil aviation regulations (SHT-Siber), the MITRE ATT&CK framework, APT threat groups, and sector-specific cybersecurity practices.
Model Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Method | QLoRA 4-bit (Unsloth) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Data | 10,830 samples (regulatory, MITRE, APT, general CTI) |
| Epochs | 1 |
| Eval Loss | 1.068 (best) |
| Languages | Turkish, English |
Training Data Distribution
| Category | Samples | Weight | Percentage |
|---|---|---|---|
| Authority (ICAO, EASA, SHT-Siber) | 1,947 | 3x | 48.1% |
| MITRE ATT&CK Groups | 1,166 | 2x | 19.4% |
| APT Reports | 2,286 | 1x | 19.1% |
| General CTI | 1,558 | 1x | 13.0% |
| Negatives (anti-hallucination) | 50 | 1x | 0.4% |
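The card does not say how the Percentage column was derived. One plausible reading, sketched below, is that Weight is an oversampling multiplier and Percentage is each category's share of the weighted mix. Under that assumption the shares come out within about 0.3 percentage points of the table's figures (for example, roughly 48.4% for the Authority category), so treat this as an approximation of the mix rather than the author's exact calculation.

```python
# Sample counts and weights copied from the table above.
# Assumption: "Weight" means each sample is repeated that many times.
categories = {
    "Authority (ICAO, EASA, SHT-Siber)": (1947, 3),
    "MITRE ATT&CK Groups": (1166, 2),
    "APT Reports": (2286, 1),
    "General CTI": (1558, 1),
    "Negatives (anti-hallucination)": (50, 1),
}


def weighted_shares(cats):
    """Return (effective_total, {category: percent of the weighted mix})."""
    effective = {name: count * weight for name, (count, weight) in cats.items()}
    total = sum(effective.values())
    return total, {name: 100.0 * value / total for name, value in effective.items()}


total, shares = weighted_shares(categories)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```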
Recommended System Prompt
Sen ARGUS, bir havacılık siber güvenlik uzmanısın. ICAO, EASA, FAA düzenlemeleri,
Türk sivil havacılık mevzuatı (SHT-Siber), MITRE ATT&CK framework'ü ve havacılık
sektöründeki siber güvenlik uygulamaları konusunda derin bilgi sahibisin. Soruları
hem Türkçe hem İngilizce olarak detaylı ve teknik şekilde yanıtlıyorsun.
Benchmark & RAG Performance
This model achieves its best performance when combined with a RAG (Retrieval-Augmented Generation) pipeline. Fine-tuning teaches the model domain expertise, terminology, and response format, while RAG provides grounded, factual information from source documents.
Benchmark: 4-Configuration Comparison (10 Questions)
| Configuration | Correct | Hallucination | Wrong |
|---|---|---|---|
| Base Qwen (No RAG) | 1/10 | 3/10 | 6/10 |
| Base Qwen + RAG | 7/10 | 1/10 | 2/10 |
| ARGUS (No RAG) | 3/10 | 4/10 | 3/10 |
| ARGUS + RAG | 10/10 | 0/10 | 0/10 |
Detailed Question-by-Question Results
| # | Question | Base Qwen (No RAG) | Base Qwen + RAG | ARGUS (No RAG) | ARGUS + RAG |
|---|---|---|---|---|---|
| 1 | APT28 aviation TTPs | Generic, with spelling errors | "No information" | Detailed TTP analysis | "No information" |
| 2 | SHT-Siber reporting deadlines | "Managed by THK" (wrong) | Article 64.1, urgency | Vague | 15 business days, 3-monthly, EK-14 |
| 3 | MuddyWater operations in Turkey | Generic, superficial | Detailed on spear phishing | Includes MITRE TTPs | MOIS, MERCURY, detailed |
| 4 | EASA IS.I.OR.230 | "Software security" (wrong) | "We can guess" | Wrong | ISO 27001 controls |
| 5 | Volt Typhoon LotL techniques | Mistook it for the LoL game, plus Chinese text | Netsh, LOLBins | "South Korea" (wrong) | PRC, OT, detailed |
| 6 | ICAO Annex 17 Article 4.9 | "Air base" (fabricated) | Vague | Fabricated | SMS requirement |
| 7 | Boeing CyberShield 3000 (*) | "I don't know", but guessed anyway | "No information", plus Chinese text | HALLUCINATION | "No information" (clean) |
| 8 | APT-TR-7 (*) | HALLUCINATION (fabricated) | "No information" | HALLUCINATION | "No information" (clean) |
| 9 | PROMETHIUM malware | "CSIRT group" (completely wrong) | Truvasys, StrongPity | Havex (wrong) | StrongPity, correct |
| 10 | APT attacks on Turkish airports | Generic; nonsensical term "Ağ Salıncakları" ("network swings") | "No information" | Fabricated | "No information" (clean) |
(*) Anti-hallucination test questions: these are fictional entities that do not exist.
(**) "No information available" responses on unanswerable questions are counted as correct; honest refusal is preferred over hallucination.
Key findings:
- ARGUS + RAG scores 10/10 with zero hallucinations: it either answers correctly or honestly says "no information available"
- RAG alone lifts the base model from 1/10 to 7/10 correct, but it still hallucinates on 1 of 10 questions
- ARGUS alone (3/10 correct) picks up domain terminology and response format but hallucinates on 4 of 10 questions without grounding data
- Base Qwen without RAG shows little aviation cybersecurity knowledge (1/10 correct); it mistook "LotL" (living off the land) in the Volt Typhoon question for the game League of Legends
Recommended RAG Setup
- Vector DB: Qdrant
- Embedding Model: intfloat/multilingual-e5-base (Turkish + English)
- LLM Server: llama-server (llama.cpp) with Q5_K_M GGUF
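The card names the components but not how they are wired together. The following is a minimal sketch of one way to do it, not the author's pipeline: it indexes passages in an in-memory Qdrant collection using `intfloat/multilingual-e5-base` embeddings, retrieves the top matches for a question, and builds a chat prompt that asks for an honest refusal when the context has no answer. The collection name, the helper names `make_retriever` and `build_messages`, and the wording of the grounding instruction are all assumptions made for illustration. It requires the `qdrant-client` and `sentence-transformers` packages.

```python
COLLECTION = "argus_docs"  # hypothetical collection name
EMBED_MODEL = "intfloat/multilingual-e5-base"
EMBED_DIM = 768  # embedding size of multilingual-e5-base


def make_retriever(passages):
    """Index passages in an in-memory Qdrant collection; return a search function."""
    # Imported here so build_messages() can be used without these packages installed.
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer(EMBED_MODEL)
    client = QdrantClient(":memory:")
    client.create_collection(
        collection_name=COLLECTION,
        vectors_config=VectorParams(size=EMBED_DIM, distance=Distance.COSINE),
    )
    # E5 models expect a "passage: " prefix on documents and "query: " on queries.
    vectors = encoder.encode(
        [f"passage: {p}" for p in passages], normalize_embeddings=True
    )
    client.upsert(
        collection_name=COLLECTION,
        points=[
            PointStruct(id=i, vector=vec.tolist(), payload={"text": text})
            for i, (vec, text) in enumerate(zip(vectors, passages))
        ],
    )

    def retrieve(question, k=3):
        """Return the text of the k passages closest to the question."""
        query_vec = encoder.encode(f"query: {question}", normalize_embeddings=True)
        result = client.query_points(
            collection_name=COLLECTION, query=query_vec.tolist(), limit=k
        )
        return [hit.payload["text"] for hit in result.points]

    return retrieve


def build_messages(question, passages, system_prompt):
    """Place retrieved passages before the question and ask for honest refusal."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    user = (
        "Answer using only the context below. If the context does not contain "
        "the answer, say that no information is available.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user},
    ]
```

The messages returned by `build_messages` can be sent to any OpenAI-compatible chat endpoint, such as the one llama-server exposes. A persistent Qdrant instance would replace `QdrantClient(":memory:")` in a real deployment.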
Usage
With Transformers + PEFT
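from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen2.5-14B-Instruct"

# Load the base model in its checkpoint dtype, then attach the ARGUS LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, "yunusshin/argus-qwen25-14b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# System: "You are ARGUS, an aviation cybersecurity expert."
# User: "What are the ISMS requirements under EASA Part-IS?"
messages = [
    {"role": "system", "content": "Sen ARGUS, bir havacılık siber güvenlik uzmanısın."},
    {"role": "user", "content": "EASA Part-IS kapsamında ISMS gereksinimleri nelerdir?"},
]

# Render the chat template to text, then tokenize and move to the model's device.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True makes the temperature setting take effect.
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))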
With GGUF (llama-server / Ollama)
A Q5_K_M GGUF quantization (9.8 GB) is also available in this repository.
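# llama-server (--host 0.0.0.0 listens on all network interfaces)
llama-server --model argus-q5_k_m.gguf --host 0.0.0.0 --port 8080 --ctx-size 4096 --n-gpu-layers 99
# Ollama (expects a Modelfile in the current directory whose FROM line points at the GGUF file)
ollama create argus -f Modelfile
ollama run argus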
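Once llama-server is running, it can be queried through its OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library and assumes the server from the command above is reachable at `http://localhost:8080`. The helper names `build_payload` and `ask_argus` are illustrative, and the sampling values mirror the Transformers example rather than any setting recommended by the author.

```python
import json
import urllib.request

# Assumes llama-server is running locally on port 8080.
BASE_URL = "http://localhost:8080/v1"


def build_payload(question, system_prompt, max_tokens=512, temperature=0.7):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def ask_argus(question, system_prompt, base_url=BASE_URL, timeout=120):
    """POST the question to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(question, system_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `ask_argus(question, system_prompt)` with the recommended system prompt returns the model's reply as a string. A failed connection raises `urllib.error.URLError`, which usually means the server is not running or is listening on a different port.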
Limitations
- Without RAG, the model may hallucinate on topics outside its training data
- Designed specifically for aviation cybersecurity; general cybersecurity knowledge is inherited from the base model
- Regulation article numbers and dates should always be verified against official sources
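To make the last point easier to act on, a post-processing step could pull regulation references out of each answer so a reviewer can check them against the official text. The helper below is a hypothetical sketch, not part of the model or its pipeline: the patterns cover only the citation styles that appear in this card's benchmark table (such as "Madde 64.1", "Article 4.9", "Annex 17", "EK-14", and "IS.I.OR.230") and will miss other formats.

```python
import re

# Illustrative patterns only; not an exhaustive list of citation styles.
CITATION_PATTERNS = [
    r"\b(?:Madde|Article)\s+\d+(?:\.\d+)*",  # e.g. "Madde 64.1", "Article 4.9"
    r"\bAnnex\s+\d+\b",                       # e.g. "Annex 17"
    r"\bEK-\d+\b",                            # e.g. "EK-14"
    r"\bIS\.[A-Z]+\.[A-Z]+\.\d+\b",           # e.g. "IS.I.OR.230"
]
CITATION_RE = re.compile("|".join(CITATION_PATTERNS))


def citations_to_verify(answer):
    """Return regulation references found in an answer, in order, without duplicates."""
    seen = []
    for match in CITATION_RE.finditer(answer):
        reference = match.group(0)
        if reference not in seen:
            seen.append(reference)
    return seen
```

An empty result does not mean the answer is safe to trust; it only means no reference matched these particular patterns.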
Training Infrastructure
- Hardware: NVIDIA DGX Spark (GB10 Blackwell), 119.6 GB unified memory
- Framework: Unsloth + TRL (SFTTrainer)
Author
Yunus Şahin
License
Apache 2.0 (following the base model license)
