CyberOps-Mistral-7B-LoRA
A LoRA fine-tuned adapter for Mistral 7B Instruct v0.3, specialized for cybersecurity IT operations reasoning: dual-use command intent analysis, failure diagnosis, security scope assessment, and cross-shell translation.
Adapter only: requires `mistralai/Mistral-7B-Instruct-v0.3` as the base model.
Model Details
| Field | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Adapter type | LoRA (PEFT) |
| Trainable parameters | 41,943,040 (1.10% of base) |
| LoRA rank | 16 |
| Training epochs | 3 |
| Training records | 3,329 |
| Precision | fp16 (base) / fp32 (training) |
| Framework | HuggingFace Transformers + PEFT + TRL |
| Hardware | NVIDIA RTX 4070 Laptop GPU (8GB VRAM) |
| Training time | ~8.5 hours per run |
| Training dataset | dpevzner/Cybersecurity_Reasoning_Dataset |
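The trainable-parameter count in the table can be reproduced from the LoRA rank. The sketch below assumes the adapter targets all seven linear projections in every decoder layer; the target modules are not listed in this card, so treat that as an assumption. Layer shapes are those of the public Mistral 7B configuration (hidden size 4096, 8 key/value heads of dimension 128, MLP width 14336, 32 layers).

```python
# Sketch: count LoRA parameters at rank 16 on Mistral 7B, ASSUMING (not confirmed
# by this card) that every attention and MLP projection carries an adapter.
RANK = 16
HIDDEN = 4096       # model hidden size
KV_DIM = 8 * 128    # 8 key/value heads x head dim 128 (grouped-query attention)
MLP = 14336         # feed-forward intermediate size
LAYERS = 32

# (in_features, out_features) of each adapted linear layer in one decoder block
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA adds two matrices per layer: A (rank x in) and B (out x rank)
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * LAYERS
print(f"{total:,}")  # 41,943,040
```

Under that assumption the result matches the reported 41,943,040. The percentage of the base model is not reproduced here because it depends on how the base parameters are counted.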
Intended Use
This model is designed for offline, air-gapped cybersecurity operations environments. It is a domain-specific reasoning assistant for IT security testing.
Primary capabilities
- Dual-use command intent analysis: Given an ambiguous security command (e.g., `Get-Process lsass`, `net user /domain`, `Get-ADUser`), the model produces structured benign and malicious interpretations with specific contextual reasoning.
- Failure diagnosis: Classifies shell and tool failures as `privilege_failure`, `syntax_failure`, `environment_mismatch`, `dependency_absence`, or `stale_documentation`.
- Security scope violation identification: Flags commands that exceed authorized scope and recommends compliant alternatives.
- Tool correctness: Recommends correct PowerShell, CMD, and bash commands for security-relevant tasks including registry queries, event log analysis, firewall management, and process memory analysis.
- Cross-shell translation: Converts security commands between PowerShell, bash, and CMD with behavioral parity notes.
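When the failure-diagnosis output is consumed programmatically, it can help to validate the label against the five types listed above. The sketch below assumes the label appears as a plain string somewhere in the response; the `FailureType` enum and `parse_failure_type` helper are hypothetical names for illustration, not part of this repository, and the model's actual output format may differ.

```python
from enum import Enum
from typing import Optional


class FailureType(str, Enum):
    """The five failure categories named in this card."""
    PRIVILEGE_FAILURE = "privilege_failure"
    SYNTAX_FAILURE = "syntax_failure"
    ENVIRONMENT_MISMATCH = "environment_mismatch"
    DEPENDENCY_ABSENCE = "dependency_absence"
    STALE_DOCUMENTATION = "stale_documentation"


def parse_failure_type(response: str) -> Optional[FailureType]:
    """Return the known failure type that appears earliest in the response, if any."""
    text = response.lower()
    hits = [(text.find(ft.value), ft) for ft in FailureType if ft.value in text]
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[0])[1]


# Example with a made-up response string
label = parse_failure_type("Type: privilege_failure. The command needs elevation.")
print(label.value if label else "no recognized failure type")
```

Returning `None` for unrecognized output lets the caller decide whether to retry or flag the response for review.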
Out of scope
- General threat intelligence, malware analysis, CVE assessment, and network forensics are not covered in the current training corpus (planned for Phase 3 expansion).
- Not suitable for use as a general-purpose assistant.
- Not designed for or tested on offensive security automation.
Training Data
The training corpus contains 3,329 synthetic records across three curriculum epochs:
| Epoch | Folders | Records | Content |
|---|---|---|---|
| 1 | 03_toolknowledge, 05_detection_rules | 1,070 | Tool syntax, SIGMA rules, network knowledge |
| 2 | 04_goldens, 05_seedcases | 1,331 | Golden responses, execution traces, seedcases |
| 3 | 06_contrast_pairs | 928 | Contrast pairs, terminology, format-repair, gap-fill records |
No real incident data, PII, or proprietary content is included.
The training corpus is derived from and continues to be updated alongside the dpevzner/Cybersecurity_Reasoning_Dataset on HuggingFace, which is maintained as an active part of this project.
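As a quick consistency check, the per-epoch record counts from the table can be summed and compared against the reported corpus size. This is only arithmetic on the published numbers, not a description of the training pipeline.

```python
# Record counts per curriculum epoch, copied from the Training Data table.
records_per_epoch = {1: 1070, 2: 1331, 3: 928}

total = sum(records_per_epoch.values())
assert total == 3329, "per-epoch counts should add up to the reported corpus size"

# Share of the corpus assigned to each epoch, as a percentage.
shares = {epoch: round(100 * n / total, 1) for epoch, n in records_per_epoch.items()}
print(total, shares)
```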
Benchmark Performance
Evaluated on a 50-item rubric-scored benchmark across 5 families:
| Family | Items | Baseline (Run 6) | Run 13 | Delta |
|---|---|---|---|---|
| tool_correctness | 10 | ~73 avg | ~81 avg | +8 |
| cross_shell_translation | 10 | ~71 avg | ~72 avg | +1 |
| failure_diagnosis | 10 | ~30 avg | ~70 avg | +40 |
| ambiguity_reasoning | 10 | ~93 avg | ~84 avg | -9 |
| safety_scope | 10 | ~10 avg | ~30 avg | +20 |
| Overall | 50 | 56.28 | 69.43 | +13.15 |
The "Baseline (Run 6)" column refers to an earlier training run. The untuned Mistral 7B Instruct v0.3 base model scores 27.43 avg.
Trained on 3,329 records across a structured three-epoch curriculum. Run 13 represents the best-performing checkpoint, trained from a clean base adapter to prevent format interference from prior synthesis batches.
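The deltas in the table follow directly from the two score columns. The short sketch below recomputes them from the published values; the per-family numbers are the approximate averages shown in the table, so the family deltas are approximate as well.

```python
# (Run 6, Run 13) approximate family averages, copied from the benchmark table.
family_scores = {
    "tool_correctness": (73, 81),
    "cross_shell_translation": (71, 72),
    "failure_diagnosis": (30, 70),
    "ambiguity_reasoning": (93, 84),
    "safety_scope": (10, 30),
}

deltas = {family: run13 - run6 for family, (run6, run13) in family_scores.items()}

# Overall scores as reported: Run 6, Run 13, and the untuned base model.
RUN6_OVERALL, RUN13_OVERALL, UNTUNED_OVERALL = 56.28, 69.43, 27.43
overall_delta = round(RUN13_OVERALL - RUN6_OVERALL, 2)
gain_over_untuned = round(RUN13_OVERALL - UNTUNED_OVERALL, 2)

for family, delta in deltas.items():
    print(f"{family:26s} {delta:+d}")
print(f"overall vs Run 6:   {overall_delta:+.2f}")
print(f"overall vs untuned: {gain_over_untuned:+.2f}")
```

The only family that regresses between the two runs is `ambiguity_reasoning`, which is consistent with the item-level notes under Limitations.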
Usage
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "mistralai/Mistral-7B-Instruct-v0.3"
ADAPTER = "dpevzner/CyberOps_Mistral_7B_LoRA"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Load the base model in fp16, capping GPU memory and offloading the rest to CPU.
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    dtype=torch.float16,  # older transformers releases name this argument torch_dtype
    device_map="auto",
    max_memory={0: "5GiB", "cpu": "20GiB"},
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

prompt = """### Instruction:
Analyze the ambiguous command: `Get-Process lsass`
### Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False, use_cache=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
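For the offline, air-gapped environments this adapter targets, the same loading steps can be pointed at local directories. The following is a minimal sketch under stated assumptions: both repositories have already been copied onto the machine, and the two paths are hypothetical placeholders. `HF_HUB_OFFLINE` and `local_files_only` are existing Hugging Face options; the `load_offline` helper is an illustrative name.

```python
import os

# Set before transformers/huggingface_hub are imported, since the hub client
# reads this variable at import time.
os.environ["HF_HUB_OFFLINE"] = "1"

# Hypothetical local paths; adjust to wherever the files were copied.
BASE_DIR = "/models/Mistral-7B-Instruct-v0.3"
ADAPTER_DIR = "/models/CyberOps_Mistral_7B_LoRA"


def load_offline(base_dir: str = BASE_DIR, adapter_dir: str = ADAPTER_DIR):
    """Load tokenizer, base model, and adapter from local directories only."""
    # Imported lazily so the environment variable above is already in place.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_dir, local_files_only=True)
    base = AutoModelForCausalLM.from_pretrained(
        base_dir,
        dtype=torch.float16,  # torch_dtype on older transformers releases
        device_map="auto",
        local_files_only=True,
    )
    # A local directory path makes PEFT read the adapter from disk.
    model = PeftModel.from_pretrained(base, adapter_dir)
    return tokenizer, model.eval()
```

Calling `tokenizer, model = load_offline()` then replaces the two `from_pretrained` calls in the snippet above; generation works the same way.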
Training Curriculum Design
The three-epoch curriculum is carefully ordered to prevent interference between learning phases:
- Epoch 1 teaches foundational tool knowledge and detection rules: syntax, correct commands, platform-specific behavior
- Epoch 2 teaches execution context: golden responses, ambiguity reasoning, security scope awareness
- Epoch 3 teaches contrast and refinement: dual-use contrast pairs, terminology precision, gap repair
Critical: New synthesis batches must always be placed in Epoch 3 (`06_contrast_pairs`). Placing new records in Epoch 2 caused catastrophic regression in ambiguity reasoning (Run 7: from 8/10 PASS to 0/10 PASS). The curriculum order is a hard constraint, not a preference.
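The placement rule above can be enforced with a small guard before training. This is a hypothetical sketch: the folder names come from the Training Data table, while `CURRICULUM`, `epoch_of`, `check_new_batch`, and `training_order` are illustrative names rather than the project's actual tooling.

```python
from typing import Dict, List

# Epoch -> corpus folders, in the order given in the Training Data table.
CURRICULUM: Dict[int, List[str]] = {
    1: ["03_toolknowledge", "05_detection_rules"],
    2: ["04_goldens", "05_seedcases"],
    3: ["06_contrast_pairs"],
}


def epoch_of(folder: str) -> int:
    """Return the curriculum epoch a folder belongs to."""
    for epoch, folders in CURRICULUM.items():
        if folder in folders:
            return epoch
    raise KeyError(f"unknown curriculum folder: {folder}")


def check_new_batch(folder: str) -> None:
    """Reject new synthesis batches placed anywhere other than Epoch 3."""
    epoch = epoch_of(folder)
    if epoch != 3:
        raise ValueError(
            f"new batches must go in Epoch 3 (06_contrast_pairs), got {folder} (Epoch {epoch})"
        )


def training_order() -> List[str]:
    """Flatten the curriculum into the folder order used for training."""
    return [folder for epoch in sorted(CURRICULUM) for folder in CURRICULUM[epoch]]


check_new_batch("06_contrast_pairs")  # passes silently
print(training_order())
```

Failing loudly at data-preparation time is cheaper than discovering a regression like Run 7 after a multi-hour training run.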
Limitations
- Ambiguity reasoning scores vary across items: ar_009 and ar_010 consistently score below baseline, indicating the model produces the correct structure but misses specific required keyword anchors for those scenarios.
- Safety scope detection is improving but inconsistent: ss_001, ss_004, ss_006, ss_007, ss_010 remain at zero across runs.
- GPU telemetry (util%, temp, power) reads zero due to an NVML polling bug in the training environment; training functioned correctly despite this.
- Evaluated only on synthetic benchmarks. Real-world performance on live security operations tasks has not been measured.
Ethical Considerations
This model is trained to analyze dual-use commands, that is, commands that have both legitimate administrative uses and potential malicious applications. It is designed for defensive security operations: helping analysts understand whether observed commands are likely benign or malicious in context.
The model should not be used to generate offensive security tooling, attack automation, or to assist in unauthorized access to systems. All training data was synthesized under operator oversight with explicit content governance controls.
Citation
@misc{cyberops-mistral-7b-lora-2026,
title = {CyberOps-Mistral-7B-LoRA: A LoRA Fine-Tuned Adapter for Cybersecurity IT Operations Reasoning},
year = {2026},
note = {LoRA adapter for Mistral 7B Instruct v0.3, trained on 3329 synthetic cybersecurity records
across a three-epoch curriculum covering tool correctness, ambiguity reasoning,
failure diagnosis, security scope analysis, and cross-shell translation.}
}
Model Card Contact
Training environment: Alienware M16R2 / RTX 4070 Laptop GPU / Windows 11.