Ornstein-27B-GGUF

GGUF quantizations of DJLougen/Ornstein-27B — a reasoning-focused fine-tune of Qwen 3.5 27B trained on 1,229 high-quality reasoning traces curated through a custom Drift Diffusion Modeling (DDM) pipeline.

Support This Work

I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. All training compute is self-funded — balancing GPU costs against a student budget. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.

Support on Ko-fi



What Makes Ornstein Different

Unlike typical reasoning fine-tunes that rely on large volumes of synthetic data, Ornstein prioritizes quality over quantity:

  • Degenerate-reasoning detection: Identifies "fake" reasoning that mimics thought without substance (hedging, restating, circling)
  • Premium vs. degenerate split: 799 premium traces + 430 selected degenerate traces = 1,229 total
  • Separation quality: DDM AUC of 0.9705 between premium and degenerate reasoning, with 99.49% sensitivity
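
To make the sensitivity and false-positive-rate figures concrete, here is a minimal sketch of how both are computed for a threshold classifier. It assumes that premium traces are the positive class and that a higher drift score means "more premium"; the model card does not state either convention. The toy scores and the `threshold_metrics` helper are invented for illustration and are not the actual DDM pipeline or its data.

```python
def threshold_metrics(scores, is_premium, threshold):
    """Return (sensitivity, false_positive_rate) for `score >= threshold`.

    Assumption: premium is the positive class and higher scores are better.
    """
    tp = fp = fn = tn = 0
    for score, premium in zip(scores, is_premium):
        predicted_premium = score >= threshold
        if premium and predicted_premium:
            tp += 1
        elif premium:
            fn += 1
        elif predicted_premium:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)          # share of premium traces kept
    false_positive_rate = fp / (fp + tn)  # share of degenerate traces let through
    return sensitivity, false_positive_rate


# Toy scores, invented for this example only.
toy_scores = [2.1, 1.9, 1.5, 1.2, 1.6, 0.4, 0.9, 0.2]
toy_labels = [True, True, True, True, False, False, False, False]

sens, fpr = threshold_metrics(toy_scores, toy_labels, threshold=1.463)
print(f"sensitivity={sens:.2f}, false_positive_rate={fpr:.2f}")
```

Moving the threshold trades one metric against the other, which is why the card reports AUC (a threshold-free summary) alongside the two rates.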

The model uses <think>...</think> blocks for extended multi-phase reasoning with self-correction and verification before providing final answers.
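
When post-processing output, it is often useful to separate the reasoning from the final answer. The following is a small sketch, assuming the reasoning is wrapped in well-formed `<think>...</think>` pairs; `split_reasoning` is a hypothetical helper name, and an unclosed block (for example, when generation is cut off by the token limit) is not handled here.

```python
import re

# Non-greedy match so several think blocks are captured separately.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split raw model output into (reasoning, final_answer)."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


raw = "<think>2 + 2 = 4. Check: 4 - 2 = 2.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(answer)
```

Text without any think block passes through unchanged as the answer, with an empty reasoning string.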


Available Quantizations

Quantization   Size      Use Case
F16            53.8 GB   16-bit weights, no quantization loss
Q8_0           28.6 GB   Near-lossless, good for high-end consumer GPUs
Q6_K           22.1 GB   High quality
Q5_K_M         19.2 GB   Good balance
Q5_K_S         18.7 GB   Lighter variant
Q4_K_M         16.5 GB   Recommended: strong quality/size tradeoff
IQ4_XS         11.6 GB   Efficient 4-bit
Q3_K_L         14.3 GB   Lighter 3-bit
Q3_K_M         13.3 GB   Mid 3-bit
Q3_K_S         12.1 GB   Light 3-bit
Q2_K           10.7 GB   Minimal footprint
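
As a rough way to choose a file, the sketch below picks the largest quantization that fits a given memory budget, using the sizes listed above. The `pick_quant` helper and the 2 GB headroom for KV cache and runtime buffers are assumptions for illustration; real overhead depends on context length and backend, and file size is only a proxy for quality.

```python
# Sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "F16": 53.8, "Q8_0": 28.6, "Q6_K": 22.1, "Q5_K_M": 19.2, "Q5_K_S": 18.7,
    "Q4_K_M": 16.5, "IQ4_XS": 11.6, "Q3_K_L": 14.3, "Q3_K_M": 13.3,
    "Q3_K_S": 12.1, "Q2_K": 10.7,
}


def pick_quant(memory_gb, headroom_gb=2.0):
    """Return the largest quantization that fits, or None if none does.

    headroom_gb is a hypothetical allowance for KV cache and runtime
    buffers; the real requirement depends on context length and backend.
    """
    budget = memory_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for memory in (48, 24, 16, 8):
    print(memory, "GB ->", pick_quant(memory))
```

With these assumptions, a 24 GB budget lands on Q5_K_M rather than Q6_K, because Q6_K leaves less than 2 GB of headroom.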

Quick Start

llama.cpp

# Download a quantization (example: Q4_K_M)
huggingface-cli download DJLougen/Ornstein-27B-GGUF ornstein-27b-q4_k_m.gguf --local-dir .

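# Run with llama.cpp
# (-sys sets the system prompt in recent builds; older builds used -p for this in conversation mode)
./llama-cli -m ornstein-27b-q4_k_m.gguf \
  -sys "You are a helpful reasoning assistant." \
  --temp 0.6 -n 8192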

Ollama

# Create a Modelfile
cat <<EOF > Modelfile
FROM ./ornstein-27b-q4_k_m.gguf
PARAMETER temperature 0.6
PARAMETER num_predict 8192
SYSTEM "You are a helpful reasoning assistant."
EOF

ollama create ornstein -f Modelfile
ollama run ornstein

LM Studio

  1. Download the desired quantization from the Files tab
  2. Load it in LM Studio
  3. Set the context length to at least 8192 so long reasoning traces are not truncated

Recommended Settings

Parameter        Suggested Value
Temperature      0.6
Top-P            0.95
Max Tokens       8192
Repeat Penalty   1.1
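
The settings above can be passed to llama.cpp in one go. Here is a minimal sketch that builds the corresponding `llama-cli` argument list; the `llama_cli_args` helper and the `RECOMMENDED` dictionary are hypothetical names, and the binary path and model filename are taken from the Quick Start example.

```python
import shlex

# Values from the Recommended Settings table.
RECOMMENDED = {"temperature": 0.6, "top_p": 0.95, "max_tokens": 8192, "repeat_penalty": 1.1}


def llama_cli_args(model_path, settings=RECOMMENDED):
    """Build a llama-cli argument list from the recommended settings."""
    return [
        "./llama-cli",
        "-m", model_path,
        "--temp", str(settings["temperature"]),
        "--top-p", str(settings["top_p"]),
        "--repeat-penalty", str(settings["repeat_penalty"]),
        "-n", str(settings["max_tokens"]),
    ]


args = llama_cli_args("ornstein-27b-q4_k_m.gguf")
print(shlex.join(args))
```

The list can be handed to `subprocess.run(args)` to launch the model, or printed and pasted into a shell.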

Training Details

Parameter               Value
Base Model              unsloth/Qwen3.5-27B
Parameters              27B
Method                  LoRA (rank 32, alpha 32)
Dropout                 0.05
Epochs                  1
Learning Rate           1e-4 (cosine schedule, 10% warmup)
Max Sequence Length     8192
Micro Batch Size        1
Gradient Accumulation   4 steps
Weight Decay            0.01
LoRA Targets            q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Framework               Unsloth
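
To show what these hyperparameters imply, the sketch below derives the effective batch size, the approximate number of optimizer steps, and the learning-rate curve. It assumes a single device, linear warmup, cosine decay to zero, and rounding up for partial batches; none of these details are stated in the card, and the actual trainer may round differently.

```python
import math

NUM_EXAMPLES = 1229
MICRO_BATCH = 1
GRAD_ACCUM = 4
NUM_DEVICES = 1          # assumption: single-GPU run
PEAK_LR = 1e-4
WARMUP_FRACTION = 0.10
LORA_RANK, LORA_ALPHA = 32, 32

effective_batch = MICRO_BATCH * GRAD_ACCUM * NUM_DEVICES
total_steps = math.ceil(NUM_EXAMPLES / effective_batch)   # one epoch
warmup_steps = math.ceil(total_steps * WARMUP_FRACTION)
lora_scale = LORA_ALPHA / LORA_RANK                        # standard LoRA scaling factor


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch: {effective_batch}, steps: {total_steps}, warmup: {warmup_steps}")
print(f"LoRA scale (alpha / rank): {lora_scale}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the run is only a few hundred optimizer steps long, which fits the card's description of the fine-tune as shaping reasoning style rather than adding knowledge.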

Data Quality Metrics

Metric                    Value
Total Examples            1,229
Mean Thinking Depth       ~1,667 words
Self-correction Present   100% of traces
Verification Present      100% of traces
Exploration Present       100% of traces
Quality Gate Pass Rate    100%

Training Data Profile

  • Category Mix: Math (1,016), Code (124), Science (45), Logic (44)
  • Reasoning Depth: Premium traces average ~1,263 words of thinking vs ~281 for degenerate traces
  • Drift Score Threshold: 1.463 cleanly separates premium from degenerate traces
  • DDM AUC: 0.9705 | Sensitivity: 99.49% | False Positive Rate: ~5%
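
The category counts can be turned into shares with a few lines of arithmetic. This sketch uses only the numbers listed in this card and checks that the category mix and the premium/degenerate split both add up to 1,229; the variable names are illustrative.

```python
category_counts = {"Math": 1016, "Code": 124, "Science": 45, "Logic": 44}
split_counts = {"premium": 799, "degenerate": 430}

total = sum(category_counts.values())
# Both breakdowns should describe the same 1,229 examples.
assert total == sum(split_counts.values()) == 1229

shares = {name: round(100 * count / total, 1) for name, count in category_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")

premium_share = round(100 * split_counts["premium"] / total, 1)
print(f"premium share: {premium_share}%")
```

Math accounts for roughly 83% of the examples, so the reasoning style is likely to be most strongly shaped on mathematical problems.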

Intended Use

Designed for tasks requiring structured, multi-step reasoning:

  • Mathematics
  • Logic problems
  • Code analysis
  • Scientific problems
  • Complex question answering

Limitations

  • Limited training signal: A single epoch on 1,229 examples means the model retains most base Qwen 3.5 27B behavior; the fine-tune primarily shapes reasoning style rather than injecting new knowledge
  • Language scope: The DDM pipeline was optimized for English; performance in other languages reflects the base model
  • Edge cases: Extended thinking can occasionally loop on adversarial or highly ambiguous prompts

Citation

@misc{ornstein27b,
  author = {DJLougen},
  title = {Ornstein-27B: DDM-Curated Reasoning Fine-Tune of Qwen 3.5 27B},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/DJLougen/Ornstein-27B}
}
