Part of the Outlier shipping lineup. Outlier is a free macOS app (Apple Silicon only) that runs this model locally with one click.
Outlier Quick 26B-A4B (MLX 4-bit)
Sparse MoE tier (26B params, ~4B active per token). Sits between Lite and Core in latency, with stronger thinking-mode reasoning. Optimized for general chat and reasoning, not for code generation.
Try it in Outlier
The simplest way to use this model is through the Outlier app — open the tier picker, select Outlier Quick, click download, and chat. No setup, no Python, no MLX install, no token quotas.
➡ Download Outlier — outlier.host
A screenshot of the tier picker is at outlier.host/screenshots/tier-picker.png.
Load this directly (power users)
If you want the raw MLX 4-bit weights without the app:
pip install mlx-lm
python -m mlx_lm.generate \
--model Outlier-Ai/Outlier-Quick-26B-MLX-4bit \
--prompt "Write a quicksort in Python." \
--max-tokens 512
Or from Python:

from mlx_lm import load, generate

# First call downloads and caches the weights from the Hub
model, tokenizer = load("Outlier-Ai/Outlier-Quick-26B-MLX-4bit")
print(generate(model, tokenizer, prompt="Hello", max_tokens=256))
Verified benchmarks
For σ-qualified MMLU, HumanEval, and Mac inference-speed numbers — with full provenance (source file, command, n, stderr, date) — see outlier.host/benchmarks.
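For intuition about what a σ on a proportion metric like MMLU accuracy looks like, here is a minimal sketch that computes a binomial standard error from the score and sample size reported on this card. This assumes the σ is a simple binomial stderr; the benchmarks page documents the actual method.

```python
import math

def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion estimated from n Bernoulli trials."""
    return math.sqrt(p * (1.0 - p) / n)

# MMLU accuracy 0.793 over a stratified n=300 sample (figures from this card)
se = binomial_stderr(0.793, 300)
print(f"0.793 +/- {se:.3f}")
```

With n=300, the stderr works out to roughly ±0.023, which is why small score differences between tiers at this sample size should be read cautiously.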
Other Outlier shipping tiers
- Outlier Nano 4B (entry tier, ~3 GB)
- Outlier Lite 9B (balanced, ~6 GB)
- Outlier Core 27B (default, ~16 GB)
- Outlier Code 27B (code-tuned, ~16 GB)
- Outlier Vision 35B-A3B (multimodal, ~20 GB)
License
Apache 2.0 (inherits from upstream base model). Conversion artifact only — the underlying weights are governed by the base model's license.
Evaluation results
- MMLU (stratified n=300, test set), accuracy, self-reported: 0.793
- HumanEval (test set), pass@1, self-reported: 0.128