# LFM2.5-1.2B-Distilled-Claude-4.6 (Liquid Claude)

LFM2.5-1.2B-Distilled-Claude-4.6 (Liquid Claude) is a distillation of Claude into LFM2.5-1.2B-Thinking via LoRA. The teacher was Claude Sonnet 4.6 with adaptive thinking (the claude.ai Claude, not the API Claude). The training data is private.
## Training data info

**Think block patterns**

- Agentic think blocks (Action/Observation): 8473
- Pure reasoning think blocks: 3005
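The agentic vs. pure-reasoning split could be reproduced with a simple marker check. This is only a sketch: the `Action:`/`Observation:` markers are an assumption taken from the label in parentheses above, not a confirmed description of the private dataset's format.

```python
# Sketch: bucket a think block as "agentic" or "pure_reasoning".
# ASSUMPTION: agentic blocks contain "Action:" and "Observation:" markers,
# inferred from the "(Action/Observation)" label in the stats; the private
# dataset's actual format may differ.
def classify_think_block(text: str) -> str:
    if "Action:" in text and "Observation:" in text:
        return "agentic"
    return "pure_reasoning"

print(classify_think_block("Action: web_search\nObservation: 3 results"))  # agentic
print(classify_think_block("Let me work through this step by step."))      # pure_reasoning
```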
**Consecutive user messages**

- Total consecutive user-user pairs: 319

**Message length stats (characters)**

| Role | Messages | Avg | Max |
|---|---|---|---|
| System | 2325 | 2168 | 2168 |
| User | 11919 | 332 | 39134 |
| Assistant | 11738 | 4386 | 264340 |
**Other stats**

- Assistant messages total: 11738
- With agentic tool calls in think: 2151 (18.3%)
- Total chars in dataset: 60,487,999
- Approx tokens (~4 chars/token): 15,121,999
- Conversations with <=2 messages (system + 1): 121
- Conversations with >5 think blocks in a single assistant message: 319
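The approximate token count above is just the character total divided by four, per the stated ~4 chars/token heuristic (actual counts depend on the model's tokenizer). As a quick check:

```python
# Reproduce the approximate token count from the stats above,
# using the stated ~4 chars/token heuristic (integer division).
total_chars = 60_487_999
approx_tokens = total_chars // 4
print(approx_tokens)  # → 15121999
```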
## Use model

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="FlameF0X/LFM2.5-1.2B-Distilled-Claude-4.6")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # Recommended: keep a system prompt for stability, but its content can be changed.
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
Sample chat:

(Note: the response took about a minute of reasoning; it was run on an i3-6006U with 12 GB RAM, using the f16 quantization.)
## Benchmark

Benchmark results are in progress.