Dataset: jhu-clsp/jfleg
A style-preserving grammar correction model based on SmolLM-135M, trained with SFT + DPO to make minimal, targeted corrections while preserving your original writing style.
Unlike large language models (GPT, Claude, etc.), which tend to rewrite entire sentences, this model makes minimal, targeted corrections: it fixes only grammatical errors while preserving your vocabulary, tone, and voice.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "DanJZY/SmolLM-135M-GEC-SFT-DPO"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
text = "As the number of people grows, the need of habitable environment is essential."
inputs = tokenizer(f"Fix grammar: {text}", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]  # generate() returns prompt + continuation; keep only the continuation
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
Original (with error):
"As the number of people grows, the need of habitable environment is essential."
✅ Our Model (Style-Preserving):
"As the number of people grows, the need for a habitable environment is essential."
↑ Only fixes "of" → "for a"
❌ Typical Model (Over-Correction):
"As population growth continues, the necessity for a habitable environment becomes essential."
↑ Completely rewrites: changes vocabulary, structure, and tone
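One way to make the contrast above measurable is a word-level diff between the input and each correction. The sketch below uses Python's standard `difflib`; the helper name `word_edits` is hypothetical and is not part of this model or its repository.

```python
import difflib


def word_edits(original: str, corrected: str) -> list[tuple[str, str]]:
    """Return (removed, inserted) word spans where the two sentences differ."""
    src, tgt = original.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing span back into text for readability.
            edits.append((" ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return edits


original = "As the number of people grows, the need of habitable environment is essential."
minimal = "As the number of people grows, the need for a habitable environment is essential."
rewrite = "As population growth continues, the necessity for a habitable environment becomes essential."

minimal_edits = word_edits(original, minimal)
rewrite_edits = word_edits(original, rewrite)

print(minimal_edits)       # → [('of', 'for a')]
print(len(rewrite_edits))  # more than one edit span for the over-corrected version
```

A style-preserving correction should yield few, short edit spans, while an over-correction yields several spans that cover most of the sentence.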
| Parameter | Value |
|---|---|
| Base model | SmolLM-135M |
| Training method | SFT + DPO (Direct Preference Optimization) |
| Preference pairs | ~19,000 (generated using edit distance) |
| Total experiments | 28 (22 SFT + 6 DPO/IPO) |
| Hardware | 8x RTX 3090 |
| Training time | ~3 hours |
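The table notes that the preference pairs were generated using edit distance but does not describe the procedure. The following is only a sketch of one plausible approach, assuming each training example has a source sentence, a reference correction, and several candidate corrections: the candidate closest to the reference by word-level edit distance becomes "chosen" and the farthest becomes "rejected". The helper names and the pairing rule are assumptions, not the author's pipeline; the `prompt`/`chosen`/`rejected` keys follow the layout commonly used by DPO trainers.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]


def make_preference_pair(source, reference, candidates):
    """Hypothetical helper: rank candidates by word-level distance to the reference."""
    ref_words = reference.split()
    scored = sorted(candidates, key=lambda c: levenshtein(c.split(), ref_words))
    chosen, rejected = scored[0], scored[-1]
    if levenshtein(chosen.split(), ref_words) == levenshtein(rejected.split(), ref_words):
        return None  # all candidates are equally far away: no preference signal
    return {"prompt": f"Fix grammar: {source}", "chosen": chosen, "rejected": rejected}


source = "As the number of people grows, the need of habitable environment is essential."
reference = "As the number of people grows, the need for a habitable environment is essential."
candidates = [
    "As population growth continues, the necessity for a habitable environment becomes essential.",
    "As the number of people grows, the need for a habitable environment is essential.",
]
pair = make_preference_pair(source, reference, candidates)
print(pair["chosen"])
```

Ranking by distance to the reference rewards candidates that stay close to a minimal correction, which is in keeping with the style-preserving goal described in this card.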
| Resource | Link |
|---|---|
| GitHub Repository | ZhuoyuanJiang/SmolLM-GEC-SFT-DPO |
| Full Experiment Checkpoints | Google Drive (~68GB) |
| Training Notebooks | GitHub notebooks/ |
@misc{smollm_gec_sft_dpo_2025,
title={SmolLM-135M-GEC-SFT-DPO: Style-Preserving Grammar Correction with Direct Preference Optimization},
author={Zhuoyuan Jiang},
year={2025},
url={https://huggingface.co/DanJZY/SmolLM-135M-GEC-SFT-DPO},
note={Fine-tuned SmolLM-135M for minimal, style-preserving grammatical error correction}
}
Special thanks to Nima Tajbakhsh (Nvidia) for guidance on efficient training methods.
Base model: HuggingFaceTB/SmolLM-135M