Qwen2.5-1.5B-QLoRA-Recipe

A domain-adapted language model fine-tuned for culinary recipe generation using QLoRA

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct, specialized for generating cooking recipes from ingredient lists. Using QLoRA (Quantized Low-Rank Adaptation), the model achieves large improvements in recipe generation quality while retaining roughly 97% of its performance on general language-understanding benchmarks (MMLU, HellaSwag).

Key Results

| Metric    | Base Model | Fine-Tuned | Improvement    |
|-----------|------------|------------|----------------|
| ROUGE-1   | 30.5%      | 48.5%      | +59%           |
| ROUGE-2   | 7.1%       | 28.3%      | +299%          |
| ROUGE-L   | 16.8%      | 39.2%      | +134%          |
| MMLU      | 70.7%      | 68.7%      | 97.2% retained |
| HellaSwag | 66.0%      | 64.0%      | 97.0% retained |

Model Details

  • Developed by: Daniel Krasik
  • Model type: Causal Language Model (Fine-tuned with QLoRA)
  • Language: English
  • License: CC BY-NC-SA 4.0
  • Fine-tuned from: Qwen/Qwen2.5-1.5B-Instruct
  • Parameters: 1.5B (base) + ~8M (LoRA adapters)

Model Sources

  • Repository: https://github.com/danielkrasik3010/Qwen-QLoRA-Chef

Uses

Direct Use

This model is designed for generating cooking recipes based on a list of ingredients. It produces structured recipes with:

  • Recipe title
  • Ingredient list
  • Step-by-step cooking directions

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model_id = "Daniel-Krasik/Qwen2.5-1.5B-QLoRA-Recipe"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Create a prompt
messages = [
    {
        "role": "system",
        "content": "You will generate one cooking recipe. List all necessary ingredients and give detailed steps."
    },
    {
        "role": "user",
        "content": "Include ingredients: chicken breast, garlic, lemon, rosemary, olive oil"
    }
]

# Generate
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
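For more varied recipes, sampling can be enabled at generation time. This is a minimal variant of the call above; the temperature and top_p values are illustrative defaults, not settings tuned or reported for this model:

# Optional: sample instead of greedy decoding for more varied outputs
# (temperature/top_p are illustrative values, not tuned for this model)
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))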

Using LoRA Adapters Only

For more flexibility, you can load just the LoRA adapters:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id = "Daniel-Krasik/Qwen2.5-1.5B-QLoRA-Recipe-adapters"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
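If you need a standalone checkpoint (for example, to serve the model without PEFT installed), the adapters can be merged back into the base weights. A minimal sketch using PEFT's merge_and_unload, assuming enough memory to hold the merged model; the output directory name is a placeholder:

# Merge the LoRA weights into the base model and drop the PEFT wrappers
merged_model = model.merge_and_unload()

# Optionally save the merged model and tokenizer for later reuse
merged_model.save_pretrained("qwen2.5-1.5b-recipe-merged")
tokenizer.save_pretrained("qwen2.5-1.5b-recipe-merged")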

Out-of-Scope Use

This model is not suitable for:

  • Medical or nutritional advice
  • Food safety recommendations
  • Recipes requiring precise dietary calculations
  • Non-English recipe generation
  • Commercial use (prohibited by the CC BY-NC-SA 4.0 license)

Training Details

Training Data

The model was fine-tuned on the recipe-nlg-llama2 dataset, containing approximately 2 million recipes with structured fields:

| Field       | Description                           |
|-------------|---------------------------------------|
| title       | Recipe name                           |
| NER         | Named entities (ingredients)          |
| ingredients | Full ingredient list with quantities  |
| directions  | Step-by-step cooking instructions     |

Data Splits:

  • Training: ~2,000,000 recipes
  • Validation: 200 samples
  • Test: 200 samples
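
A minimal loading sketch with the Hugging Face datasets library is shown below. The hub identifier is a placeholder; substitute the actual path of the recipe-nlg-llama2 dataset:

from datasets import load_dataset

# Placeholder hub id; replace with the actual dataset path
dataset = load_dataset("<hub-user>/recipe-nlg-llama2")

# Inspect the structured fields of one example
example = dataset["train"][0]
print(example["title"])        # recipe name
print(example["NER"])          # named-entity ingredient list
print(example["ingredients"])  # full ingredient list with quantities
print(example["directions"])   # step-by-step instructions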

Training Procedure

QLoRA Configuration

# Quantization Settings
load_in_4bit: true
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: bfloat16

# LoRA Architecture
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
target_modules: ["q_proj", "v_proj"]

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Epochs: 1
  • Max steps: 300
  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation steps: 4
  • Effective batch size: 16
  • Sequence length: 512
  • LR scheduler: Cosine
  • Warmup steps: 50
  • Optimizer: paged_adamw_8bit
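
A minimal sketch of how the hyperparameters above map onto transformers.TrainingArguments; the actual training script may differ (for example, it may use TRL's SFTTrainer, where the 512-token sequence length is configured on the trainer side). The output directory is a placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-1.5b-qlora-recipe",   # placeholder
    num_train_epochs=1,
    max_steps=300,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size: 4 * 4 = 16
    lr_scheduler_type="cosine",
    warmup_steps=50,
    optim="paged_adamw_8bit",
    bf16=True,
)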

Training Infrastructure

| Specification | Value              |
|---------------|--------------------|
| GPU           | NVIDIA A100 (40GB) |
| Training Time | ~1 hour            |
| Peak Memory   | ~12GB              |
| Platform      | RunPod             |

Evaluation

Metrics

The model was evaluated using multiple metrics to ensure comprehensive assessment:

  1. ROUGE Scores - Lexical overlap with reference recipes
  2. BERTScore - Semantic similarity using contextual embeddings
  3. MMLU - General language understanding (forgetting check)
  4. HellaSwag - Commonsense reasoning (forgetting check)
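
A hedged sketch of how the ROUGE and BERTScore numbers can be computed with the Hugging Face evaluate library; the harness used for MMLU and HellaSwag is not shown here, and the example strings are placeholders:

import evaluate

# Generated recipes vs. reference recipes from the held-out test split (placeholders)
predictions = ["Chicken Penne\n\nIngredients: ..."]
references = ["Chicken Penne Pasta\n\nIngredients: ..."]

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

print(rouge_scores)  # rouge1, rouge2, rougeL, rougeLsum
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))  # mean BERTScore F1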

Results

Task Performance (Recipe Generation)

| Metric  | Base Model | Fine-Tuned | Relative Improvement |
|---------|------------|------------|----------------------|
| ROUGE-1 | 30.5%      | 48.5%      | +59%                 |
| ROUGE-2 | 7.1%       | 28.3%      | +299%                |
| ROUGE-L | 16.8%      | 39.2%      | +134%                |

Semantic Quality (BERTScore)

| Metric    | Base Model | Fine-Tuned | Change |
|-----------|------------|------------|--------|
| Precision | 0.817      | 0.815      | -0.002 |
| Recall    | 0.857      | 0.856      | -0.001 |
| F1        | 0.836      | 0.835      | -0.001 |

Near-identical BERTScores, taken together with the large ROUGE gains, suggest the model learned recipe formatting and vocabulary alignment without semantic drift.

Knowledge Retention

| Benchmark | Base Model | Fine-Tuned | Retention |
|-----------|------------|------------|-----------|
| MMLU      | 70.7%      | 68.7%      | 97.2%     |
| HellaSwag | 66.0%      | 64.0%      | 97.0%     |

The model retained 97% of its general capabilities while gaining significant domain expertise.


Example Outputs

Example 1: Chicken Penne

Input: chicken breast, marinara sauce, penne noodles, Merlot wine, olive oil, water, cilantro

Generated Output:

Chicken Penne

Ingredients:

  • chicken breast
  • marinara sauce
  • penne noodles
  • Merlot wine
  • olive oil
  • water
  • cilantro

Directions:

  1. Preheat oven to 375 degrees F (190 degrees C). Place the chicken breasts in a large baking dish; brush with olive oil and sprinkle with salt and pepper. Bake until cooked through, about 20 minutes.
  2. Meanwhile, cook the pasta according to package directions. Drain well.
  3. In a small bowl, whisk together the marinara sauce, wine, and water. Pour over the chicken and toss to coat. Add the pasta and toss to combine. Sprinkle with cilantro leaves and serve hot.

Example 2: Potato Frittata

Input: olive oil, potatoes, onion, eggs, salt, pepper, flat-leaf parsley

Generated Output:

Potato Frittata

Directions:

  1. Heat the oil in a large skillet over medium heat; add the potatoes and cook until tender, about 10 minutes.
  2. Add the onion and cook until soft, about 5 minutes.
  3. Remove from heat and let cool slightly.
  4. In a bowl, whisk together the eggs, salt, and pepper.
  5. Stir in the cooled potato mixture.
  6. Pour into a greased 9-inch baking dish.
  7. Bake at 375°F for 20 to 25 minutes or until set.

Limitations

  • English Only: Trained exclusively on English recipes
  • Evaluation Size: 200-sample test set may not capture all edge cases
  • Temporal Knowledge: Base model knowledge cutoff affects contemporary culinary trends
  • No Nutritional Accuracy: Not validated for dietary/nutritional information

Environmental Impact

  • Hardware: NVIDIA A100 (40GB)
  • Training Time: ~1 hour
  • Cloud Provider: RunPod
  • Estimated Carbon: Low; training consisted of a single ~1-hour QLoRA run on one A100 GPU

Citation

@software{qwen_qlora_chef_2025,
  author = {Daniel Krasik},
  title = {Qwen-QLoRA-Chef: Fine-tuning Qwen2.5 for Recipe Generation using QLoRA},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/danielkrasik3010/Qwen-QLoRA-Chef},
  note = {A complete pipeline for domain-adaptive language model fine-tuning}
}

Model Card Author

Daniel Krasik


Acknowledgments

  • Qwen Team for the excellent base model
  • Hugging Face for Transformers, PEFT, and Datasets libraries
  • bitsandbytes for quantization support
  • Recipe NLG for the training dataset
  • RunPod for cloud GPU infrastructure