Qwen2.5-1.5B-QLoRA-Recipe

A domain-adapted language model fine-tuned for culinary recipe generation using QLoRA

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct, specialized for generating cooking recipes from ingredient lists. Using QLoRA (Quantized Low-Rank Adaptation), the model achieves large improvements in recipe generation quality while retaining roughly 97% of its performance on general language-understanding benchmarks (MMLU, HellaSwag).

Key Results

| Metric    | Base Model | Fine-Tuned | Improvement    |
|-----------|------------|------------|----------------|
| ROUGE-1   | 30.5%      | 48.5%      | +59%           |
| ROUGE-2   | 7.1%       | 28.3%      | +299%          |
| ROUGE-L   | 16.8%      | 39.2%      | +134%          |
| MMLU      | 70.7%      | 68.7%      | 97.2% retained |
| HellaSwag | 66.0%      | 64.0%      | 97.0% retained |

Model Details

  • Developed by: Daniel Krasik
  • Model type: Causal Language Model (Fine-tuned with QLoRA)
  • Language: English
  • License: CC BY-NC-SA 4.0
  • Fine-tuned from: Qwen/Qwen2.5-1.5B-Instruct
  • Parameters: 1.5B (base) + ~8M (LoRA adapters)

Model Sources

  • Repository: https://github.com/danielkrasik3010/Qwen-QLoRA-Chef

Uses

Direct Use

This model is designed for generating cooking recipes based on a list of ingredients. It produces structured recipes with:

  • Recipe title
  • Ingredient list
  • Step-by-step cooking directions

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model_id = "Daniel-Krasik/Qwen2.5-1.5B-QLoRA-Recipe"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Create a prompt
messages = [
    {
        "role": "system",
        "content": "You will generate one cooking recipe. List all necessary ingredients and give detailed steps."
    },
    {
        "role": "user",
        "content": "Include ingredients: chicken breast, garlic, lemon, rosemary, olive oil"
    }
]

# Generate
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
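For more varied recipes, sampling can be enabled at generation time. This is a minimal variant of the call above; the temperature and top_p values are illustrative defaults, not settings tuned or reported for this model:

# Optional: sample instead of greedy decoding for more varied outputs
# (temperature/top_p are illustrative values, not tuned for this model)
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))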

Using LoRA Adapters Only

For more flexibility, you can load just the LoRA adapters:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id = "Daniel-Krasik/Qwen2.5-1.5B-QLoRA-Recipe-adapters"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
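If you need a standalone checkpoint (for example, to serve the model without PEFT installed), the adapters can be merged back into the base weights. A minimal sketch using PEFT's merge_and_unload, assuming enough memory to hold the merged model; the output directory name is a placeholder:

# Merge the LoRA weights into the base model and drop the PEFT wrappers
merged_model = model.merge_and_unload()

# Optionally save the merged model and tokenizer for later reuse
merged_model.save_pretrained("qwen2.5-1.5b-recipe-merged")
tokenizer.save_pretrained("qwen2.5-1.5b-recipe-merged")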

Out-of-Scope Use

This model is not suitable for:

  • Medical or nutritional advice
  • Food safety recommendations
  • Recipes requiring precise dietary calculations
  • Non-English recipe generation
  • Commercial use (prohibited by the CC BY-NC-SA 4.0 license)

Training Details

Training Data

The model was fine-tuned on the recipe-nlg-llama2 dataset, containing approximately 2 million recipes with structured fields:

| Field       | Description                           |
|-------------|---------------------------------------|
| title       | Recipe name                           |
| NER         | Named entities (ingredients)          |
| ingredients | Full ingredient list with quantities  |
| directions  | Step-by-step cooking instructions     |

Data Splits:

  • Training: ~2,000,000 recipes
  • Validation: 200 samples
  • Test: 200 samples
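
A minimal loading sketch with the Hugging Face datasets library is shown below. The hub identifier is a placeholder; substitute the actual path of the recipe-nlg-llama2 dataset:

from datasets import load_dataset

# Placeholder hub id; replace with the actual dataset path
dataset = load_dataset("<hub-user>/recipe-nlg-llama2")

# Inspect the structured fields of one example
example = dataset["train"][0]
print(example["title"])        # recipe name
print(example["NER"])          # named-entity ingredient list
print(example["ingredients"])  # full ingredient list with quantities
print(example["directions"])   # step-by-step instructions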

Training Procedure

QLoRA Configuration

# Quantization Settings
load_in_4bit: true
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: bfloat16

# LoRA Architecture
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
target_modules: ["q_proj", "v_proj"]

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Epochs: 1
  • Max steps: 300
  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation steps: 4
  • Effective batch size: 16
  • Sequence length: 512
  • LR scheduler: Cosine
  • Warmup steps: 50
  • Optimizer: paged_adamw_8bit
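
A minimal sketch of how the hyperparameters above map onto transformers.TrainingArguments; the actual training script may differ (for example, it may use TRL's SFTTrainer, where the 512-token sequence length is configured on the trainer side). The output directory is a placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-1.5b-qlora-recipe",   # placeholder
    num_train_epochs=1,
    max_steps=300,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size: 4 * 4 = 16
    lr_scheduler_type="cosine",
    warmup_steps=50,
    optim="paged_adamw_8bit",
    bf16=True,
)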

Training Infrastructure

| Specification | Value              |
|---------------|--------------------|
| GPU           | NVIDIA A100 (40GB) |
| Training Time | ~1 hour            |
| Peak Memory   | ~12GB              |
| Platform      | RunPod             |

Evaluation

Metrics

The model was evaluated using multiple metrics to ensure comprehensive assessment:

  1. ROUGE Scores - Lexical overlap with reference recipes
  2. BERTScore - Semantic similarity using contextual embeddings
  3. MMLU - General language understanding (forgetting check)
  4. HellaSwag - Commonsense reasoning (forgetting check)
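
A hedged sketch of how the ROUGE and BERTScore numbers can be computed with the Hugging Face evaluate library; the harness used for MMLU and HellaSwag is not shown here, and the example strings are placeholders:

import evaluate

# Generated recipes vs. reference recipes from the held-out test split (placeholders)
predictions = ["Chicken Penne\n\nIngredients: ..."]
references = ["Chicken Penne Pasta\n\nIngredients: ..."]

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

print(rouge_scores)  # rouge1, rouge2, rougeL, rougeLsum
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))  # mean BERTScore F1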

Results

Task Performance (Recipe Generation)

| Metric  | Base Model | Fine-Tuned | Relative Improvement |
|---------|------------|------------|----------------------|
| ROUGE-1 | 30.5%      | 48.5%      | +59%                 |
| ROUGE-2 | 7.1%       | 28.3%      | +299%                |
| ROUGE-L | 16.8%      | 39.2%      | +134%                |

Semantic Quality (BERTScore)

| Metric    | Base Model | Fine-Tuned | Change |
|-----------|------------|------------|--------|
| Precision | 0.817      | 0.815      | -0.002 |
| Recall    | 0.857      | 0.856      | -0.001 |
| F1        | 0.836      | 0.835      | -0.001 |

Near-identical BERTScores, taken together with the large ROUGE gains, suggest the model learned recipe formatting and vocabulary alignment without semantic drift.

Knowledge Retention

| Benchmark | Base Model | Fine-Tuned | Retention |
|-----------|------------|------------|-----------|
| MMLU      | 70.7%      | 68.7%      | 97.2%     |
| HellaSwag | 66.0%      | 64.0%      | 97.0%     |

The model retained 97% of its general capabilities while gaining significant domain expertise.


Example Outputs

Example 1: Chicken Penne

Input: chicken breast, marinara sauce, penne noodles, Merlot wine, olive oil, water, cilantro

Generated Output:

Chicken Penne

Ingredients:

  • chicken breast
  • marinara sauce
  • penne noodles
  • Merlot wine
  • olive oil
  • water
  • cilantro

Directions:

  1. Preheat oven to 375 degrees F (190 degrees C). Place the chicken breasts in a large baking dish; brush with olive oil and sprinkle with salt and pepper. Bake until cooked through, about 20 minutes.
  2. Meanwhile, cook the pasta according to package directions. Drain well.
  3. In a small bowl, whisk together the marinara sauce, wine, and water. Pour over the chicken and toss to coat. Add the pasta and toss to combine. Sprinkle with cilantro leaves and serve hot.

Example 2: Potato Frittata

Input: olive oil, potatoes, onion, eggs, salt, pepper, flat-leaf parsley

Generated Output:

Potato Frittata

Directions:

  1. Heat the oil in a large skillet over medium heat; add the potatoes and cook until tender, about 10 minutes.
  2. Add the onion and cook until soft, about 5 minutes.
  3. Remove from heat and let cool slightly.
  4. In a bowl, whisk together the eggs, salt, and pepper.
  5. Stir in the cooled potato mixture.
  6. Pour into a greased 9-inch baking dish.
  7. Bake at 375°F for 20 to 25 minutes or until set.

Limitations

  • English Only: Trained exclusively on English recipes
  • Evaluation Size: 200-sample test set may not capture all edge cases
  • Temporal Knowledge: Base model knowledge cutoff affects contemporary culinary trends
  • No Nutritional Accuracy: Not validated for dietary/nutritional information

Environmental Impact

  • Hardware: NVIDIA A100 (40GB)
  • Training Time: ~1 hour
  • Cloud Provider: RunPod
  • Estimated Carbon: Low; training consisted of a single ~1-hour QLoRA run on one A100 GPU

Citation

@software{qwen_qlora_chef_2025,
  author = {Daniel Krasik},
  title = {Qwen-QLoRA-Chef: Fine-tuning Qwen2.5 for Recipe Generation using QLoRA},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/danielkrasik3010/Qwen-QLoRA-Chef},
  note = {A complete pipeline for domain-adaptive language model fine-tuning}
}

Model Card Author

Daniel Krasik


Acknowledgments

  • Qwen Team for the excellent base model
  • Hugging Face for Transformers, PEFT, and Datasets libraries
  • bitsandbytes for quantization support
  • Recipe NLG for the training dataset
  • RunPod for cloud GPU infrastructure