Model-to-Production v2.2
A fine-tuned Gemma 3 1B Instruct model for Swedish binary text classification.
Model Description
This model is a LoRA fine-tuned version of google/gemma-3-1b-it for binary text classification on Swedish text, with the LoRA adapter merged into the base weights.
- Model type: Sequence Classification (Gemma3TextForSequenceClassification)
- Language: Swedish (sv)
- Base model: google/gemma-3-1b-it
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Task: Binary text classification
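Because the adapter weights are already merged, the model can be used without `peft` at inference time. For reference, a minimal sketch of how a LoRA adapter is typically merged into a base model with `peft` (the adapter path below is hypothetical, not a file shipped with this repository):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification

# Load the base model with a 2-label classification head.
base = AutoModelForSequenceClassification.from_pretrained(
    "google/gemma-3-1b-it",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)

# Attach the trained LoRA adapter (hypothetical local path) and fold its
# weights into the base model so no PEFT dependency is needed at inference.
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("model-to-production-2.2")
```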
Performance
| Metric | Value |
|---|---|
| Accuracy | 97.26% |
| F1 Score (macro) | 97.26% |
| Eval Loss | 0.067 |
Training Details
Training Hyperparameters
| Parameter | Value |
|---|---|
| Base model | google/gemma-3-1b-it |
| Learning rate | 5e-05 |
| Batch size | 8 |
| Gradient accumulation steps | 2 |
| Effective batch size | 16 |
| Epochs | 1 |
| Max sequence length | 512 |
| Warmup ratio | 0.1 |
| Optimizer | AdamW |
| Precision | bfloat16 |
| Flash Attention | Enabled |
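As a rough illustration, these settings map onto `transformers.TrainingArguments` as sketched below; the output directory is a placeholder, and the actual script is `train.py`.

```python
from transformers import TrainingArguments

# Hyperparameters from the table above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,  # effective batch size: 8 * 2 = 16
    num_train_epochs=1,
    warmup_ratio=0.1,
    optim="adamw_torch",
    bf16=True,  # bfloat16 mixed precision
)
```

The 512-token limit is applied at tokenization time, and Flash Attention is selected when the model is loaded (e.g. `attn_implementation="flash_attention_2"`) rather than through `TrainingArguments`.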
LoRA Configuration
| Parameter | Value |
|---|---|
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj |
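The equivalent `peft.LoraConfig` would look roughly like this (the sequence-classification task type is an assumption based on the model type above):

```python
from peft import LoraConfig, TaskType

# LoRA settings from the table above.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # assumption: sequence-classification head
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
)
```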
Training Data
- Training samples: ~63,000
- Validation data: separate held-out validation set (val_small.csv)
- Data augmentation: Unicode normalization, random punctuation removal, character swaps, edge cutting
Usage
With Transformers Pipeline
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mohamad-Jaallouk/model-to-production-2.2",
    device="cuda",  # or "cpu"
)

result = classifier("Din svenska text här")
print(result)
```
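The pipeline returns a list of dictionaries with `label` and `score` fields; the label names correspond to the `id2label` mapping in `config.json`.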
Direct Model Loading
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Mohamad-Jaallouk/model-to-production-2.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

text = "Din svenska text här"
# Match the training-time maximum sequence length of 512 tokens.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1).item()
print(f"Predicted class: {prediction}")
```
Files Included
- `model.safetensors` - Merged model weights
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer
- `tokenizer_config.json` - Tokenizer configuration
- `train.py` - Training script
- `transform.py` - Data augmentation utilities
- `requirements.txt` - Python dependencies
- `train.csv` - Training data
- `val_small.csv` - Validation data
Preprocessing
Text is preprocessed with the following transformations:
- Unicode NFC normalization
- Lowercasing
- Whitespace normalization
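A minimal sketch of this inference-time preprocessing (the function name and regex are illustrative; the project's own utilities live in `transform.py`):

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Apply the normalization steps listed above."""
    text = unicodedata.normalize("NFC", text)  # Unicode NFC normalization
    text = text.lower()                        # lowercasing
    text = re.sub(r"\s+", " ", text).strip()   # whitespace normalization
    return text
```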
For training, additional augmentation is applied:
- Random punctuation removal
- Random character replacement/injection
- Adjacent character swapping
- Edge cutting
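A rough sketch of what such augmentation can look like; the probabilities and helper below are illustrative assumptions, and the actual implementation is in `transform.py`:

```python
import random
import string

def augment(text: str, p: float = 0.1) -> str:
    """Randomly perturb a training example (illustrative probabilities)."""
    if random.random() < p:  # random punctuation removal
        text = text.translate(str.maketrans("", "", string.punctuation))
    chars = list(text)
    if len(chars) > 1 and random.random() < p:  # adjacent character swapping
        i = random.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    if chars and random.random() < p:  # random character replacement
        i = random.randrange(len(chars))
        chars[i] = random.choice(string.ascii_lowercase)
    text = "".join(chars)
    if len(text) > 10 and random.random() < p:  # edge cutting
        text = text[random.randrange(3): len(text) - random.randrange(3)]
    return text
```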
License
This model inherits the Gemma license from the base model.
Citation
If you use this model, please cite:
```bibtex
@misc{model-to-production-2.2,
  author       = {Mohamad Jaallouk},
  title        = {Model-to-Production v2.2: Swedish Text Classification},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Mohamad-Jaallouk/model-to-production-2.2}}
}
```