Instructions for using EMD123/tiny-aya-kosher-3.3B-GGUF with libraries, notebooks, and local apps.

Libraries

How to use EMD123/tiny-aya-kosher-3.3B-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="EMD123/tiny-aya-kosher-3.3B-GGUF",
	filename="tiny-aya-kosher-3.3B-F16.gguf",
)

# messages must be a list of role/content dicts, not a plain string
llm.create_chat_completion(
	messages = [
		{"role": "user", "content": "Can you help me write a formal letter?"}
	]
)

Local Apps

llama.cpp

How to use EMD123/tiny-aya-kosher-3.3B-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M

Use Docker

docker model run hf.co/EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
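
Once `llama-server` is running, it can be queried over its OpenAI-compatible HTTP API. The following is a minimal sketch using only the Python standard library. It assumes the server listens on its default address `http://localhost:8080` and serves the `/v1/chat/completions` route; the helper names `build_chat_payload` and `chat` are illustrative and not part of llama.cpp.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port (8080).
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_payload(prompt, temperature=0.2, max_tokens=256):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(prompt, url=SERVER_URL, **kwargs):
    """Send one user message to the server and return the reply text."""
    body = json.dumps(build_chat_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Can you help me write a formal letter?"))` prints the model's reply. If the server was started on a different host or port, pass the matching `url`.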

Ollama
How to use EMD123/tiny-aya-kosher-3.3B-GGUF with Ollama:
```
ollama run hf.co/EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
```

Unsloth Studio

How to use EMD123/tiny-aya-kosher-3.3B-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EMD123/tiny-aya-kosher-3.3B-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EMD123/tiny-aya-kosher-3.3B-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for EMD123/tiny-aya-kosher-3.3B-GGUF to start chatting

Docker Model Runner
How to use EMD123/tiny-aya-kosher-3.3B-GGUF with Docker Model Runner:
```
docker model run hf.co/EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M
```

Lemonade

How to use EMD123/tiny-aya-kosher-3.3B-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull EMD123/tiny-aya-kosher-3.3B-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.tiny-aya-kosher-3.3B-GGUF-Q4_K_M

List all available models

lemonade list

tiny-aya-kosher-3.3B-GGUF: A Language Model Aligned with the Values of Modesty and Halacha

Model Details

Model Description

tiny-aya-kosher-3.3B-GGUF is a merged language model based on Tiny-Aya-Global. It underwent targeted fine-tuning to align its responses with the values of the Haredi community, with an emphasis on filtering inappropriate content, avoiding engagement with heresy and idolatry, and upholding standards of modesty.

Developed by: EMD123
Model type: Causal Language Model (Fine-tuned with QLoRA)
Language(s) (NLP): Hebrew (Primary), English
License: CC-BY-NC-4.0 (Non-Commercial use only)
Finetuned from model: CohereLabs/tiny-aya-global

Uses

Direct Use

The model is intended to serve as a "kosher" AI assistant. It is suited to systems built for Torah-observant users who want a smart work tool that avoids content that is forbidden or inappropriate to the spirit of the community.

Out-of-Scope Use

Do not use the model for commercial purposes (per the NC license). The model is not intended to provide official halachic rulings or spiritual guidance; it is a technological aid only.

Bias, Risks, and Limitations

Despite the targeted training, language models may hallucinate or bypass restrictions in certain situations. The model was tuned to refuse certain content, but over-refusal (false positives) can occur even for innocent questions if they mention sensitive words.

Recommendations

A low temperature (0.1-0.3) is recommended for more consistent responses that adhere more accurately to the filtering values.
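
As a minimal sketch of applying this recommendation with llama-cpp-python, the helper below passes a low temperature to `create_chat_completion`. The function name `ask` and the default of 0.2 are illustrative choices, and `llm` is assumed to be a `Llama` instance loaded as in the llama-cpp-python example.

```python
def ask(llm, prompt, temperature=0.2, max_tokens=256):
    """Send one user message at a low temperature and return the reply text.

    `llm` is assumed to be a llama_cpp.Llama instance (or any object with a
    compatible create_chat_completion method).
    """
    response = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # recommended range in this card: 0.1-0.3
        max_tokens=max_tokens,
    )
    return response["choices"][0]["message"]["content"]
```

Values near 0.1 make the output close to deterministic, while values near 0.3 leave a little more variety in phrasing.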

How to Get Started with the Model

To run the model correctly, use the official Aya chat template:

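from transformers import AutoModelForCausalLM, AutoTokenizer

# Note: this repository hosts GGUF files. Depending on your Transformers version,
# loading may require the gguf_file argument or the original (non-GGUF) weights.
model_id = "EMD123/tiny-aya-kosher-3.3B-GGUF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Prompt (Hebrew): "Can you help me write a formal letter?"
messages = [{"role": "user", "content": "האם תוכל לעזור לי בכתיבת מכתב רשמי?"}]
# Apply the chat template and move the tensors to wherever device_map placed the model
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)

# Low temperature for consistent responses; decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))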

Training Details

Training Data

The model was trained on a dedicated, manually built dataset of about 520 examples:

Filtering examples: guidance for polite, reasoned refusals on topics that are immodest, heretical, or concern other religions.

Preservation examples: general-knowledge, coding, and language questions that preserve the model's core capabilities.

Multi-turn conversations: examples that teach the model to stay consistent across an entire conversation.
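
The card does not publish the dataset schema, so the following is only an illustrative sketch of how a multi-turn example could be represented in the common chat-messages format and checked for well-formed turn order. The record contents, the `example` variable, and the `is_valid_conversation` helper are all hypothetical.

```python
def is_valid_conversation(messages):
    """Return True if turns alternate user/assistant, starting with user
    and ending with assistant (so every user turn has a target reply)."""
    if not messages or len(messages) % 2 != 0:
        return False
    expected = ("user", "assistant")
    for index, message in enumerate(messages):
        if message.get("role") != expected[index % 2]:
            return False
        if not message.get("content"):
            return False
    return True


# Hypothetical multi-turn record; not taken from the actual training set.
example = [
    {"role": "user", "content": "Can you help me write a formal letter?"},
    {"role": "assistant", "content": "Certainly. Who is the letter addressed to?"},
    {"role": "user", "content": "To the city council."},
    {"role": "assistant", "content": "Here is a draft you can adapt."},
]
```

A check like this is cheap to run over a small hand-built dataset before training and catches records with a missing or duplicated turn.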

Training Procedure

Training used QLoRA (4-bit quantization) on an NVIDIA T4 GPU in Google Colab.

Training regime: bf16 mixed precision

Learning Rate: 2e-4

Epochs: 2 (Early stopping applied to prevent overfitting)

Batch Size: 2 (Gradient Accumulation Steps: 4)
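
The hyperparameters above imply an effective batch size and an upper bound on the number of optimizer steps. The arithmetic below is a rough sketch: it assumes a single GPU and that all of the roughly 520 examples are used for training (the card does not state whether a validation split was held out), and early stopping may end training before the bound is reached.

```python
import math

num_examples = 520        # approximate dataset size stated in this card
per_device_batch = 2      # Batch Size
grad_accum_steps = 4      # Gradient Accumulation Steps
epochs = 2                # upper bound; early stopping may end training sooner
num_gpus = 1              # assumption: a single T4

# Gradients from 4 micro-batches of 2 are accumulated before each update.
effective_batch = per_device_batch * grad_accum_steps * num_gpus
steps_per_epoch = math.ceil(num_examples / effective_batch)
max_optimizer_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, max_optimizer_steps)  # 8 65 130
```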

Technical Specifications

Model Architecture and Objective

The model is based on Cohere's Command-R architecture, which is tailored for efficient multilingual performance in a compact (3B) model.

License & Policy

This model is subject to the Creative Commons Attribution-NonCommercial 4.0 International license. In addition, users must comply with the Cohere Labs Acceptable Use Policy.

More Information

The model was created out of an essential need for advanced technological tools that respect the values of religious and Haredi users.


GGUF

Model size: 3B params
Architecture: cohere2
Available quantizations: 3-bit, 4-bit, 8-bit, 16-bit

Model tree for EMD123/tiny-aya-kosher-3.3B-GGUF

Base model: CohereLabs/tiny-aya-base
Finetuned: CohereLabs/tiny-aya-global
Quantized: EMD123/tiny-aya-kosher-3.3B-GGUF (this model)