+---
+license: apache-2.0
+pipeline_tag: text-generation
+language:
+  - en
+  - he
+tags:
+- pretrained
+inference:
+  parameters:
+    temperature: 0.6
+---
+[<img src="https://i.ibb.co/5Lbwyr1/dicta-logo.jpg" width="300px"/>](https://dicta.org.il)
+# Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs
+Dicta-LM 3.0 is an open-weight collection of LLMs trained on extensive corpora of Hebrew and English text. The models are available for download and for unlimited use, and they set a new state of the art (SOTA) for Hebrew in their weight class, both as base models and as chat models.
+This is our flagship model, a 24-billion-parameter *reasoning* model, with full precision (BF16), originally initialized from [Mistral-Small-3.1-24B-Base-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503).
+This is a reasoning chat model: before responding to a user message, it first works out how to respond inside a designated thinking block.
+<br/>
+&#128640; Try it out here: [chat.dicta.org.il](https://chat.dicta.org.il)
+<br/>
+For full details of this model please read our [release blog post](https://dicta.org.il/dicta-lm-3) or the [technical report](https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf).
+You can view and access the full collection of base/instruct unquantized/quantized versions of `DictaLM 3.0` [here](https://huggingface.co/collections/dicta-il/dictalm-30-collection).
+## Instruction format
+To benefit from the instruction fine-tuning, render your prompt with the chat template specified for this model. Most libraries apply it automatically, so you can usually let them handle it.
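If you want to inspect the rendered prompt yourself, the following is a minimal sketch using the `apply_chat_template` method of a Transformers tokenizer. The helper names `render_prompt` and `load_tokenizer` are hypothetical and exist only for this illustration; the exact prompt text depends on the template shipped with the model.

```python
def render_prompt(tokenizer, messages):
    """Render chat messages into the prompt string defined by the model's chat template."""
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return text rather than token ids
        add_generation_prompt=True,  # end the prompt where the assistant turn begins
    )


def load_tokenizer(model_id="dicta-il/DictaLM-3.0-24B-Thinking"):
    """Load the tokenizer; imported lazily so render_prompt works with any compatible object."""
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_id)


# Example (downloads the tokenizer files on first use):
# print(render_prompt(load_tokenizer(), [{"role": "user", "content": "Hello, how are you?"}]))
```

Printing the rendered string is a quick way to check that a custom inference stack produces the same prompt as the libraries that apply the template automatically.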
+## Usage
+We recommend using vLLM, but you can use Transformers as well:
+### Transformers
+```python
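+from transformers import pipeline
+# Load the model into a text-generation pipeline (weights are downloaded on first use)
+generator = pipeline('text-generation', model="dicta-il/DictaLM-3.0-24B-Thinking")
+messages = [
+    # "What is your favorite sauce?"
+    {"role": "user", "content": "איזה רוטב אהוב עליך?"},
+    # An earlier assistant reply, about freshly squeezed lemon juice
+    {"role": "assistant", "content": "טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!"},
+    # "Do you have any mayonnaise recipes?"
+    {"role": "user", "content": "האם יש לך מתכונים למיונז?"}
+]
+# The pipeline returns the whole conversation; print only the last, newly generated message
+print(generator(messages)[0]['generated_text'][-1])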
+```
+### vLLM
+```bash
+vllm serve dicta-il/DictaLM-3.0-24B-Thinking --enable-auto-tool-choice --tool-call-parser hermes --reasoning-parser deepseek_r1
+```
+You can then query the server with the `openai` Python library:
+```python
+from openai import OpenAI
+client = OpenAI(
+    base_url="http://localhost:8000/v1",
+    api_key="sk-no-key-required"
+)
+response = client.chat.completions.create(
+    model="dicta-il/DictaLM-3.0-24B-Thinking",
+    messages=[
+        {"role": "user", "content": "Hello, how are you?"}
+    ],
+)
+print(response.choices[0].message.content)
+```
+> The reasoning traces should be available in the designated field of the response structure.
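
As a rough sketch of reading that field: the attribute name is an assumption that depends on the vLLM version. The vLLM reasoning-outputs documentation has used `reasoning_content`, and some newer releases may expose `reasoning` instead, so the hypothetical helper `split_reasoning` below checks both.

```python
def split_reasoning(message):
    """Return (reasoning, answer) from a chat completion message.

    The field names tried here are assumptions about what the serving
    stack returns; reasoning is None if neither field is present.
    """
    reasoning = None
    for field in ("reasoning_content", "reasoning"):
        value = getattr(message, field, None)
        if value:
            reasoning = value
            break
    return reasoning, message.content


# With the `response` object from the example above:
# reasoning, answer = split_reasoning(response.choices[0].message)
# print("Reasoning:", reasoning)
# print("Answer:", answer)
```

If the reasoning comes back as `None`, check that the server was started with a reasoning parser enabled.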
+The model supports tool calling, enabling integration with external tools and APIs. For an example of how to use tool calling, see the [vLLM documentation](https://docs.vllm.ai/en/stable/features/tool_calling/#tool-calling).
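
For orientation, here is a minimal sketch of a tool-calling round trip through the OpenAI-compatible API, assuming the server was started with `--enable-auto-tool-choice` as shown above. The `get_weather` tool, the `TOOLS` schema, and the helper names are hypothetical and serve only as illustration; they are not part of the model or of vLLM.

```python
import json


# Hypothetical tool for illustration only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"


# JSON-schema description of the tool, in the OpenAI "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call; arguments arrive as a JSON-encoded string."""
    return REGISTRY[name](**json.loads(arguments_json))


def ask_with_tools(client, question: str):
    """Send a question with the tool schema and run any tool calls the model requests."""
    response = client.chat.completions.create(
        model="dicta-il/DictaLM-3.0-24B-Thinking",
        messages=[{"role": "user", "content": question}],
        tools=TOOLS,
        tool_choice="auto",
    )
    calls = response.choices[0].message.tool_calls or []
    return [run_tool_call(c.function.name, c.function.arguments) for c in calls]
```

With the `client` from the vLLM example, `ask_with_tools(client, "What is the weather in Haifa?")` would return the results of whichever tools the model chose to call, or an empty list if it answered directly. A full application would also send the tool results back to the model in a follow-up `tool` message.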
+## Citation
+If you use this model, please cite:
+```bibtex
+@article{Shmidman2025DictaLM3,
+  title={{Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs}},
+  author={Shaltiel Shmidman and Avi Shmidman and Amir DN Cohen and Moshe Koppel},
+  year={2025},
+  publisher={{DICTA / Jerusalem, Israel}},
+  note={https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf}
+}
+```