saurabh5 committed on
Commit 9090ad1 · verified · 1 Parent(s): d7dde56

Update README.md

Files changed (1)
  1. README.md +48 -0
README.md CHANGED
@@ -150,6 +150,54 @@ Moo Moo the cow would certainly win.
 
  - reinforcement learning from verifiable rewards on the Dolci-Think-RL-7B dataset. This dataset consists of math, code, instruction-following, and general chat queries.
  - Datasets: [Dolci-Think-RL-7B](https://huggingface.co/datasets/allenai/Dolci-Think-RL-7B), [Dolci-Instruct-RL-7B](https://huggingface.co/datasets/allenai/Dolci-Instruct-RL-7B)
 
+ ## Inference & Recommended Settings
+ We evaluated our models with the following settings, and we recommend using them for generation:
+ - **temperature:** `0.6`
+ - **top_p:** `0.95`
+ - **max_tokens:** `32768`
+
+ ### transformers Example
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "allenai/Olmo-3-7B-Instruct-DPO"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="auto",
+ )
+
+ prompt = "Who would win in a fight - a dinosaur or a cow named MooMoo?"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(
+     **inputs,
+     do_sample=True,  # enable sampling so temperature/top_p take effect
+     temperature=0.6,
+     top_p=0.95,
+     max_new_tokens=32768,
+ )
+
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
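+ Since this is the chat-tuned DPO/Instruct checkpoint, you may get better results by formatting the prompt with the tokenizer's chat template, assuming the repository ships one. A minimal sketch reusing `tokenizer`, `model`, and `prompt` from the example above:
+ ```python
+ # Hypothetical variant of the example above: wrap the prompt as a chat message
+ # and let the tokenizer's chat template (assumed to be present) format it.
+ messages = [{"role": "user", "content": prompt}]
+ chat_inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ outputs = model.generate(
+     chat_inputs,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.95,
+     max_new_tokens=32768,
+ )
+
+ # Decode only the newly generated tokens.
+ print(tokenizer.decode(outputs[0][chat_inputs.shape[-1]:], skip_special_tokens=True))
+ ```
+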
+ ### vllm Example
+ ```python
+ from vllm import LLM, SamplingParams
+
+ model_id = "allenai/Olmo-3-7B-Instruct-DPO"
+ llm = LLM(model=model_id)
+
+ sampling_params = SamplingParams(
+     temperature=0.6,
+     top_p=0.95,
+     max_tokens=32768,
+ )
+
+ prompt = "Who would win in a fight - a dinosaur or a cow named MooMoo?"
+
+ outputs = llm.generate(prompt, sampling_params)
+ print(outputs[0].outputs[0].text)
+ ```
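+
+ The same settings can also be passed through vLLM's OpenAI-compatible server (started, for example, with `vllm serve allenai/Olmo-3-7B-Instruct-DPO`). A minimal sketch, assuming a server is already listening on `localhost:8000` and the `openai` client is installed:
+ ```python
+ from openai import OpenAI
+
+ # Assumes a vLLM OpenAI-compatible server is running locally; the API key is unused.
+ client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
+
+ response = client.chat.completions.create(
+     model="allenai/Olmo-3-7B-Instruct-DPO",
+     messages=[{"role": "user", "content": "Who would win in a fight - a dinosaur or a cow named MooMoo?"}],
+     temperature=0.6,
+     top_p=0.95,
+     max_tokens=32768,
+ )
+ print(response.choices[0].message.content)
+ ```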
 
  ## Bias, Risks, and Limitations
  Like any base language model or fine-tuned model without safety filtering, these models can easily be prompted by users to generate harmful and sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, statements from OLMo, as from any LLM, are often inaccurate, so facts should be verified.