Naholav/deep-think

@@ -31,9 +31,13 @@ This is the best performing checkpoint from the **deep_think** training configur
 - **Prompt Style:** Think (uses `<think>` tags for reasoning)
 - **System Prompt:** "You are an expert programmer. Use <think> tags for reasoning before writing code."
-- **LoRA Rank:** 64
-- **LoRA Alpha:** 128
-- **Learning Rate:** 2e-4
+- **LoRA Rank:** 32
+- **LoRA Alpha:** 64
+- **LoRA Dropout:** 0.05
+- **Learning Rate:** 5e-5
+**Note:** All 4 models were trained with identical hyperparameters for fair comparison. Better configurations may be discovered through hyperparameter search methods (e.g., grid search, random search).
 ## All Models Performance Comparison
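
A minimal sketch of what this hunk changes, assuming the standard LoRA convention that the low-rank update is scaled by `alpha / rank`. Under that assumption the old (64/128) and new (32/64) settings keep the same scale of 2.0, so the effective changes are the smaller rank, the added dropout, and the lower learning rate. The `LoraHyperparams` class, the `grid` helper, and the candidate values in `RANKS` and `LEARNING_RATES` are hypothetical and are not taken from the model card; they only illustrate how the grid search mentioned in the note could enumerate configurations.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class LoraHyperparams:
    """Hypothetical container for the values listed in the model card."""
    rank: int
    alpha: int
    learning_rate: float
    dropout: Optional[float] = None  # None: value not listed in the card

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank


# Values before and after the change shown in the diff.
old = LoraHyperparams(rank=64, alpha=128, learning_rate=2e-4)
new = LoraHyperparams(rank=32, alpha=64, learning_rate=5e-5, dropout=0.05)

# Assumed search space, for illustration only.
RANKS = (16, 32, 64)
LEARNING_RATES = (5e-5, 1e-4, 2e-4)


def grid(ranks=RANKS, learning_rates=LEARNING_RATES, alpha_ratio=2, dropout=0.05):
    """Enumerate every (rank, learning rate) pair, holding alpha / rank fixed."""
    return [
        LoraHyperparams(rank=r, alpha=alpha_ratio * r, learning_rate=lr, dropout=dropout)
        for r, lr in product(ranks, learning_rates)
    ]


candidates = grid()
print(old.scaling, new.scaling)  # 2.0 2.0
print(len(candidates))           # 9
```

Each candidate would still have to be trained and evaluated on the same benchmark to be comparable; this sketch only builds the list of configurations.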