trentmkelly/Qwen3-14B-MechaStalin

Text Generation

political-rewriting

text-generation-inference

trentmkelly committed on Jul 14

Commit 242817c · verified · 1 Parent(s): 915d88b

Create README.md

Files changed (1)

README.md +33 -0

README.md ADDED

	@@ -0,0 +1,33 @@

+---
+license: apache-2.0
+base_model: Qwen/Qwen3-14B
+tags:
+- peft
+- lora
+- grpo
+- political-rewriting
+- fine-tuned
+library_name: transformers
+pipeline_tag: text-generation
+---
+# Qwen3-14B-MechaStalin
+This model is a fine-tuned version of [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B), trained with GRPO and the RULER reward system to encourage left-wing beliefs.
+Like this model? Be sure to check out its cousin, [MechaHitler](https://huggingface.co/trentmkelly/Qwen3-14B-MechaHitler).
+## Training Details
+- **Base Model**: Qwen/Qwen3-14B
+- **Training Method**: GRPO with LoRA adapters
+- **LoRA rank**: 32
+- **LoRA alpha**: 32
+- **Learning rate**: 2e-5
+- **Batch size**: 2 (per device) × 4 (gradient accumulation steps) = 8 effective
+- **Generations per prompt**: 8
+- **Max completion length**: 2048 tokens
+## Disclaimer
+This model was trained for research purposes to study political bias in text generation. Its outputs are deliberately skewed toward left-wing positions, so use it responsibly and do not treat them as neutral.
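
The hyperparameters in the Training Details list imply a few derived quantities, sketched below. This is only an illustration, not the training code, which this commit does not include. The `TrainingConfig` class, its property names, and the reward values are hypothetical. The effective batch size follows the README's own arithmetic. The LoRA scaling uses the standard alpha / rank convention. The advantage function shows the group-relative normalization that GRPO is generally described as using, with a population standard deviation chosen here for simplicity; implementations differ on this detail.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters copied from the README's Training Details list."""

    lora_rank: int = 32
    lora_alpha: int = 32
    learning_rate: float = 2e-5
    per_device_batch_size: int = 2
    grad_accumulation_steps: int = 4
    generations_per_prompt: int = 8
    max_completion_tokens: int = 2048

    @property
    def effective_batch_size(self) -> int:
        # Matches the README's arithmetic: 2 (per device) x 4 (accumulation) = 8.
        return self.per_device_batch_size * self.grad_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the adapter update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def max_tokens_per_prompt_group(self) -> int:
        # Upper bound on generated tokens for one prompt's group of completions.
        return self.generations_per_prompt * self.max_completion_tokens


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one prompt's group of completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    cfg = TrainingConfig()
    print("effective batch size:", cfg.effective_batch_size)
    print("LoRA scaling:", cfg.lora_scaling)
    print("max tokens per prompt group:", cfg.max_tokens_per_prompt_group)

    # Hypothetical reward scores for the 8 completions of a single prompt.
    rewards = [0.1, 0.4, 0.4, 0.9, 0.2, 0.6, 0.7, 0.3]
    print("advantages:", group_relative_advantages(rewards))
```

With rank and alpha both set to 32, the LoRA scaling factor is 1.0, so the adapter update is applied unscaled. Completions that score above their group's mean reward receive positive advantages, and those below it receive negative ones.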