Models: 150

Active filters: RL

mradermacher/DiagAgent-7B-GGUF

8B • Updated Nov 2 • 130

mradermacher/Austral-70B-Winton-GGUF

71B • Updated Sep 6 • 422

mradermacher/Austral-70B-Winton-i1-GGUF

71B • Updated Sep 6 • 655

HYDARIM7/SmolLM2_RLHF_PPO_HY

Reinforcement Learning • 0.1B • Updated Sep 21 • 11

SII-Enigma/Qwen2.5-7B-Ins-AMPO

Text Generation • 2B • Updated Oct 15 • 18

SII-Enigma/Qwen2.5-7B-Ins-SFT-GRPO

Text Generation • 2B • Updated Oct 15 • 8

SII-Enigma/Llama3.2-8B-Ins-GRPO

Text Generation • 2B • Updated Oct 15 • 15 • 1

mradermacher/Llama3.2-8B-Ins-GRPO-GGUF

8B • Updated Oct 16 • 293 • 1

SII-Enigma/Qwen2.5-7B-Ins-SFT-AMPO

Text Generation • 8B • Updated Oct 15 • 8

SII-Enigma/Qwen2.5-7B-Ins-GRPO

Text Generation • 2B • Updated Oct 15 • 7

SII-Enigma/Qwen2.5-1.5B-Ins-AMPO

Text Generation • 2B • Updated Oct 15 • 9

SII-Enigma/Llama3.2-8B-Ins-AMPO

Text Generation • 8B • Updated Oct 15 • 7

SII-Enigma/Qwen2.5-1.5B-Ins-GRPO

Text Generation • 2B • Updated Oct 15 • 8

Ach0/GCPO-R1-1.5B

Text Generation • 2B • Updated Oct 11 • 14

mradermacher/GCPO-R1-1.5B-GGUF

2B • Updated Oct 11 • 128

mradermacher/GCPO-R1-1.5B-i1-GGUF

2B • Updated 4 days ago • 782

mradermacher/DeepHermes-Egregore-8B-131K-GGUF

Reinforcement Learning • 8B • Updated Oct 16 • 158 • 1

mradermacher/DeepHermes-Egregore-8B-131K-i1-GGUF

Reinforcement Learning • 8B • Updated about 4 hours ago • 488 • 1

stephenchungmh/thinker_r1_5b

2B • Updated Oct 16 • 11 • 1

stephenchungmh/thinker_q1_5b

2B • Updated Oct 16 • 7 • 1

stephenchungmh/thinker_r7b

8B • Updated Oct 16 • 6 • 1

aippolit/RENT-Qwen-7B

8B • Updated Oct 31 • 39 • 1

mradermacher/RENT-Qwen-7B-GGUF

8B • Updated Oct 31 • 163 • 1

mradermacher/RENT-Qwen-7B-i1-GGUF

8B • Updated 6 days ago • 1.47k • 1

beyoru/MinCoder-4B-Expert

Text Generation • 4B • Updated Nov 2 • 142 • 1

mradermacher/MinCoder-4B-Expert-GGUF

4B • Updated Nov 3 • 101 • 2

mradermacher/MinCoder-4B-Expert-i1-GGUF

4B • Updated 3 days ago • 1.22k • 1

beyoru/MaxCoder-4B

Text Generation • 4B • Updated Nov 7 • 3 • 1

arubittu/MathReasoner-Mini-1.5b

Text Generation • 2B • Updated 19 days ago • 72 • 1

mradermacher/MathReasoner-Mini-1.5b-GGUF

2B • Updated 20 days ago • 352