bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF Text Generation • 24B • Updated about 13 hours ago • 13.3k • 22
Article • Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand • 6 days ago • 58
The Bestiary • Collection • Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated 24 days ago • 70