πŸŽ‰ llama.cpp Support Now Available!

I'm excited to announce that IQuest-Loop-Instruct models are now fully supported in llama.cpp! πŸš€

To my knowledge, this is the first implementation of loop attention in the GGUF ecosystem.

What's New:

βœ… Full loop attention support - Dual attention with learned per-head gates
βœ… GGUF conversion - Convert PyTorch checkpoints to GGUF format (see the sketch after this list)
βœ… Quantization support - Q4_K_M, Q5_K_M, and Q8_0 available
βœ… Production ready - Tested and working with text generation
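
Want to do the conversion yourself? The standard llama.cpp tooling applies; here's a minimal sketch (paths are placeholders, adjust to your checkout):

# Convert the PyTorch/safetensors checkpoint to GGUF at F16
python convert_hf_to_gguf.py /path/to/IQuest-Coder-V1-40B-Loop-Instruct \
    --outfile IQuest-Coder-V1-40B-Loop-Instruct-f16.gguf \
    --outtype f16

# Quantize (Q5_K_M and Q8_0 work the same way)
./llama-quantize IQuest-Coder-V1-40B-Loop-Instruct-f16.gguf \
    IQuest-Coder-V1-40B-Loop-Instruct-q4_k_m.gguf Q4_K_M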

Quick Start:

# Run inference
./llama-cli --model IQuest-Coder-V1-40B-Loop-Instruct-q4_k_m.gguf \
    --prompt "Write a function to reverse a linked list" \
    --n-predict 256
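
The quantized model should also work through the bundled HTTP server. The flags below are standard llama.cpp options; I haven't separately verified the loop-attention path under llama-server, so treat this as a sketch:

# Serve an OpenAI-compatible endpoint
./llama-server --model IQuest-Coder-V1-40B-Loop-Instruct-q4_k_m.gguf \
    --ctx-size 4096 --port 8080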

GGUF Models Available:

Pre-converted GGUF models: https://huggingface.co/Avarok/IQuest-Coder-V1-40B-Loop-Instruct-GGUF

Sizes:

  • F16: 75GB
  • Q8_0: 40GB
  • Q5_K_M: 27GB
  • Q4_K_M: 23GB

Technical Details:

The implementation includes:

  • Loop iteration wrapper (loop_num=2)
  • Global K/V caching from Loop 0
  β€’ Dual attention (local + global) with gate mixing (see the sketch after this list)
  • Full backwards compatibility with standard llama models
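
For intuition, the per-head gate mixing plausibly reduces to the convex combination below; the sigmoid parameterization is my assumption, not confirmed from the PR:

out_h = g_h * local_attn_h + (1 - g_h) * global_attn_h,   where g_h = sigmoid(gamma_h)

Here local_attn_h attends within the current loop iteration, global_attn_h attends over the K/V cached from Loop 0, and gamma_h is one learned scalar per head.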

PR to llama.cpp: https://github.com/ggml-org/llama.cpp/pull/18680

Performance:

Tested on IQuest-Coder-V1-40B-Loop-Instruct:

  • Prompt processing: ~3.4 t/s
  • Text generation: ~0.8 t/s
  β€’ Memory overhead: ~512MB for the global K/V cache (sizing formula below)
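
To estimate the global cache for other context lengths, standard K/V sizing arithmetic applies (generic, not specific to this PR; I haven't re-derived the ~512MB figure from the 40B config):

kv_bytes β‰ˆ 2 * n_layers * n_ctx * n_kv_heads * d_head * bytes_per_elem

The leading 2 covers K and V, and bytes_per_elem is 2 for F16 (less if the cache itself is quantized).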

Big thanks to the llama.cpp community and @ggerganov for the amazing ecosystem! πŸ™


Related:

https://github.com/ggml-org/llama.cpp/pull/18680

Note: the upstream PR was rejected by the llama.cpp maintainers, flagged as AI-generated content violating their contributor guidelines.
