the Qwen3-Coder-30B-A3B-Instruct-UD-IQ3_XXS.gguf is amazing on my RTX 5060 TI.

#19

by amine999 - opened about 1 month ago

about 1 month ago

my command line:
llama-server.exe -m Qwen3-Coder-30B-A3B-Instruct-UD-IQ3_XXS.gguf ^
--port 8080 ^
-ngl 99 ^
-c 80000 ^
-b 1024 ^
-ub 1024 ^
--cache-type-k q4_0 ^
--cache-type-v q4_0 ^
--temp 0.1 ^
--repeat-penalty 1.0 ^
--parallel 1 ^
--api-key sk-no-key-required

the generated code is just perfect, very fast, I got a decent context window, perfectly working with Roo Code (let it warm a bit when launched, or discard the very first request).
thanks a lot, you amazing people

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment