the Qwen3-Coder-30B-A3B-Instruct-UD-IQ3_XXS.gguf is amazing on my RTX 5060 TI.
#19
by
amine999
- opened
my command line:
llama-server.exe -m Qwen3-Coder-30B-A3B-Instruct-UD-IQ3_XXS.gguf ^
--port 8080 ^
-ngl 99 ^
-c 80000 ^
-b 1024 ^
-ub 1024 ^
--cache-type-k q4_0 ^
--cache-type-v q4_0 ^
--temp 0.1 ^
--repeat-penalty 1.0 ^
--parallel 1 ^
--api-key sk-no-key-required
the generated code is just perfect, very fast, I got a decent context window, perfectly working with Roo Code (let it warm a bit when launched, or discard the very first request).
thanks a lot, you amazing people