Wrong outputs from GGUFs?
First, thanks for making the abliterated version!
I appreciate it so much.
I am new to this field so I could be wrong, but I am afraid that both the Q4_K_M and Q8_0 GGUFs output wrong captions. I have not tried the f16 GGUF yet.
In my tests, most of the captions were well written and adhered to the instruction well, but they were clearly wrong descriptions of the image.
My tests were conducted with the suggested Thireus build of llama.cpp, with temperature=0.2 and max_tokens=2048.
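In case it helps anyone reproduce this, here is a minimal sketch of how a captioning request with those settings can be sent to a llama.cpp server's OpenAI-compatible endpoint. The server URL, image path, and prompt are placeholder assumptions, and I'm assuming llama-server was started with the GGUF and its matching mmproj file:

```python
import base64
import json
import urllib.request

# Assumed defaults -- adjust to your own setup.
SERVER_URL = "http://localhost:8080/v1/chat/completions"
IMAGE_PATH = "test.jpg"

def build_caption_request(image_path, prompt="Describe this image."):
    """Build an OpenAI-style chat request using the settings from my tests."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        "temperature": 0.2,   # the setting I tested with
        "max_tokens": 2048,   # the setting I tested with
    }

# To actually send it:
# payload = build_caption_request(IMAGE_PATH)
# req = urllib.request.Request(
#     SERVER_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Comparing the same image across the Q4_K_M, Q8_0, and Transformers outputs this way is how I noticed the mismatch.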
I am still working on this, but early sampling suggests that the original Transformers version is okay. It requires an H200 or B200 GPU, though...
The problem could originate from the Thireus build of llama.cpp, but investigating that is far beyond my ability.
So, if anyone is trying to use the GGUFs, please check the output thoroughly.
I hope this info helps.