9.31mb first part Q5?

by inritwritten - opened 5 days ago

5 days ago

I see the first part of Q5 is only 9.3mb. Before I download this, was there an upload problem with that file? Thanks for your hard work on these model conversions!

mtcl

5 days ago

I think it's intentional, it's the same in iq3.

ubergarm

Owner 5 days ago

@inritwritten

good eye, and yes, that is intentional, like @mtcl noticed. there is an argument in the gguf-split command (--no-tensor-first-split) that puts only metadata and no tensor data in the first split. that way, if the original model later updates something in the metadata, like the chat template, only the very small first split has to be re-uploaded and re-downloaded instead of the big tensor files.

i've not done it before so it is new for my collections. i believe the ggml-org folks do similar sometimes.

here is the full split command for example:

numactl -N "$SOCKET" -m "$SOCKET" \
./build/bin/llama-gguf-split \
    --dry-run \
    --split \
    --split-max-size 50G \
    --no-tensor-first-split \
    "$model" \
    /mnt/raid/hf/GLM-4.7-GGUF/smol-IQ1_KT/GLM-4.7-smol-IQ1_KT
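
if you want to check for yourself that a small first split is metadata-only rather than a truncated upload, here is a rough sketch that reads the tensor count straight out of the GGUF header. it assumes the GGUF v2/v3 header layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count) and a little-endian machine. the function name `gguf_tensor_count` is made up for this example and is not part of llama.cpp.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the tensor count stored in a GGUF file header.
# Assumed layout (GGUF v2/v3): bytes 0-3 magic "GGUF", 4-7 version (uint32),
# 8-15 tensor count (uint64), 16-23 metadata key/value count (uint64).
gguf_tensor_count() {
    local file="$1"
    # Refuse anything that does not start with the GGUF magic bytes.
    if [ "$(head -c 4 "$file")" != "GGUF" ]; then
        echo "not a GGUF file: $file" >&2
        return 1
    fi
    # od decodes integers in host byte order, so this assumes little-endian.
    # Skip the 8 bytes of magic + version, then read one 8-byte unsigned int.
    od -A n -t u8 -j 8 -N 8 "$file" | tr -d '[:space:]'
    echo
}
```

calling `gguf_tensor_count "$first_split"` (where `$first_split` is whatever path you downloaded the first part to) should print 0 if that split holds no tensors, which is what i would expect from a split made with --no-tensor-first-split. any larger number means the file carries tensor data.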
