danielhanchen 
posted an update 3 days ago
1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5

We gave 3 models the same prompt and compared one-shot outputs.

The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s.

Which output do you like best?
GGUF: unsloth/GLM-5.2-GGUF

Only GPT's physics makes sense, but GLM-5.2 seems to be a beast, all the more so at 1-bit.

Sample size n=1...
But seriously, it's impressive that a binary tree can output code that even executes. And at 1-bit there is no dynamic quantization (by definition): you can't go lower than 1-bit!

I took a look, and there is a wide array of different precisions in the "1-bit" quants. @danielhanchen, how can a "1-bit" model contain layers at Q8_0 and F32? Does "1-bit" refer only to the quantization of the ffn* layers? In that case, what is the average precision of the entire "1-bit" model?
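
For anyone who wants to estimate that average themselves, here is a rough sketch of the arithmetic: weight each tensor's bits per weight by its parameter count. The bits-per-weight figures follow from the ggml block layouts as I understand them (Q8_0 stores 32 int8 values plus one fp16 scale, IQ1_S uses 50 bytes per 256 weights). The tensor breakdown is entirely hypothetical and is not read from the GLM-5.2 GGUF, and I am assuming IQ1_S as the low-bit type purely for illustration.

```python
# Effective bits per weight for a few GGUF tensor types, derived from block sizes.
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "Q8_0": 8.5,      # 34 bytes per block of 32 weights
    "IQ1_S": 1.5625,  # 50 bytes per block of 256 weights
}


def average_bits_per_weight(tensors):
    """Parameter-weighted mean precision.

    tensors: iterable of (quant_type, n_params) pairs.
    """
    tensors = list(tensors)
    total_bits = sum(BITS_PER_WEIGHT[qtype] * n for qtype, n in tensors)
    total_params = sum(n for _, n in tensors)
    return total_bits / total_params


# HYPOTHETICAL breakdown for illustration only; not measured from any real file.
example = [
    ("IQ1_S", 900_000_000),  # e.g. the bulk of the ffn* weights
    ("Q8_0", 90_000_000),    # e.g. some attention or embedding tensors
    ("F32", 10_000_000),     # e.g. norms and other small tensors
]

avg = average_bits_per_weight(example)
print(f"{avg:.3f} bits per weight")
```

With this made-up split, the average lands near 2.5 bits per weight even though 90% of the parameters are in a sub-2-bit format, which is why a "1-bit" label can describe the dominant tensor type rather than the whole file. To get the real figure, substitute the per-tensor types and shapes listed for the actual GGUF.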