https://huggingface.co/cerebras/MiniMax-M2-REAP-162B-A10B

#1536 · opened by cocorang

We tried quantizing this model in the past, but it failed because it has a BPE pre-tokenizer that llama.cpp doesn't support. We could try quantizing it using the same pre-tokenizer as MiniMax-M2; it's just a pruned version of that model, so I don't see why it would need a different one.
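
For anyone wanting to try this locally, here is a minimal sketch of the usual convert-then-quantize pipeline with llama.cpp. It assumes a local snapshot of the model and a local llama.cpp checkout with a built llama-quantize binary; the paths, output names, and quant types are placeholders, not the exact setup used for the quants here.

```python
# Minimal sketch of the llama.cpp GGUF conversion + quantization pipeline.
# Assumptions: a local HF snapshot of the model and a local llama.cpp checkout
# with a built llama-quantize binary. Paths and quant types are illustrative.
import subprocess
from pathlib import Path

MODEL_DIR = Path("MiniMax-M2-REAP-162B-A10B")   # assumed local HF snapshot
LLAMA_CPP = Path("llama.cpp")                   # assumed llama.cpp checkout
F16_GGUF = Path("MiniMax-M2-REAP-162B-A10B.f16.gguf")

# Step 1: convert the HF checkpoint to a single f16 GGUF. This is where the
# earlier attempt failed: convert_hf_to_gguf.py rejects models whose BPE
# pre-tokenizer it does not recognize; the workaround discussed above would be
# forcing it to use the same pre-tokenizer as MiniMax-M2.
subprocess.run(
    [
        "python", str(LLAMA_CPP / "convert_hf_to_gguf.py"),
        str(MODEL_DIR),
        "--outfile", str(F16_GGUF),
        "--outtype", "f16",
    ],
    check=True,
)

# Step 2: produce the actual quants from the f16 base with llama-quantize
# (the binary's location depends on how llama.cpp was built).
for qtype in ("Q2_K", "Q4_K_M", "Q8_0"):
    out = Path(f"MiniMax-M2-REAP-162B-A10B.{qtype}.gguf")
    subprocess.run(
        [str(LLAMA_CPP / "llama-quantize"), str(F16_GGUF), str(out), qtype],
        check=True,
    )
```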

It somehow just worked this time without me having to do anything. Same llama.cpp version and, I think, the same version of the model, so I have no idea why it suddenly worked, but it did, and it has already been generating quants for the past 6 hours. :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#MiniMax-M2-REAP-162B-A10B-GGUF for quants to appear.
