Original FP8 Weights

#9
by Ano-Nimus - opened

I would like to use the original FP8 weights (https://huggingface.co/zai-org/GLM-4.7-FP8/) in GGUF form, but I don't see any GGUF here that is similar in size to GLM-4.7 FP8 (~362GB). Will there be a purely repackaged GGUF of their FP8 weights, or can you tell me how to make one myself?

Unsloth AI org

Did you manage to try our Q8_0 / Q8_K_XL version? llama.cpp doesn't really have FP8 support as of yet

What is the difference between Q8_0 and Q8_K_XL? I assume that at that precision we are not losing too much over FP8 (I hope), but there might be a performance difference.

Unsloth AI org

Compared with Q8_0, Q8_K_XL upcasts some important layers to BF16.
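For intuition on what 8-bit precision looks like here: GGUF's Q8_0 stores weights in blocks of 32 int8 values with one fp16 scale per block. Below is a minimal numpy sketch of that scheme (illustrative only, not llama.cpp's actual code); the reconstruction error per weight stays well below typical weight magnitudes.

```python
import numpy as np

BLOCK = 32  # Q8_0 block size in GGUF

def q8_0_quantize(x):
    # Split into blocks of 32; store one fp16 scale + 32 int8 values per block.
    x = x.reshape(-1, BLOCK)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale = scale.astype(np.float16).astype(np.float32)  # scale is stored as fp16
    q = np.divide(x, scale, out=np.zeros_like(x), where=scale != 0)
    q = np.clip(np.round(q), -127, 127).astype(np.int8)
    return q, scale

def q8_0_dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = q8_0_quantize(w)
w_hat = q8_0_dequantize(q, s).reshape(-1)
err = np.abs(w - w_hat).max()  # max rounding error is about half of one scale step
```

Q8_K_XL keeps a subset of layers (embeddings, attention output, etc. in Unsloth's dynamic scheme) in BF16 instead of applying this rounding to them.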

Did you manage to try our Q8_0 / Q8_K_XL version? llama.cpp doesn't really have FP8 support as of yet

Hi, I wanted to try something as close to the original model as I could, so I waited for a reply first. Since FP8 isn't currently available, I'll just use the Q8_K_XL version as you suggest.

Thank you guys for your work btw, I've been using your tutorials and models for a long time now 🤗

btw I don't know much about this, but from what you said I assume it's not currently possible to convert the FP8 weights directly to a Q8 GGUF, so to keep it lossless we would first need to convert to FP16/BF16?

Unsloth AI org

btw I don't know much about this, but from what you said I assume it's not currently possible to convert the FP8 weights directly to a Q8 GGUF, so to keep it lossless we would first need to convert to FP16/BF16?

yes
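For anyone curious why the FP8 → BF16 upcast is lossless: every FP8 value fits exactly into a wider float, so decoding loses nothing. Assuming the checkpoint uses the common e4m3fn layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits), here is a pure-numpy sketch of the decode; real tooling would do this via torch/transformers, this is only for illustration.

```python
import numpy as np

def fp8_e4m3_to_float32(b):
    # Decode FP8 e4m3fn bytes to float32 (illustrative sketch, not library code).
    b = np.asarray(b, dtype=np.uint8)
    sign = np.where(b & 0x80, -1.0, 1.0)
    exp = ((b >> 3) & 0x0F).astype(np.float32)
    man = (b & 0x07).astype(np.float32)
    normal = sign * np.exp2(exp - 7.0) * (1.0 + man / 8.0)   # exponent > 0
    subnormal = sign * np.exp2(-6.0) * (man / 8.0)           # exponent == 0
    out = np.where(exp > 0, normal, subnormal)
    # In e4m3fn, 0x7F / 0xFF encode NaN (there is no infinity).
    return np.where((b & 0x7F) == 0x7F, np.nan, out).astype(np.float32)
```

Once the tensors are in BF16 safetensors, llama.cpp's converter can produce a BF16 GGUF, which you would then quantize down to Q8_0 or similar.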

Ano-Nimus changed discussion status to closed
