Original FP8 Weights
I would like to use the original FP8 weights (https://huggingface.co/zai-org/GLM-4.7-FP8/) in GGUF form, but I don't see any GGUF here with a similar size to GLM-4.7 FP8 (~362GB). Will there be a purely repackaged GGUF of their FP8 weights, or can you tell me how to make one myself?
Did you manage to try our Q8_0 / Q8_K_XL version? llama.cpp doesn't really have FP8 support as of yet
What is the difference between Q8_0 and Q8_K_XL? I assume that at that precision we are not losing too much over FP8 (I hope), but there might be a performance difference.
Q8_K_XL upcasts some important layers to BF16 vs Q8_0
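To make the difference concrete, here is a minimal sketch of the Q8_0 scheme as used in llama.cpp: weights are split into blocks of 32 values, and each block stores one scale plus 32 signed 8-bit integers. The function names below are illustrative, not llama.cpp API; Q8_K_XL differs by keeping a few sensitive tensors in BF16 instead of quantizing them.

```python
BLOCK = 32  # Q8_0 block size in llama.cpp

def q8_0_quantize(block):
    """Quantize one block of 32 floats to (scale, list of int8 values)."""
    amax = max(abs(x) for x in block)
    d = amax / 127.0 if amax > 0 else 0.0          # per-block scale
    inv = 1.0 / d if d else 0.0
    qs = [max(-127, min(127, round(x * inv))) for x in block]
    return d, qs

def q8_0_dequantize(d, qs):
    """Reconstruct the approximate original floats from scale + int8s."""
    return [d * q for q in qs]

block = [(-1) ** i * i / 7.0 for i in range(BLOCK)]
d, qs = q8_0_quantize(block)
restored = q8_0_dequantize(d, qs)

# rounding error is at most half a quantization step per value
err = max(abs(a - b) for a, b in zip(block, restored))
assert err <= d / 2 + 1e-12
```

The per-block scale is why Q8_0 is already very close to the original weights; the XL variant just avoids even that rounding on the layers most sensitive to it.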
Hi, I wanted to try something as close to the original model as I could, so I waited for a reply first. Since FP8 isn't currently supported in llama.cpp, I will just use the Q8_K_XL version as you suggest.
Thank you guys for your work btw, I've been using your tutorials and models for a long time now 🤗
btw I don't know much about this, but from what you said I assume it's not currently possible to convert the FP8 weights directly to a Q8 GGUF, so to make the conversion lossless we would need to upcast to FP16/BF16 first?
yes
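The reason the FP16/BF16 intermediate step is lossless: FP8 E4M3 has only 4 exponent bits and 3 mantissa bits, so every finite FP8 value is exactly representable in BF16 (7 mantissa bits) or FP16 (10 mantissa bits). A minimal stdlib-only sketch of decoding one E4M3FN byte (the variant used for FP8 inference weights, with bias 7 and no infinities) illustrates this; the function name is mine, not any library's API:

```python
def fp8_e4m3fn_to_float(byte):
    """Decode one FP8 E4M3FN byte (1 sign, 4 exponent, 3 mantissa bits,
    exponent bias 7, no infinities) into a Python float."""
    s = (byte >> 7) & 1
    e = (byte >> 3) & 0xF
    m = byte & 0x7
    if e == 0xF and m == 0x7:              # E4M3FN reserves this code for NaN
        return float("nan")
    if e == 0:                             # subnormal: 2^-6 * (m/8)
        val = 2.0 ** -6 * (m / 8.0)
    else:                                  # normal: 2^(e-7) * (1 + m/8)
        val = 2.0 ** (e - 7) * (1.0 + m / 8.0)
    return -val if s else val

assert fp8_e4m3fn_to_float(0x38) == 1.0
assert fp8_e4m3fn_to_float(0x40) == 2.0
assert fp8_e4m3fn_to_float(0xB8) == -1.0
assert fp8_e4m3fn_to_float(0x7E) == 448.0  # largest finite E4M3FN value
```

So the practical path is: dequantize the FP8 safetensors to BF16 (exact), convert the BF16 checkpoint to GGUF with llama.cpp's conversion script, then optionally quantize that GGUF to Q8_0. Only the final Q8_0 step loses any information.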