Original FP8 Weights
I would like to use the original FP8 weights (https://huggingface.co/zai-org/GLM-4.7-FP8/) in GGUF form, but I don't see any GGUF here with a similar size to GLM-4.7 FP8 (~362GB). Will there be a purely repackaged GGUF of their FP8 weights, or can you tell me how to make one myself?
Did you manage to try our Q8_0 / Q8_K_XL version? llama.cpp doesn't really have FP8 support as of yet
What is the difference between Q8_0 and Q8_K_XL? I assume that at that precision we are not losing too much over FP8 (I hope), but there might be a performance difference.
Q8_K_XL upcasts some important layers to BF16 vs Q8_0
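To make the difference concrete, here is a minimal sketch of the Q8_0 scheme as used in llama.cpp: weights are split into blocks of 32 values, and each block stores one scale plus 32 signed 8-bit integers. The function names below are illustrative, not llama.cpp API; Q8_K_XL differs by keeping a few sensitive tensors in BF16 instead of quantizing them.

```python
BLOCK = 32  # Q8_0 block size in llama.cpp

def q8_0_quantize(block):
    """Quantize one block of 32 floats to (scale, list of int8 values)."""
    amax = max(abs(x) for x in block)
    d = amax / 127.0 if amax > 0 else 0.0          # per-block scale
    inv = 1.0 / d if d else 0.0
    qs = [max(-127, min(127, round(x * inv))) for x in block]
    return d, qs

def q8_0_dequantize(d, qs):
    """Reconstruct the approximate original floats from scale + int8s."""
    return [d * q for q in qs]

block = [(-1) ** i * i / 7.0 for i in range(BLOCK)]
d, qs = q8_0_quantize(block)
restored = q8_0_dequantize(d, qs)

# rounding error is at most half a quantization step per value
err = max(abs(a - b) for a, b in zip(block, restored))
assert err <= d / 2 + 1e-12
```

The per-block scale is why Q8_0 is already very close to the original weights; the XL variant just avoids even that rounding on the layers most sensitive to it.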
Hi, I wanted to try something as close to the original model as I could, so I waited for a reply first. Since FP8 isn't currently supported in llama.cpp, I will just use the Q8_K_XL version as you suggest.
Thank you guys for your work btw, I've been using your tutorials and models for a long time now 🤗
btw I don't know much about this, but from what you said I assume it's not currently possible to convert the FP8 weights directly to a Q8 GGUF, so to make the conversion lossless we would need to upcast to FP16/BF16 first?
yes
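The reason the FP16/BF16 intermediate step is lossless: FP8 E4M3 has only 4 exponent bits and 3 mantissa bits, so every finite FP8 value is exactly representable in BF16 (7 mantissa bits) or FP16 (10 mantissa bits). A minimal stdlib-only sketch of decoding one E4M3FN byte (the variant used for FP8 inference weights, with bias 7 and no infinities) illustrates this; the function name is mine, not any library's API:

```python
def fp8_e4m3fn_to_float(byte):
    """Decode one FP8 E4M3FN byte (1 sign, 4 exponent, 3 mantissa bits,
    exponent bias 7, no infinities) into a Python float."""
    s = (byte >> 7) & 1
    e = (byte >> 3) & 0xF
    m = byte & 0x7
    if e == 0xF and m == 0x7:              # E4M3FN reserves this code for NaN
        return float("nan")
    if e == 0:                             # subnormal: 2^-6 * (m/8)
        val = 2.0 ** -6 * (m / 8.0)
    else:                                  # normal: 2^(e-7) * (1 + m/8)
        val = 2.0 ** (e - 7) * (1.0 + m / 8.0)
    return -val if s else val

assert fp8_e4m3fn_to_float(0x38) == 1.0
assert fp8_e4m3fn_to_float(0x40) == 2.0
assert fp8_e4m3fn_to_float(0xB8) == -1.0
assert fp8_e4m3fn_to_float(0x7E) == 448.0  # largest finite E4M3FN value
```

So the practical path is: dequantize the FP8 safetensors to BF16 (exact), convert the BF16 checkpoint to GGUF with llama.cpp's conversion script, then optionally quantize that GGUF to Q8_0. Only the final Q8_0 step loses any information.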