
What are the benchmarks of the 4 bit model vs the FP8 model?

#9
by Grossor - opened

What it says in the title. I'd like to know how much we "lose" by running this particular 4-bit quant vs the FP8 model.

StepFun org

Hi @Grossor , due to time constraints, before release we only did a sanity check by running HMMT'25 Feb, a challenging math benchmark that requires long reasoning (>64K tokens in some cases). Here are the scores we got:

vllm-bf16-baseline 98.44%
step3p5_flash_Q4_K_S.gguf 97.50%

I would say there is minimal loss, and it is still (one of) the most powerful models that can run in 128GB of unified memory.
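For a rough sense of scale, the gap between the two scores above works out to under one point absolute (and under 1% relative) on this single benchmark:

```python
# Scores from the reply above (HMMT'25 Feb, single run)
bf16 = 98.44   # vllm-bf16-baseline
q4 = 97.50     # step3p5_flash_Q4_K_S.gguf

abs_drop = bf16 - q4              # absolute drop in points
rel_drop = abs_drop / bf16 * 100  # relative drop in percent

print(f"absolute drop: {abs_drop:.2f} points")  # 0.94 points
print(f"relative drop: {rel_drop:.2f}%")        # 0.95%
```

Note this is a single benchmark and a single run, so treat it as a sanity check rather than a full quantization-quality evaluation.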

thanks!
