Amazing

#2
by ehartford - opened

Thanks for this!
How can I do this myself?
Is it in LLM Compressor?

Red Hat AI org

We created this using the Speculators repository: https://github.com/vllm-project/speculators

There are a few small changes we had to make to support Gemma 4, but we are looking to land those very soon so you can try it out yourself!

Exciting project!
