Amazing
#2
by ehartford - opened
Thanks for this!
How can I do this myself?
Is it in LLM Compressor?
We created this using the Speculators repository: https://github.com/vllm-project/speculators
We had to make a few small changes to support Gemma 4, but we are looking to land those very soon so you can try it out yourself!
Exciting project!