danielhanchen posted an update 9 days ago

Fits almost perfectly into an A6000!

·

Hopefully it runs fast for you! :)

I run it on a Threadripper 3970X with 256 GB of system RAM, offloading compute layers to a GTX 1660 with 6 GB of VRAM. I use llama.cpp with -nkvo and -kvu and keep all the MoE layers on the CPU. The generation speed is amazing: 14 t/s with Q8_0. I'm amazed.
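For anyone who wants to reproduce a setup along these lines, here is a hedged sketch of a llama.cpp invocation (assuming a recent build; the model filename and prompt are placeholders, and the `-ot` regex is the common pattern for pinning MoE expert tensors to the CPU, not something stated in the comment above):

```shell
# Sketch, not a verified command. Flags used:
#   -ngl 99                   offload the non-expert layers to the GPU
#   -ot ".ffn_.*_exps.=CPU"   keep the MoE expert tensors in system RAM
#   -nkvo                     do not offload the KV cache to VRAM
#   -kvu                      use a unified KV cache buffer
./llama-cli -m ./model-Q8_0.gguf -ngl 99 -ot ".ffn_.*_exps.=CPU" -nkvo -kvu -p "Hello"
```

With only 6 GB of VRAM, the combination of `-nkvo` and keeping experts on the CPU is what makes a large MoE model fit at all; the GPU then accelerates just the dense layers.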

·

Awesome to hear, thanks for trying them out!

Awesome!