bitmamba-zen-distillix-300m

This is a 1.58-bit BitMamba (Zen) model trained via Superposition Distillation.

Model Details

  • Teacher: Qwen/Qwen2.5-Coder-1.5B-Instruct
  • Architecture: Mamba SSM (Zen Variant) + BitLinear (1.58-bit weights)
  • Params: ~278M
  • Training Data: CodeAlpaca (Distilled)
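The "1.58-bit" in BitLinear refers to ternary weights: each weight is constrained to {-1, 0, +1}, which carries log2(3) ≈ 1.58 bits of information. The model card does not spell out the exact quantizer, so the sketch below shows the common BitNet-b1.58-style absmean scheme as an illustration (the function name `absmean_ternary_quantize` is ours, not from this repo):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Illustrative ternary quantizer (BitNet b1.58 style, not necessarily
    this repo's exact recipe): scale by the mean absolute weight, then
    round and clip each entry to {-1, 0, +1}."""
    scale = np.abs(w).mean() + eps          # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)  # ternary codes
    return q, scale

# Example: a small weight matrix collapses to ternary codes plus one scale.
w = np.array([[0.9, -0.05, -1.2],
              [0.1,  0.4,  -0.3]])
q, s = absmean_ternary_quantize(w)
# q contains only -1, 0, +1; w is approximated by q * s at matmul time.
```

At inference time the matmul can then be computed against the ternary codes and rescaled once by `s`, which is where the memory and compute savings come from.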

Usage

This model requires the custom BitMambaStudent class definition to run; it cannot be loaded with a stock transformers AutoModel. It was trained as a proof of concept for Superposition Distillation.
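The card names Superposition Distillation but does not describe the procedure, so as a reference point the sketch below shows only the conventional teacher-student component of knowledge distillation: matching the student's temperature-softened output distribution to the teacher's via KL divergence. The function names and the temperature value are illustrative assumptions, not this repo's training code:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_kl(teacher_logits: np.ndarray,
               student_logits: np.ndarray,
               temperature: float = 2.0) -> float:
    """Standard logit-distillation term (illustration only): mean KL from
    the teacher's softened distribution to the student's, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)
    return float(kl.mean() * temperature ** 2)
```

The loss is zero when student and teacher logits agree and grows as the student's distribution drifts from the teacher's; whatever Superposition Distillation adds on top, a term of this shape is the usual starting point.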
