Fish Audio S2 Pro MLX 4-bit

MLX 4-bit quantization of fishaudio/s2-pro, produced with mlx-audio on Apple Silicon.

Built with Fish Audio.

Provenance

Converted directly from fishaudio/s2-pro bf16 safetensors via python -m mlx_audio.convert ... -q --q-bits 4 --q-group-size 64. The codec.pth audio decoder, tokenizer, and chat template are preserved alongside the quantized LLM weights.

Quantization

  • Method: MLX affine quantization, group_size=64, 4-bit
  • Actual bits per weight after conversion: ~5.0 (some small projections are skipped and remain bf16 — this is expected behavior for mlx.nn.quantize on multimodal models)
  • Size on disk: ~2.4 GB (vs ~8 GB for bf16)
  • Target: Apple Silicon with ≥ 16 GB unified memory (comfortable on M-series with 24 GB+)
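The ~5.0 effective bits per weight can be reproduced with back-of-the-envelope arithmetic. This is a rough sketch, assuming MLX affine quantization stores one fp16 scale and one fp16 bias per group of 64 weights (so 4-bit / group-64 costs 4.5 bits per weight) and that a small fraction of weights remains in bf16; the 0.0435 fraction below is illustrative, not measured:

```python
# Effective bits-per-weight for a mix of 4-bit-quantized and bf16 tensors.

def quantized_bpw(bits=4, group_size=64):
    # Packed values plus one fp16 scale and one fp16 bias per group.
    return bits + 2 * 16 / group_size  # 4 + 32/64 = 4.5 for 4-bit, group 64

def effective_bpw(bf16_fraction, bits=4, group_size=64):
    # Weighted average over quantized and skipped (bf16) parameters.
    q = quantized_bpw(bits, group_size)
    return (1 - bf16_fraction) * q + bf16_fraction * 16

print(quantized_bpw())                   # 4.5
print(round(effective_bpw(0.0435), 2))   # 5.0
```

Keeping only ~4% of parameters in bf16 is enough to push the average from 4.5 to the reported ~5.0 bits per weight.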

Architecture

fish_qwen3_omni — a Qwen3-based decoder-only LLM that emits audio tokens consumed by the codec.pth DualAR codec. The text side is quantized; the codec is kept as-is.
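The DualAR generation scheme referenced above can be pictured as two nested autoregressive loops: a slow transformer advances once per audio frame, and a fast transformer fills in that frame's codebook tokens. The following is a conceptual toy sketch only, not the mlx_audio API; slow_step and fast_step are stand-ins for the real networks:

```python
# Toy illustration of a DualAR decode loop (stand-in functions, not real model code).

def slow_step(context):
    # Stand-in for one step of the Qwen3-based LLM over the token history.
    return len(context)

def fast_step(hidden, codes_so_far, codebook_index):
    # Stand-in for the small fast transformer predicting one codebook entry.
    return hidden + codebook_index

def generate_frames(prompt_tokens, n_frames, n_codebooks=8):
    context = list(prompt_tokens)
    frames = []
    for _ in range(n_frames):
        hidden = slow_step(context)          # one slow step per audio frame
        codes = []
        for cb in range(n_codebooks):        # fast AR loop within the frame
            codes.append(fast_step(hidden, codes, cb))
        frames.append(codes)
        context.append(codes[0])             # feed a token back to the slow loop
    return frames  # in the real model, codec.pth decodes frames to a waveform

frames = generate_frames([101, 102], n_frames=3)
print(len(frames), len(frames[0]))  # 3 8
```

The quantization above touches only the slow LLM side; the frame-to-waveform decoder (codec.pth) runs unmodified.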

Quickstart

from mlx_audio.tts import load

# Loads the quantized weights and codec.pth from the Hugging Face Hub
model = load("majentik/fishaudio-s2-pro-MLX-4bit")

# Synthesize speech from text using a built-in voice
audio = model.generate(
    text="Hello, this is a test of Fish S2 Pro at 4-bit MLX.",
    voice="en_default",
)

Fish S2 Pro supports zero-shot voice cloning from a reference audio sample; see the base model card for the reference-conditioning API.

License & Use Restrictions

This model is a derivative work of Fish Audio S2 Pro, licensed under the Fish Audio Research License.

  • Research + Non-Commercial use: free, subject to the license.
  • Commercial use: requires a separate written agreement with Fish Audio — see https://fish.audio or email [email protected].
  • Use restriction: outputs must not be used to create or improve any foundational generative AI model (excluding the Model itself or derivatives of it).
  • Attribution: downstream use must include this NOTICE and display "Built with Fish Audio" in any product/service/interface that surfaces model outputs.

See LICENSE.md and NOTICE files bundled with this repository.

Languages

English, Chinese, Japanese, Korean, Spanish, Portuguese, Arabic, Russian, French, German (per base model card).
