OmniVoice LoRA – French (fr)

A LoRA adapter fine-tuned on OmniVoice to improve zero-shot voice cloning quality for French.

Training Details

  • Base model: k2-fsa/OmniVoice (Qwen3-0.6B backbone)
  • Method: LoRA (rank=32, alpha=64, RSLoRA)
  • Target modules: Self-attention + audio projection layers
  • Training data: Best-of-N distilled French speech samples
  • Steps: 200
  • Precision: bf16
  • Hardware: NVIDIA A40 (48GB)
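The rank/alpha settings above interact with the RSLoRA option: standard LoRA scales the adapter update by alpha/r, while rank-stabilized LoRA (RSLoRA) scales it by alpha/sqrt(r), keeping the update magnitude stable as rank grows. A quick worked comparison for the values used here:

```python
import math

# Hyperparameters from the training details above.
rank, alpha = 32, 64

# Standard LoRA scaling: alpha / r
standard_scaling = alpha / rank              # 2.0

# RSLoRA scaling: alpha / sqrt(r)
rslora_scaling = alpha / math.sqrt(rank)     # ~11.31

print(f"standard: {standard_scaling:.2f}, rslora: {rslora_scaling:.2f}")
```

With rank 32, RSLoRA effectively applies a much larger scaling factor (about 11.3 vs 2.0), which is why alpha values tuned for standard LoRA are not directly comparable.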

Usage

from omnivoice import OmniVoice
from peft import PeftModel
import torch

# Load base model
model = OmniVoice.from_pretrained(
    "k2-fsa/OmniVoice",
    device_map="cuda:0",
    dtype=torch.float16,
)

# Load the LoRA adapter, then merge its weights into the base model
# so inference runs without the PEFT wrapper overhead
model.llm = PeftModel.from_pretrained(model.llm, "amanuelbyte/omnivoice-lora-fr")
model.llm = model.llm.merge_and_unload()

# Generate
audio = model.generate(
    text="Your French text here",
    ref_audio="path/to/reference.wav",
)
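To write the result to disk, one option is a small standard-library helper. This is a sketch that assumes `generate` returns mono float samples in [-1, 1]; the actual return type and sample rate depend on OmniVoice, and 24 kHz below is only a placeholder:

```python
import struct
import wave

def save_wav(samples, path, sample_rate=24000):
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 16-bit PCM
        f.setframerate(sample_rate)
        # Clamp to [-1, 1] and convert to little-endian int16
        pcm = struct.pack(
            f"<{len(samples)}h",
            *(int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
        )
        f.writeframes(pcm)

# Dummy samples stand in for the model output here:
save_wav([0.0, 0.5, -0.5, 1.0], "output.wav")
```

If `generate` instead returns a PyTorch tensor, convert it first with `audio.cpu().numpy().tolist()` (or pass the array directly, since the helper only iterates over the samples).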

IWSLT 2026

This adapter was developed for the IWSLT 2026 shared task on cross-lingual voice cloning.
