On-device neural machine translation for 200 languages using CoreML on Apple devices (iPhone, iPad, Mac).
This is a CoreML conversion of facebook/nllb-200-distilled-600M.
.
├── NLLB_Encoder_128.mlpackage   # Encoder model (~1.5 GB)
├── NLLB_Decoder_128.mlpackage   # Decoder model (~1.7 GB)
├── tokenizer/                   # Tokenizer files
├── example.py                   # Ready-to-run example
└── language_codes.json          # Language code reference
pip install coremltools transformers
# Clone this repo
git lfs install
git clone https://huggingface.co/cstr/nllb-200-coreml-128
cd nllb-200-coreml-128
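After cloning, it can help to confirm that every entry from the file listing above actually arrived. The following is a minimal sketch: `missing_files` and `EXPECTED` are hypothetical names introduced here for illustration, and the check only tests for presence, so it will not detect Git LFS pointer files that were never resolved into real model weights.

```python
from pathlib import Path

# Entries copied from the file listing in this README.
EXPECTED = [
    "NLLB_Encoder_128.mlpackage",
    "NLLB_Decoder_128.mlpackage",
    "tokenizer",
    "example.py",
    "language_codes.json",
]


def missing_files(repo_dir="."):
    """Return the expected entries that are absent from repo_dir (hypothetical helper)."""
    root = Path(repo_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

Calling `missing_files()` from inside the cloned directory returns an empty list when everything is in place.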
from example import translate_text
# English to German
result = translate_text(
    "Hello, how are you today?",
    source_lang="eng_Latn",
    target_lang="deu_Latn"
)
print(result) # "Hallo, wie geht es dir heute?"
from example import translate_text
# English → Spanish
translate_text("Good morning!", "eng_Latn", "spa_Latn")
# → "¡Buenos días!"
# French → English
translate_text("Bonjour le monde", "fra_Latn", "eng_Latn")
# → "Hello world"
# Japanese → English
translate_text("こんにちは", "jpn_Jpan", "eng_Latn")
# → "Hello"
import coremltools as ct
from transformers import AutoTokenizer
class Translator:
    def __init__(self):
        # Load once, reuse for all translations
        self.encoder = ct.models.MLModel(
            "NLLB_Encoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL  # CPU, GPU, and Neural Engine
        )
        self.decoder = ct.models.MLModel(
            "NLLB_Decoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL
        )
        self.tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

    def translate(self, text, src_lang, tgt_lang):
        # Your translation logic here
        pass
# Create once
translator = Translator()
# Reuse many times (fast!)
translator.translate("Hello", "eng_Latn", "deu_Latn")
translator.translate("Goodbye", "eng_Latn", "fra_Latn")
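The `translate` method above is left as a placeholder. One common way to fill it in for an encoder-decoder model is greedy decoding, sketched below without any CoreML calls. Everything here is an assumption for illustration: `greedy_decode`, `step_fn`, `start_ids`, and `eos_id` are hypothetical names, `step_fn` stands in for whatever wrapper you write around the decoder package (its real input and output names are not documented in this README), and the default `max_len=128` assumes the `_128` suffix in the model file names is the maximum sequence length.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step_fn: Callable[[List[int]], Sequence[float]],
    start_ids: List[int],
    eos_id: int,
    max_len: int = 128,  # assumption: matches the "_128" suffix of the model files
) -> List[int]:
    """Greedy decoding: repeatedly append the highest-scoring next token.

    step_fn is a hypothetical wrapper around the decoder. Given the token ids
    produced so far, it returns one score per vocabulary entry for the next
    position.
    """
    ids = list(start_ids)
    while len(ids) < max_len:
        scores = step_fn(ids)
        # Index of the largest score, i.e. the argmax over the vocabulary.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


# Toy stand-in for the decoder: always predicts "previous id + 1".
def toy_step(ids):
    scores = [0.0] * 6
    scores[min(ids[-1] + 1, 5)] = 1.0
    return scores


print(greedy_decode(toy_step, start_ids=[0], eos_id=5))  # [0, 1, 2, 3, 4, 5]
```

In a real `translate` method, `start_ids` would typically hold the decoder start token followed by the target-language token, and the returned ids would be passed to the tokenizer's `decode` method.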
See language_codes.json for the full list of 200+ languages. Common examples:
| Language | Code |
|---|---|
| English | eng_Latn |
| German | deu_Latn |
| French | fra_Latn |
| Spanish | spa_Latn |
| Chinese (Simplified) | zho_Hans |
| Japanese | jpn_Jpan |
| Arabic | arb_Arab |
| Russian | rus_Cyrl |
Full list: NLLB Language Codes
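The codes in the table share one shape: a three-letter lowercase language code, an underscore, and a four-letter script code with an initial capital (for example `eng_Latn`). A small check like the one below can catch typos such as `en` or `eng_latn` before a model call. This is a sketch under that assumption: `is_wellformed_code` and `split_code` are hypothetical helpers, and they test the shape only, not whether the code appears in language_codes.json.

```python
import re

# Shape only: three lowercase letters, "_", then a capitalized four-letter script tag.
_CODE_RE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def is_wellformed_code(code: str) -> bool:
    """Return True if code looks like an NLLB language code such as 'eng_Latn'."""
    return _CODE_RE.fullmatch(code) is not None


def split_code(code: str):
    """Split 'zho_Hans' into ('zho', 'Hans'); raise ValueError on a malformed code."""
    if not is_wellformed_code(code):
        raise ValueError(f"Malformed language code: {code!r}")
    language, script = code.split("_")
    return language, script
```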
# Restrict inference to the CPU (skips the GPU and Neural Engine)
encoder = ct.models.MLModel(
    "NLLB_Encoder_128.mlpackage",
    compute_units=ct.ComputeUnit.CPU_ONLY
)
from example import translate_text
# Translate several texts, one call per text
texts = ["Hello", "Goodbye", "Thank you"]
translations = [translate_text(t, "eng_Latn", "deu_Latn") for t in texts]
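The list comprehension above calls the model once per input, including for repeated strings. When a batch contains duplicates, such as UI labels, a small wrapper can translate each distinct text only once. This is a sketch, not part of example.py: `translate_many` is a hypothetical helper, and its `translate` argument is assumed to be any callable with the same positional signature as `translate_text` as used in this README.

```python
from typing import Callable, Dict, Iterable, List


def translate_many(
    texts: Iterable[str],
    translate: Callable[[str, str, str], str],
    src_lang: str,
    tgt_lang: str,
) -> List[str]:
    """Translate texts in order, calling translate once per distinct string."""
    cache: Dict[str, str] = {}
    results: List[str] = []
    for text in texts:
        if text not in cache:
            cache[text] = translate(text, src_lang, tgt_lang)
        results.append(cache[text])
    return results
```

Usage would look like `translate_many(texts, translate_text, "eng_Latn", "deu_Latn")`, returning translations in the same order as the inputs.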
⚠️ Non-commercial use only per NLLB license