Romansh NLLB-200 1.3B CT2 Model Card
Romansh NLLB is a multilingual NMT model fine-tuned from the NLLB-200-Distilled 1.3B model (https://huggingface.co/facebook/nllb-200-distilled-1.3B). It translates between German and six Romansh varieties, which makes it the first model to generate fluent translations in each of these individual varieties.
The fine-tuning data consist of monolingual data in all Romansh varieties, which we back-translated into German using Gemini 2.5 Flash, and a smaller amount of natively parallel data. A more detailed overview of the data is given further below.
The model's vocabulary has been extended with six language tokens, one for each Romansh variety. The exact usage of the model is described below.
How to use
This section shows how to translate a sentence from German into Romansh.
Before using the model, make sure that ctranslate2 is installed. This library allows for efficient inference, and the model has already been converted to its format. The example code also imports huggingface_hub and transformers, so these packages need to be available as well.
Installation:
pip install ctranslate2
Further information on ctranslate2 can be found at https://github.com/OpenNMT/CTranslate2
Follow the steps below to translate a sentence.
- Load model and tokenizer from repo.
import ctranslate2
from huggingface_hub import snapshot_download
from transformers import NllbTokenizer
# Download the model from the HF hub. The model is already converted to CT2 format
model_path = snapshot_download(repo_id="ZurichNLP/romansh-nllb-1.3b-ct2")
translator = ctranslate2.Translator(model_path, device="cpu")
tokenizer = NllbTokenizer.from_pretrained("ZurichNLP/romansh-nllb-1.3b-ct2")
- Prepare model inputs.
source_lang_code = "deu_Latn" # German
target_lang_code = "rm-puter" # Puter
source_sentence = "Gestern hat es geschneit."
tokenizer.set_src_lang_special_tokens(source_lang_code)
source_token_ids = tokenizer.encode(source_sentence, add_special_tokens=True)
source_tokens = tokenizer.convert_ids_to_tokens(source_token_ids)
- Run the model, passing the target language code as a target prefix.
results = translator.translate_batch(
    [source_tokens],
    target_prefix=[[target_lang_code]],
    beam_size=4,
)
- Postprocess the translation by converting the output tokens back into text.
result = results[0]
target_tokens = result.hypotheses[0]
translation = tokenizer.decode(tokenizer.convert_tokens_to_ids(target_tokens), skip_special_tokens=True)
print(translation)
# Her ho que navieu.
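The steps above can be wrapped into one reusable function. The following is a minimal sketch, not part of the released code: the name `translate_sentences` is hypothetical, and it only uses the tokenizer and translator calls already shown above. It also assumes that several sentences can share one source and target language, and that the Romansh to German direction works by swapping the two language codes, which this card does not demonstrate explicitly.

```python
from typing import Any, List


def translate_sentences(
    translator: Any,
    tokenizer: Any,
    sentences: List[str],
    source_lang_code: str,
    target_lang_code: str,
    beam_size: int = 4,
) -> List[str]:
    """Translate a list of sentences, mirroring the steps shown above.

    `translator` is expected to behave like ctranslate2.Translator and
    `tokenizer` like NllbTokenizer (hypothetical helper, not library API).
    """
    # Tell the tokenizer which language token to attach to the source text.
    tokenizer.set_src_lang_special_tokens(source_lang_code)

    # CTranslate2 expects string tokens rather than token IDs.
    batch = [
        tokenizer.convert_ids_to_tokens(
            tokenizer.encode(sentence, add_special_tokens=True)
        )
        for sentence in sentences
    ]

    # One target prefix per sentence selects the output language.
    results = translator.translate_batch(
        batch,
        target_prefix=[[target_lang_code] for _ in batch],
        beam_size=beam_size,
    )

    # Keep the best hypothesis of each result and turn it back into text.
    return [
        tokenizer.decode(
            tokenizer.convert_tokens_to_ids(result.hypotheses[0]),
            skip_special_tokens=True,
        )
        for result in results
    ]
```

With the `translator` and `tokenizer` objects loaded as above, a call such as `translate_sentences(translator, tokenizer, ["Gestern hat es geschneit."], "deu_Latn", "rm-puter")` would return a list with one translation per input sentence.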
The six varieties and their respective language codes are listed in the table below.
| Romansh variety | Language Code |
|---|---|
| Rumantsch Grischun | rm-rumgr |
| Sursilvan | rm-sursilv |
| Sutsilvan | rm-sutsilv |
| Surmiran | rm-surmiran |
| Puter | rm-puter |
| Vallader | rm-vallader |
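To avoid typos in the language codes, the table can be kept as a small lookup in code. This is an illustrative sketch: the names `ROMANSH_LANG_CODES`, `GERMAN_LANG_CODE` and `lang_code_for` are hypothetical and not part of the model repository; only the codes themselves come from the table above.

```python
# Language codes copied from the table above.
ROMANSH_LANG_CODES = {
    "Rumantsch Grischun": "rm-rumgr",
    "Sursilvan": "rm-sursilv",
    "Sutsilvan": "rm-sutsilv",
    "Surmiran": "rm-surmiran",
    "Puter": "rm-puter",
    "Vallader": "rm-vallader",
}

# German code as used in the usage example.
GERMAN_LANG_CODE = "deu_Latn"


def lang_code_for(variety: str) -> str:
    """Return the language code for a Romansh variety name (case-insensitive)."""
    key = variety.strip().casefold()
    for name, code in ROMANSH_LANG_CODES.items():
        if name.casefold() == key:
            return code
    known = ", ".join(ROMANSH_LANG_CODES)
    raise ValueError(f"Unknown Romansh variety {variety!r}; expected one of: {known}")
```

The returned code can then be passed as the target prefix, for example `target_lang_code = lang_code_for("Puter")`.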
Training Data
Parallel data
| Name | Romansh varieties | URL | RM tokens | DE tokens |
|---|---|---|---|---|
| Dictionary data | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://pledarigrond.ch/rumantschgrischun | 3,298,051 | 6,530,231 |
| Mediomatix | Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://huggingface.co/datasets/ZurichNLP/mediomatix | 6,946,947 | - |
| Press releases of Canton Grisons | Rumantsch Grischun | https://www.gr.ch/RM/medias/communicaziuns/MMStaka/Seiten/AktuelleMeldungen.aspx | 1,840,537 | 1,397,196 |
| SwissLawTranslations | Rumantsch Grischun | https://huggingface.co/datasets/joelniklaus/SwissLawTranslations | 607,517 | 494,980 |
| Parallel data contributed by RTR | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | - | 646,856 | - |
| Storyweaver | Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | - | 48,475 | 9,170 |
| Total | | | 13,926,557 | 8,833,624 |
Monolingual data
| Name | Romansh varieties | URL | Tokens |
|---|---|---|---|
| FineWeb2 | Rumantsch Grischun | https://huggingface.co/datasets/HuggingFaceFW/fineweb-2 | 48,340,896 |
| La Quotidiana (1997–2008, 2021–2025) | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://huggingface.co/datasets/ZurichNLP/quotidiana | 38,993,608 |
| FinePDFs | Rumantsch Grischun | https://huggingface.co/datasets/HuggingFaceFW/finepdfs | 18,856,327 |
| Mediomatix (unaligned) | Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://huggingface.co/datasets/ZurichNLP/mediomatix-raw | 4,639,563 |
| FineWiki | Rumantsch Grischun | https://huggingface.co/datasets/HuggingFaceFW/finewiki | 2,827,954 |
| Audio transcriptions by RTR | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://developer.srgssr.ch/en/apis/rtr-linguistic | 1,467,957 |
| Theater plays | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | https://huggingface.co/datasets/ZurichNLP/romansh_theater_plays | 1,079,943 |
| Municipal documents | Sursilvan, Sutsilvan, Surmiran, Vallader | https://huggingface.co/datasets/ZurichNLP/romansh-municipal-text-corpus | 318,308 |
| Historical Dictionary of Switzerland | Rumantsch Grischun | https://hls-dhs-dss.ch/rm/ | 234,715 |
| Revista digl noss Sulom | Surmiran | - | 187,247 |
| Babulins | Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, Vallader | - | 49,444 |
| Total | | | 116,995,962 |
Evaluation Results
German to Romansh BLEU:
| System | Rumantsch Grischun | Sursilvan | Sutsilvan | Surmiran | Puter | Vallader |
|---|---|---|---|---|---|---|
| Gemini 2.5 Flash | 40.3 | 28.0 | 10.6 | 16.9 | 22.0 | 27.3 |
| Gemini 3 Flash (preview) | 42.1 | 32.8 | 12.7 | 21.3 | 26.4 | 29.8 |
| Gemini 3 Pro (preview) | 45.3 | 37.1 | 17.3 | 27.1 | 34.7 | 36.1 |
| Romansh NLLB-1.3b-ct2 | 48.4 | 44.5 | 40.5 | 43.0 | 44.9 | 44.6 |
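As a reading aid for the table above, the short sketch below computes the per-variety BLEU difference between Romansh NLLB and Gemini 3 Pro (preview), plus an unweighted average per system. The scores are copied from the table. The averaging is only an illustration and not a metric reported by the authors, and the variable names are hypothetical.

```python
from statistics import mean

VARIETIES = [
    "Rumantsch Grischun", "Sursilvan", "Sutsilvan",
    "Surmiran", "Puter", "Vallader",
]

# German to Romansh BLEU scores, copied from the table above.
BLEU_DE_TO_RM = {
    "Gemini 3 Pro (preview)": [45.3, 37.1, 17.3, 27.1, 34.7, 36.1],
    "Romansh NLLB-1.3b-ct2": [48.4, 44.5, 40.5, 43.0, 44.9, 44.6],
}

# BLEU difference per variety (Romansh NLLB minus Gemini 3 Pro).
gains = {
    variety: round(nllb - gemini, 1)
    for variety, gemini, nllb in zip(
        VARIETIES,
        BLEU_DE_TO_RM["Gemini 3 Pro (preview)"],
        BLEU_DE_TO_RM["Romansh NLLB-1.3b-ct2"],
    )
}

# Unweighted average over the six varieties (illustrative only).
macro_average = {system: mean(scores) for system, scores in BLEU_DE_TO_RM.items()}

for variety, gain in gains.items():
    print(f"{variety}: {gain:+.1f} BLEU")
```

On these numbers, the difference is smallest for Rumantsch Grischun and largest for Sutsilvan.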
Romansh to German BLEU:
| System | Rumantsch Grischun | Sursilvan | Sutsilvan | Surmiran | Puter | Vallader |
|---|---|---|---|---|---|---|
| Gemini 3 Flash (preview) | 55.2 | 50.1 | 47.4 | 52.2 | 53.0 | 60.7 |
| Gemini 3 Pro (preview) | 55.8 | 50.8 | 48.7 | 52.2 | 53.8 | 60.8 |
| Romansh NLLB-1.3b-ct2 | 49.5 | 45.3 | 43.2 | 46.1 | 48.1 | 53.9 |
Romansh to German COMET:
| System | Rumantsch Grischun | Sursilvan | Sutsilvan | Surmiran | Puter | Vallader |
|---|---|---|---|---|---|---|
| Gemini 2.5 Flash | 93.7 | 93.1 | 89.8 | 91.7 | 92.3 | 92.7 |
| Gemini 3 Flash (preview) | 93.9 | 94.0 | 92.7 | 93.4 | 92.8 | 93.9 |
| Gemini 3 Pro (preview) | 93.8 | 93.8 | 92.5 | 93.4 | 93.0 | 93.8 |
| Romansh NLLB-1.3b-ct2 | 91.2 | 90.9 | 89.2 | 89.6 | 89.9 | 89.9 |
Acknowledgements
We thank RTR and Fundaziun Patrimoni Cultural RTR for their support. We are grateful to Zachary Hopton, Diana Merkle, Anna Rutkiewicz and Sudehsna Sivakumar for help with data curation, Uniun dals Grischs for contributing dictionary data for Puter and Vallader, and Giuanna Caviezel, Not Soliva and their seminar participants for helpful feedback. We also acknowledge the contribution of the native speakers who participated in the human evaluation study. For this publication, use was made of media data made available via Swissdox@LiRI by the Linguistic Research Infrastructure of the University of Zurich (see https://www.liri.uzh.ch/en/services/swissdox.html for more information).
License