OmniTranslate 1.1
OmniTranslate 1.1 is a massively multilingual machine translation model supporting over 500 languages. Fine-tuned from Qwen 3 0.6B (with Unsloth), this model is designed for translation tasks on any device!
Features
- 500+ Languages Supported: The broadest language coverage of any translation model under 1 billion parameters!
- Tiny Size: At well under 1 billion parameters, it beats much larger translation models on speed and memory usage.
Improvements over 1.0
- OmniTranslate now makes fewer mistakes when translating to Romanian (such as the stray "ami"), and the missing-diacritics bug in Romanian translations has been mostly fixed!
There's still a small chance the model will produce output without diacritics (mostly seed-dependent), so try a different seed if that happens.
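Since sampled generations depend on the random seed, one workaround is to fix a different seed before regenerating. This is an illustrative sketch only (not part of the model's API): it uses `torch.multinomial`, the same kind of draw that sampling-based `generate()` relies on, to show that fixing the global seed makes a sampling run reproducible, while a different seed can give a different generation:

```python
import torch

# Sampling is reproducible once the global RNG seed is fixed;
# a different seed gives a different (possibly diacritic-correct) run.
probs = torch.tensor([0.4, 0.3, 0.2, 0.1])

def draw(seed: int, n: int = 5) -> list[int]:
    torch.manual_seed(seed)  # hypothetical seed value; pick any integer
    return torch.multinomial(probs, n, replacement=True).tolist()

assert draw(0) == draw(0)  # same seed -> same tokens
```

The same principle applies to `model.generate(...)` when sampling is enabled: call `torch.manual_seed(...)` with a new value before generating again.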
Experimental Features
- We added 2 new languages, Emoji and Sulfuristic Speak (my own constructed language for OmniTranslate 1.1, made to fit the Chaos Cubed Minecraft vibe!). Try them out:
Emoji
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from the Hugging Face repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # Standard for CPU
    device_map="cpu",           # Forces CPU usage
)

# 2. Translate to Emoji
prompt = "<|im_start|>user\nTranslate to emj_Emoj: We love the world!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")
with torch.no_grad():
    # do_sample=True makes temperature take effect
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Sulfuristic Speak
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from the Hugging Face repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # Standard for CPU
    device_map="cpu",           # Forces CPU usage
)

# 2. Translate to Sulfuristic Speak (using the language name instead of sul_Latn also works)
prompt = "<|im_start|>user\nTranslate to sul_Latn: Let's ride a Sulfur Cube!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")
with torch.no_grad():
    # do_sample=True makes temperature take effect
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Notes
- OmniTranslate 1.1 is still an experimental model and shouldn't be used for tasks where accurate translations matter.
- Providing the ISO code instead of the language name can improve the results a lot.
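To make the prompt format easy to reuse, here is a small helper (my own sketch, not part of the model's API) that builds the chat-style prompt used throughout this card from a language code and a source sentence:

```python
def build_prompt(lang_code: str, text: str) -> str:
    """Build the chat-format prompt OmniTranslate expects.

    lang_code is a FLORES-style code such as "ron_Latn"; per the note
    above, codes tend to work better than plain language names.
    """
    return (
        "<|im_start|>user\n"
        f"Translate to {lang_code}: {text}<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n"
    )

print(build_prompt("ron_Latn", "We love the world!"))
```

Pass the returned string straight to the tokenizer as in the examples below.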
Usage
The code below was written by Gemini 3 Flash (with some small modifications of my own):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from the Hugging Face repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # Standard for CPU
    device_map="cpu",           # Forces CPU usage
)

# 2. Translate (replace ron_Latn with your target language code)
prompt = "<|im_start|>user\nTranslate to ron_Latn: OmniTranslate is a massively multilingual machine translation model supporting over 500 languages!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")
with torch.no_grad():
    # do_sample=True makes temperature take effect
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
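The decoded string includes the prompt and the model's `<think>` reasoning block. Here is a best-effort helper (my own sketch; it assumes the model closes its reasoning with a `</think>` tag) to pull out just the final translation:

```python
def extract_translation(decoded: str) -> str:
    """Return the text after the last </think> tag, or after <think>
    if the closing tag never appears; falls back to the full string."""
    for marker in ("</think>", "<think>"):
        if marker in decoded:
            return decoded.rsplit(marker, 1)[-1].strip()
    return decoded.strip()
```

For example, `extract_translation("... <think>\nreasoning</think>\nSalut!")` returns just `"Salut!"`.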
Data Used
I used my own OmniSurgical 1.1 dataset, which itself contains part of HF's FineTranslations.
Uploaded finetuned model
- Developed by: MihaiPopa-1
- License: apache-2.0
- Finetuned from model: unsloth/qwen3-0.6b-unsloth-bnb-4bit
This qwen3 model was trained 2x faster with Unsloth and Huggingface's TRL library.