---
library_name: peft
base_model: google/gemma-3-4b-it
language:
- nso
- en
tags:
- translation
- african-languages
- scientific-translation
- afriscience-mt
- lora
- peft
- gemma
license: apache-2.0
pipeline_tag: translation
model-index:
- name: gemma_3_4b_it-lora-r16-nso-eng
  results:
  - task:
      type: translation
    metrics:
    - name: BLEU (test)
      type: bleu
      value: 37.07
    - name: chrF (test)
      type: chrf
      value: 57.33
    - name: SSA-COMET (test)
      type: comet
      value: 65.06
---
# gemma_3_4b_it-lora-r16-nso-eng
This is a LoRA adapter for the AfriScience-MT project, enabling efficient scientific machine translation for African languages.
## Adapter Description
| Property | Value |
|---|---|
| Base Model | google/gemma-3-4b-it |
| Translation Direction | Northern Sotho → English |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Training Method | QLoRA (4-bit quantization) |
| Domain | Scientific/Academic texts |
## Why LoRA?
LoRA (Low-Rank Adaptation) enables efficient fine-tuning by training only a small number of additional parameters. This adapter adds only ~8.0M parameters to the base model while achieving strong translation performance.
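As a rough sanity check on the ~8.0M figure: for each targeted linear layer of shape `d_out × d_in`, LoRA at rank `r` adds two low-rank matrices `A` (`r × d_in`) and `B` (`d_out × r`), i.e. `r · (d_in + d_out)` extra parameters. A minimal sketch (the 2560-wide projection is an illustrative assumption, not an official Gemma-3-4B dimension):

```python
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    """Extra parameters LoRA adds to one linear layer at rank r."""
    # A is (r x d_in), B is (d_out x r); the base weights stay frozen.
    return r * (d_in + d_out)

# Illustrative only: a square 2560 -> 2560 projection at rank 16
print(lora_params(2560, 2560))  # 81920 extra parameters for this one layer
```

Summing this over all targeted modules across all layers gives the adapter's total trainable parameter count.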
## Evaluation Results
Performance on the AfriScience-MT test set:
| Split | BLEU | chrF | SSA-COMET |
|---|---|---|---|
| Validation | 41.69 | 61.46 | 66.46 |
| Test | 37.07 | 57.33 | 65.06 |
**Metrics explanation:**
- BLEU: Measures n-gram overlap with reference translations (0-100, higher is better)
- chrF: Character-level F-score, robust for morphologically rich languages (0-100, higher is better)
- SSA-COMET: Neural metric trained for Sub-Saharan African languages, shown as percentage (0-100, higher is better) (McGill-NLP/ssa-comet-stl)
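For intuition on chrF, it is a character n-gram F-score (the official implementation is sacreBLEU's chrF with n up to 6 and recall weighted by β=2). A simplified, stdlib-only sketch of the idea — not the reference implementation:

```python
from collections import Counter

def chrf_toy(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: averaged char n-gram precision/recall, F-beta, scaled 0-100."""
    hyp = hypothesis.replace(" ", "")
    ref = reference.replace(" ", "")
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped n-gram matches
        if hyp_ngrams:
            precisions.append(overlap / sum(hyp_ngrams.values()))
        if ref_ngrams:
            recalls.append(overlap / sum(ref_ngrams.values()))
    p = sum(precisions) / len(precisions) if precisions else 0.0
    r = sum(recalls) / len(recalls) if recalls else 0.0
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)

print(round(chrf_toy("the cat sat", "the cat sat"), 1))  # 100.0
```

Because it operates on characters rather than words, chrF rewards partially correct morphology, which is why it is favored for morphologically rich languages like Northern Sotho.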
## Usage

### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Configure 4-bit quantization (recommended for memory efficiency)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-it",
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")

# Load LoRA adapter
adapter_name = "AfriScience-MT/gemma_3_4b_it-lora-r16-nso-eng"
model = PeftModel.from_pretrained(base_model, adapter_name)
model.eval()

# Prepare translation prompt (replace the placeholder with your Northern Sotho input)
source_text = "..."  # Northern Sotho scientific text goes here
instruction = "Translate the following Northern Sotho scientific text to English."

# Format for Gemma chat template
messages = [{"role": "user", "content": f"{instruction}\n\n{source_text}"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate translation
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        num_beams=5,
        early_stopping=True,
        pad_token_id=tokenizer.pad_token_id,
    )

# Decode only the newly generated tokens, not the prompt
generated = outputs[0][inputs["input_ids"].shape[1]:]
translation = tokenizer.decode(generated, skip_special_tokens=True)
print(translation)
```
### Without Quantization (Full Precision)

```python
# For GPUs with sufficient memory (roughly 24 GB+ in bfloat16 for this model)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base_model, "AfriScience-MT/gemma_3_4b_it-lora-r16-nso-eng")
```
## Training Details

### Hyperparameters
| Parameter | Value |
|---|---|
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 3 |
| Batch Size | 2 |
| Learning Rate | 2e-04 |
| Max Sequence Length | 512 |
| Gradient Accumulation | 4 |
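With gradient accumulation, the optimizer steps once per accumulated group of micro-batches, so the effective batch size is the per-device batch size times the accumulation steps. A quick check of the settings above:

```python
per_device_batch = 2   # Batch Size from the table above
grad_accum_steps = 4   # Gradient Accumulation from the table above

# Gradients are summed over 4 micro-batches before each optimizer step
effective_batch = per_device_batch * grad_accum_steps
print(effective_batch)  # 8 examples per optimizer step
```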
### Hardware Requirements
| Configuration | VRAM Required |
|---|---|
| 4-bit (QLoRA) | ~8-12 GB |
| 8-bit | ~16-20 GB |
| Full precision | ~24-40 GB |
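These figures track a back-of-the-envelope weight-memory estimate (bytes per parameter × parameter count); the extra headroom in the table covers activations, the KV cache, and — during training — optimizer state. A rough sketch, assuming ~4B parameters:

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

params = 4e9  # approximate parameter count of the base model
for label, bits in [("4-bit", 4), ("8-bit", 8), ("bf16", 16)]:
    print(f"{label}: ~{weight_memory_gb(params, bits):.0f} GB for weights alone")
```

The gap between these weight-only numbers and the table's VRAM ranges is the runtime overhead.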
## Reproducibility
To reproduce this adapter:
```bash
# Clone the AfriScience-MT repository
git clone https://github.com/afriscience-mt/afriscience-mt.git
cd afriscience-mt

# Install dependencies
pip install -r requirements.txt

# Run LoRA training
python -m afriscience_mt.scripts.run_lora_training \
    --data_dir ./data \
    --source_lang nso \
    --target_lang eng \
    --model_name google/gemma-3-4b-it \
    --model_type gemma \
    --lora_rank 16 \
    --output_dir ./output \
    --num_epochs 3 \
    --batch_size 4 \
    --load_in_4bit
```
## Limitations
- Domain Specificity: Optimized for scientific/academic texts; may underperform on casual or colloquial language.
- Language Direction: Only supports Northern Sotho → English translation.
- Base Model Required: Must be used with the google/gemma-3-4b-it base model.
- Context Length: Maximum context is model-dependent; longer texts should be chunked.
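For the context-length caveat, longer documents can be split on sentence boundaries before translation. A minimal chunking sketch (character budget only as a stand-in; a real pipeline would count tokens with the model's tokenizer, and the 400-character budget is an arbitrary assumption):

```python
import re

def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack whole sentences into chunks no longer than max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        # Start a new chunk if adding this sentence would exceed the budget
        if current and len(current) + 1 + len(sent) > max_chars:
            chunks.append(current)
            current = sent
        else:
            current = f"{current} {sent}".strip() if current else sent
    if current:
        chunks.append(current)
    return chunks

print(chunk_text("First sentence. Second sentence. Third sentence.", max_chars=20))
```

Each chunk can then be translated independently with the Quick Start prompt and the outputs concatenated.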
## Citation
If you use this adapter, please cite the AfriScience-MT project:
```bibtex
@inproceedings{afriscience-mt-2025,
  title={AfriScience-MT: Machine Translation for African Scientific Literature},
  author={AfriScience-MT Team},
  year={2025},
  url={https://github.com/afriscience-mt/afriscience-mt}
}
```
## License
This adapter is released under the Apache 2.0 License.
## Acknowledgments
- Base model: google/gemma-3-4b-it
- LoRA implementation: PEFT
- Evaluation: SSA-COMET for African language assessment