JuaKazi Swahili Gender Bias Classifier v3
Fine-tuned afro-xlmr-base for binary gender bias detection in Swahili text.
Part of the JuaKazi Gender Sensitization Engine.
Validation Metrics (v3)
| Metric | Value |
|---|---|
| BIAS Precision | 0.898 |
| BIAS Recall | 0.910 |
| BIAS F1 | 0.904 |
| Decision threshold | 0.50 |
| Train size | 45,628 |
| Val size | 8,052 |
| Ground truth | ground_truth_sw_v5.csv (64,723 rows) |
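
The decision threshold in the table can be applied explicitly to classifier output. The following is a minimal sketch, not the project's own code: it assumes predictions arrive as `{"label", "score"}` dictionaries, that the only two labels are `BIAS` and `NEUTRAL`, and that their scores sum to 1. The helper names `bias_probability` and `is_biased` are hypothetical.

```python
from typing import Dict, List, Union

BIAS_THRESHOLD = 0.50  # decision threshold reported in the metrics table

Prediction = Dict[str, Union[str, float]]


def bias_probability(prediction: Prediction) -> float:
    """Convert a top-label prediction into P(BIAS) for a two-class model."""
    score = float(prediction["score"])
    # Assumption: labels are "BIAS" / "NEUTRAL" and their scores sum to 1.
    return score if prediction["label"] == "BIAS" else 1.0 - score


def is_biased(prediction: Prediction, threshold: float = BIAS_THRESHOLD) -> bool:
    """Flag a text as biased when P(BIAS) meets or exceeds the threshold."""
    return bias_probability(prediction) >= threshold


# Scores taken from the usage examples in this card.
examples: List[Prediction] = [
    {"label": "BIAS", "score": 0.97},
    {"label": "NEUTRAL", "score": 0.83},
]
flags = [is_biased(p) for p in examples]
print(flags)  # [True, False]
```

Raising `threshold` above 0.50 trades recall for precision. The metrics in this card were reported at 0.50 only.
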
Why v3? Retraining Rationale
Independent evaluation (Mar 2026) on a 9,709-sample labeled test set showed v2 had a critical precision problem:
| Metric | v2 | v3 |
|---|---|---|
| BIAS Precision | 0.330 | 0.898 |
| BIAS Recall | 0.976 | 0.910 |
| BIAS F1 | 0.493 | 0.904 |
| False Positives | 333 | 46 |
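
The F1 rows follow from the precision and recall rows, since F1 is their harmonic mean. As a quick consistency check, this short sketch recomputes F1 from the reported values (the function name is illustrative):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the comparison table.
reported = {"v2": (0.330, 0.976), "v3": (0.898, 0.910)}

f1 = {version: round(f1_score(p, r), 3) for version, (p, r) in reported.items()}
print(f1)  # {'v2': 0.493, 'v3': 0.904}
```

Because the harmonic mean is pulled toward the smaller value, v2's high recall (0.976) could not compensate for its low precision (0.330).
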
Root cause (v2): pos_weight ≈ 58x combined with only 4K neutral training rows, so the model flagged gendered terms (wa kike, mama, wanawake) as bias regardless of context.
v3 fixes:
- `pos_weight = 10` (hard cap, down from ~58x)
- `neutral_ratio = 40` (40K neutral rows in training, was 4K)
- Counter-stereotype rows labelled NEUTRAL during training
- `annotation_error` rows excluded from training
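
To illustrate why capping `pos_weight` matters, here is a rough sketch of a weighted binary cross-entropy in plain Python. It assumes `pos_weight` scales the loss on positive (BIAS) rows, as in a standard weighted BCE. The card does not state which loss was used or how the weight was derived, so `capped_pos_weight` and `weighted_bce` are hypothetical helpers, not the training code.

```python
import math

POS_WEIGHT_CAP = 10.0  # hard cap listed in the v3 fixes


def capped_pos_weight(n_negative: int, n_positive: int,
                      cap: float = POS_WEIGHT_CAP) -> float:
    """Hypothetical helper: class-ratio weight, clipped to `cap`."""
    return min(n_negative / n_positive, cap)


def weighted_bce(prob_bias: float, label: int, pos_weight: float) -> float:
    """Binary cross-entropy with the positive (BIAS) term scaled by pos_weight."""
    eps = 1e-12
    p = min(max(prob_bias, eps), 1.0 - eps)  # avoid log(0)
    if label == 1:
        return -pos_weight * math.log(p)
    return -math.log(1.0 - p)


# A missed BIAS example (model says P(BIAS)=0.2) under the old and new weights.
loss_v2_style = weighted_bce(0.2, label=1, pos_weight=58.0)
loss_v3_style = weighted_bce(0.2, label=1, pos_weight=10.0)
ratio = loss_v2_style / loss_v3_style  # how much more the old weight penalised a miss

# A false alarm on a NEUTRAL row costs the same regardless of pos_weight.
false_alarm_loss = weighted_bce(0.8, label=0, pos_weight=58.0)
```

Under this assumption, a very large weight makes a missed BIAS row far more costly than a false alarm on a NEUTRAL row. That pushes the model toward high recall and low precision, which matches the v2 pattern described above.
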
## Usage
```python
from transformers import pipeline

# Load the classifier; inputs longer than 128 tokens are truncated.
pipe = pipeline(
    "text-classification",
    model="juakazike/sw-bias-classifier-v3",
    truncation=True,
    max_length=128,
)

pipe("Wanawake hawafai kuwa viongozi wa nchi")
# [{"label": "BIAS", "score": 0.97}]

pipe("Daktari wa kike alitibu wagonjwa hospitalini")
# [{"label": "NEUTRAL", "score": 0.83}]
```

Recommended decision threshold: 0.50
Limitations
- Trained on Kenyan and Tanzanian Swahili; Sheng and Ugandan Swahili coverage is limited
- Implicit and proverbial bias patterns may still be missed
- For production use, combine with the JuaKazi rules-based detection layer
- Cohen's Kappa inter-annotator agreement: measurement in progress
Citation
JuaKazi Gender Sensitization Engine, AI BRIDGE / AfriLabs, March 2026.
Base model: Davlan/afro-xlmr-base