# 🇺🇦 Ukrainian OCR / ICR (HTR-ConvText)

Handwritten & printed text recognition for Ukrainian

Live Demo

Upload an image → Get recognized text

## ✨ Highlights

| Feature | Description |
|---|---|
| Language | Ukrainian (handwritten + printed) |
| Architecture | HTR-ConvText (ResNet-18 + MobileViT), CTC decoding |
| Input | 64×3072 px, grayscale line images |
| Training | 1.7M samples, SAM, EMA, scan simulation |
| Formats | PyTorch, ONNX, Hugging Face AutoModel |
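
The input size above can be read as height 64 and width 3072. As a rough sketch only, assuming the processor resizes a line to the target height while preserving aspect ratio and then pads on the right (this README does not confirm that behavior), the geometry would work out as follows. `fit_to_canvas` is a hypothetical helper, not part of the released code.

```python
def fit_to_canvas(width, height, target_h=64, target_w=3072):
    """Return (resized_width, right_padding) for an aspect-preserving resize.

    Assumption: lines wider than the canvas after resizing are squeezed
    to target_w; the real processor may crop or handle them differently.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    resized_w = max(1, round(width * target_h / height))
    resized_w = min(resized_w, target_w)
    return resized_w, target_w - resized_w


# A 1000x32 px line is scaled 2x to 2000x64, leaving 1072 px of padding.
print(fit_to_canvas(1000, 32))  # (2000, 1072)
```

In practice `AutoProcessor` handles this step, so the helper is only useful for reasoning about how much of the canvas a given scan occupies.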

## 🚀 Quickstart

from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
model = AutoModel.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
inputs = processor(images="sample.png", return_tensors="pt")
logits = model(**inputs).logits
text = processor.batch_decode(logits)[0]
print(text)

💡 Try it now: Open the Gradio demo — no code required!

## 📖 Model Description

This repository packages a Ukrainian OCR/ICR model for handwritten and partially printed text with a Hugging Face–native API (AutoModel + AutoProcessor).

### Architecture

  • Backbone: ResNet-18 + MobileViT (MVP), hierarchical ConvText encoder (U-Net-like down/upsampling)
  • Decoding: CTC greedy
  • Vocabulary: 151 characters (Ukrainian + symbols)
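
Greedy CTC decoding takes the most likely class per frame, collapses consecutive repeats, and then drops the blank symbol. The sketch below illustrates that rule on a toy vocabulary. The blank index of 0 and the three-character vocabulary are assumptions made for the example; the released model uses its own 151-character alphabet, and `processor.batch_decode` performs the real decoding.

```python
from itertools import groupby


def argmax_per_frame(logits):
    """Pick the highest-scoring class index for each time step."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def ctc_greedy_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse repeated ids, then remove blanks (assumed to be id 0)."""
    collapsed = [key for key, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in collapsed if i != blank_id)


# Hypothetical toy vocabulary, not the model's real alphabet.
vocab = {1: "т", 2: "а", 3: "к"}
frames = [1, 1, 0, 2, 2, 0, 0, 3, 3]
print(ctc_greedy_decode(frames, vocab))  # так
```

A blank between two identical ids keeps both characters, which is how CTC represents doubled letters.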

### Training Data

| Source | Samples |
|---|---|
| ukrainian-handwriting-synth | Synthetic handwritten lines |
| Ukrainian Handwritten Text | ~37k segmented lines |
| Total | 1,696,499 (Train 90% / Val 5% / Test 5%) |

### Training

  • 500k iterations, batch size 16 with 4 gradient-accumulation steps
  • SAM (Sharpness-Aware Minimization) optimizer, EMA of weights (decay 0.9999), TCM warmup for 40k iterations
  • Scan simulation & detector-error augmentations
  • Hardware: NVIDIA B200 (180GB VRAM)
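
To make two of the settings above concrete: gradient accumulation multiplies the number of samples seen per optimizer step, and an EMA with decay 0.9999 averages weights over several thousand steps. The snippet is a minimal illustration on plain Python lists, assuming a single device and the standard EMA update rule; it is not the training code from this repository.

```python
import math


def ema_update(ema_weights, weights, decay=0.9999):
    """Standard EMA rule: ema = decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


def ema_half_life(decay):
    """Number of updates after which an old value's weight is halved."""
    return math.log(0.5) / math.log(decay)


batch_size, accum_steps = 16, 4
samples_per_step = batch_size * accum_steps  # assumes one device

print(samples_per_step)                # 64
print(round(ema_half_life(0.9999)))    # 6931
```

A half-life of roughly 6.9k updates is short compared with 500k iterations, so the averaged weights mostly reflect the late stages of training.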

## 🖼️ Recognition Examples

| Example | Image | GT | Prediction | CER | WER |
|---|---|---|---|---|---|
| 1 | example_1 | Департаменту патрульної поліції | Департаменту нагрульної поліції | 0.065 | 0.33 |
| 2 | example_2 | за порушення правил дорожнього руху | за порушення правил дорожнього Дуку | 0.057 | 0.20 |

Real-world inference on scanned Ukrainian documents. GT = ground truth.
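
The CER and WER values for these examples can be reproduced with a plain Levenshtein distance over characters and over whitespace-separated words. The sketch below assumes no normalization beyond Unicode NFC and a non-empty reference; the repository's own scoring may normalize text differently.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits / reference length."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate: word edits / number of reference words."""
    ref_words = unicodedata.normalize("NFC", ref).split()
    hyp_words = unicodedata.normalize("NFC", hyp).split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)


gt = "Департаменту патрульної поліції"
pred = "Департаменту нагрульної поліції"
print(round(cer(gt, pred), 3), round(wer(gt, pred), 2))  # 0.065 0.33
```

In the first example two substituted characters out of 31 give the CER, and one wrong word out of three gives the WER.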

## 🛠️ Tools & Scripts

| File | Purpose |
|---|---|
| prepare_hf_artifacts.py | Convert .pth checkpoint → HF artifacts |
| export_onnx.py | Export to ONNX |
| validate_parity.py | OpenCV vs PIL, PyTorch vs ONNX parity checks |
| predict.py | Single-image CLI inference |
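
A parity check between two backends generally comes down to comparing their outputs element-wise against a tolerance. The sketch below shows that idea on flat lists of floats. It is an assumption about the general approach, not the logic of `validate_parity.py`; the helper names and the `1e-4` tolerance are hypothetical.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError("outputs have different lengths")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a, b, atol=1e-4):
    """True when every element differs by at most atol (hypothetical tolerance)."""
    return max_abs_diff(a, b) <= atol


# Stand-ins for logits from two backends, e.g. PyTorch and ONNX.
reference = [0.10, 0.20, 0.70]
candidate = [0.10, 0.20001, 0.70]
print(outputs_match(reference, candidate))  # True
```

Comparing the decoded text from both backends is a useful complement, since small numeric differences only matter when they change the argmax.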

### Conversion

python prepare_hf_artifacts.py \
  --checkpoint-path /path/to/best_CER.pth \
  --alphabet-path /path/to/alphabet.json \
  --output-dir ./release

### ONNX Export

python export_onnx.py --hf-model-dir ./release --output-dir ./onnx

## 📊 Evaluation

| Split | CER | WER | Notes |
|---|---|---|---|
| real-world (124) | 0.176 | 0.440 | Scanned docs, handwritten + printed |

Scores are micro-averaged over the split, and text is normalized with `format_string_for_wer`.
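
Micro-averaging pools errors and reference lengths across all samples before dividing, so long lines weigh more than short ones. The sketch below contrasts it with a per-sample (macro) mean, using made-up `(edit_distance, reference_length)` pairs that are purely illustrative and not taken from the evaluation set.

```python
def micro_average(pairs):
    """Total errors divided by total reference length."""
    total_errors = sum(errors for errors, _ in pairs)
    total_length = sum(length for _, length in pairs)
    return total_errors / total_length


def macro_average(pairs):
    """Mean of per-sample error rates, for comparison."""
    return sum(errors / length for errors, length in pairs) / len(pairs)


# Hypothetical samples: one very short line and one long line.
pairs = [(1, 2), (1, 100)]
print(round(micro_average(pairs), 4))  # 0.0196
print(round(macro_average(pairs), 3))  # 0.255
```

The short line dominates the macro mean but barely moves the micro average, which is why the two figures should not be compared directly.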

## ⚠️ Limitations

  • Sensitive to severe blur, low contrast, non-standard page artifacts
  • Performance may drop on long lines far from training distribution
  • CTC decoding can fail on highly ambiguous character boundaries

## 🙏 Attribution & Citation

This implementation adapts ideas from DAIR-Group/HTR-ConvText. See NOTICE and CITATION.cff for details. Upstream (HTR-ConvText):

@misc{truc2025htrconvtext,
  title={HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition},
  author={Pham Thach Thanh Truc and Dang Hoai Nam and Huynh Tong Dang Khoa and Vo Nguyen Le Duy},
  year={2025},
  eprint={2512.05021},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.05021},
}

This model: See CITATION.cff for full attribution.

## 📄 License

Apache-2.0. See LICENSE.

⭐ Star this repo if you find it useful! · Report issues · Contributions welcome