Paper: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition (arXiv:2512.05021)
Handwritten & printed text recognition for Ukrainian
Upload an image → Get recognized text
| Feature | Description |
|---|---|
| Language | Ukrainian (handwritten + printed) |
| Architecture | HTR-ConvText (ResNet-18 + MobileViT), CTC decoding |
| Input | 64×3072 px, grayscale line images |
| Training | 1.7M samples, SAM, EMA, scan simulation |
| Formats | PyTorch, ONNX, Hugging Face AutoModel |
```python
from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)
model = AutoModel.from_pretrained("Valerii02/ukr-htr-convtext", trust_remote_code=True)

inputs = processor(images="sample.png", return_tensors="pt")
logits = model(**inputs).logits
text = processor.batch_decode(logits)[0]
print(text)
```
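The quick-start snippet ends with `processor.batch_decode(logits)`, and the feature table lists CTC decoding. As a rough illustration of what greedy CTC decoding does, here is a minimal sketch in plain Python. It is not this repository's implementation: the function name `ctc_greedy_decode`, the toy alphabet, and the choice of index 0 as the blank symbol are all assumptions made for the example.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: Sequence[str],
    blank_id: int = 0,  # assumption: blank sits at index 0
) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring class for every time step.
    best_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]

    chars: List[str] = []
    previous = None
    for class_id in best_ids:
        # Emit a character only when the class changes and is not the blank.
        if class_id != previous and class_id != blank_id:
            chars.append(alphabet[class_id])
        previous = class_id
    return "".join(chars)


def one_hot(index: int, size: int) -> List[float]:
    """Helper for the toy example: a frame that fully favours one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical 4-symbol alphabet; the real one comes from the model's alphabet file.
TOY_ALPHABET = ["<blank>", "т", "а", "к"]
toy_frames = [one_hot(i, len(TOY_ALPHABET)) for i in [1, 1, 0, 2, 2, 0, 3]]
print(ctc_greedy_decode(toy_frames, TOY_ALPHABET))
```

A blank between two identical class ids keeps both characters, which is how CTC represents doubled letters.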
💡 Try it now: Open the Gradio demo — no code required!
This repository packages a Ukrainian OCR/ICR model for handwritten and partially printed text with a Hugging Face–native API (AutoModel + AutoProcessor).
| Source | Samples |
|---|---|
| ukrainian-handwriting-synth | Synthetic handwritten lines |
| Ukrainian Handwritten Text | ~37k segmented lines |
| Total | 1,696,499 (Train 90% / Val 5% / Test 5%) |
| Example | Image | GT | Prediction | CER | WER |
|---|---|---|---|---|---|
| 1 | ![]() | Департаменту патрульної поліції | Департаменту нагрульної поліції | 0.065 | 0.33 |
| 2 | ![]() | за порушення правил дорожнього руху | за порушення правил дорожнього Дуку | 0.057 | 0.20 |
Real-world inference on scanned Ukrainian documents. GT = ground truth.
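The CER and WER columns can be reproduced with a plain Levenshtein distance, shown here as a minimal sketch. It assumes CER is character edits divided by reference length and WER is word edits divided by reference word count, with no extra normalization; the repository's own metric code may differ. The function names are chosen for this example.

```python
from typing import Sequence


def edit_distance(reference: Sequence, hypothesis: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_item == hyp_item else 1
            current_row.append(
                min(
                    previous_row[j] + 1,  # deletion
                    current_row[j - 1] + 1,  # insertion
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row
    return previous_row[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits per reference character."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits per reference word (whitespace split)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Example 1 from the table above.
ground_truth = "Департаменту патрульної поліції"
prediction = "Департаменту нагрульної поліції"
print(f"CER={cer(ground_truth, prediction):.3f} WER={wer(ground_truth, prediction):.2f}")
```

For example 1, two character substitutions in one of three words give the CER and WER listed in the table after rounding.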
| File | Purpose |
|---|---|
| `prepare_hf_artifacts.py` | Convert .pth checkpoint → HF artifacts |
| `export_onnx.py` | Export to ONNX |
| `validate_parity.py` | OpenCV vs PIL, PyTorch vs ONNX parity checks |
| `predict.py` | Single-image CLI inference |
```bash
python prepare_hf_artifacts.py \
  --checkpoint-path /path/to/best_CER.pth \
  --alphabet-path /path/to/alphabet.json \
  --output-dir ./release

python export_onnx.py --hf-model-dir ./release --output-dir ./onnx
```
| Split | CER | WER | Notes |
|---|---|---|---|
| real-world (124) | 0.176 | 0.440 | Scanned docs, handwritten + printed |
CER and WER are micro-averaged over the split; text is normalized with `format_string_for_wer`.
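Micro-averaging pools edit counts and reference lengths across all samples before dividing, so long lines weigh more than short ones. The sketch below contrasts it with a per-sample (macro) mean. The `(errors, reference_length)` pairs are illustrative values, not results from this model, and the function names are chosen for this example.

```python
from typing import List, Tuple

# Each entry is (edit_count, reference_length) for one sample.
ErrorCounts = List[Tuple[int, int]]


def micro_average(samples: ErrorCounts) -> float:
    """Total edits divided by total reference length across the split."""
    total_errors = sum(errors for errors, _ in samples)
    total_length = sum(length for _, length in samples)
    return total_errors / total_length


def macro_average(samples: ErrorCounts) -> float:
    """Mean of per-sample error rates; every sample counts equally."""
    return sum(errors / length for errors, length in samples) / len(samples)


# Illustrative counts only: two long, mostly correct lines and one short, noisy line.
illustrative_samples = [(2, 31), (2, 35), (5, 10)]
print(f"micro={micro_average(illustrative_samples):.3f}")
print(f"macro={macro_average(illustrative_samples):.3f}")
```

With these counts the short noisy line pulls the macro mean well above the micro-averaged rate, which is why the averaging mode matters when comparing reported numbers.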
This implementation adapts ideas from DAIR-Group/HTR-ConvText. See NOTICE and CITATION.cff for details.
Upstream (HTR-ConvText):
```bibtex
@misc{truc2025htrconvtext,
  title={HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition},
  author={Pham Thach Thanh Truc and Dang Hoai Nam and Huynh Tong Dang Khoa and Vo Nguyen Le Duy},
  year={2025},
  eprint={2512.05021},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.05021},
}
```
This model: See CITATION.cff for full attribution.
Apache-2.0. See LICENSE.
Base model: DAIR-Group/HTR-ConvText