Poor performance with simple table extraction task

#43

by hanshupe - opened Mar 30, 2025

Mar 30, 2025

There is a lot of hype around multimodal models, such as GOT.
I would like to know if others have had a similar experience in practice: while these models can do impressive things, they still struggle with table extraction, even in cases that are straightforward for humans.

Attached is a simple example. All I need is a reconstruction of the table as a flat CSV, preserving all empty cells correctly. Which open-source model is able to do that?

Cyber-BlackCat

Apr 30, 2025

https://huggingface.co/spaces/yonigozlan/GOT-OCR-Transformers
Just use this demo, which satisfies your requirement.
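
Whichever model is used, its output usually still has to be turned into the flat CSV asked for above. Here is a minimal sketch of that step, assuming the model returns the table as a Markdown-style pipe table (that output format is an assumption, not something confirmed for GOT or the demo, and `pipe_table_to_csv` is a hypothetical helper name). It keeps empty cells as empty CSV fields and does not handle escaped pipes or merged cells.

```python
import csv
import io


def pipe_table_to_csv(table_text: str) -> str:
    """Convert a Markdown-style pipe table to CSV, keeping empty cells.

    Hypothetical helper: assumes every table row starts with '|'.
    """
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        # Remove exactly one trailing pipe (if present) and the leading pipe,
        # so that empty cells at either edge of the row are not lost.
        if line.endswith("|"):
            line = line[:-1]
        cells = [cell.strip() for cell in line[1:].split("|")]
        # Skip the header separator row, e.g. |---|:---:|---|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = (
        "| Name | Q1 | Q2 |\n"
        "|---|---|---|\n"
        "| A | 1 | |\n"
        "| B | | 3 |\n"
    )
    print(pipe_table_to_csv(sample))
```

Comparing the number of fields per row in the result against the original image is a quick way to check whether the model dropped or shifted any empty cells.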
