Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding

11/07/2022
by Donghyun Kim, et al.

We present the Web-based Visual Corpus Builder (Webvicob), a dataset generator engine that can readily construct a large-scale visual corpus (i.e., images with text annotations) from a raw Wikipedia HTML dump. In this report, we validate that Webvicob-generated data covers a wide range of context and knowledge and helps practitioners build a powerful Visual Document Understanding (VDU) backbone. The proposed engine is publicly available at https://github.com/clovaai/webvicob.
