LexiconNet: An End-to-End Handwritten Paragraph Text Recognition System

05/23/2022
by   Lalita Kumari, et al.
0

Historical documents present in the form of libraries needs to be digitised. The recognition of these unconstrained cursive handwritten documents is a challenging task. In the present work, neural network based classifier is used. The recognition of scanned document images which are easy to train on neural network based systems is usually done by a two step approach: segmentation followed by recognition. This approach has several shortcomings, which includes identification of text regions, layout diversity analysis present within pages and ground truth segmentation. These processes are prone to errors that create bottleneck in the recognition accuracies. Thus in this study, an end-to-end paragraph recognition system is presented with internal line segmentation and lexicon decoder as post processing step, which is free from those errors. We named our model as LexiconNet. In LexiconNet, given a paragraph image a combination of convolution and depth-wise separable convolutional modules generates the two dimension feature map of the image. The attention module is responsible for internal line segmentation that consequently processing a page in a line by line manner. At decoding step, we have added connectionist temporal classification based word beam search decoder as a post processing step. Our approach reports state-of-the-art results on standard datasets. The reported character error rate is 3.24 1.13 improvement from existing literature and the word error rate is 8.29 dataset with 43.02 and 7.35 results. The character error rate and word error rate reported in this work surpasses the results reported in literature.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset