
Segmentation and Recognition of Text Images Acquired by a Mobile Phone

H. E. Bahi, A. Zatni

Abstract



Segmentation and recognition of text in document images are two important steps in a document image understanding system. Several systems have been proposed for these steps, but less attention has been given to images captured by a mobile terminal. To address this limitation, we present a new printed-text recognition system for document images acquired with a smartphone. First, we apply a pre-processing step to extract and enhance the text region. We then propose a new text-line segmentation algorithm, based on connected component (CC) analysis, to segment the text into individual lines. Finally, a bidirectional recurrent neural network (BRNN) with Gated Recurrent Units (GRU) is trained to recognize the text-line images. We evaluated the proposed system on the ICDAR2015 Smartphone document OCR dataset. Experimental results demonstrate that the BRNN-GRU model performs better, with higher computational speed, than the Long Short-Term Memory (LSTM) model that is often used in text recognition systems.
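As a rough illustration of the connected-component idea mentioned in the abstract, the following Python sketch labels foreground blobs in a small binary image and groups them into lines by vertical overlap. It is not the authors' algorithm, whose details are not given here. The function names `connected_components` and `group_into_lines`, the choice of 8-connectivity, and the overlap rule are all assumptions made for illustration.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected blobs.

    `img` is a list of rows containing 0 (background) or 1 (foreground).
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_into_lines(boxes):
    """Group boxes whose vertical extents overlap (illustrative rule only).

    Returns a list of lines, top to bottom; each line lists its boxes
    from left to right.
    """
    lines = []  # each entry: [line_top, line_bottom, member_boxes]
    for box in sorted(boxes):  # sorted by top edge first
        top, _, bottom, _ = box
        if lines and top <= lines[-1][1]:
            # Box starts before the current line ends: same line.
            lines[-1][1] = max(lines[-1][1], bottom)
            lines[-1][2].append(box)
        else:
            lines.append([top, bottom, [box]])
    return [sorted(members, key=lambda b: b[1]) for _, _, members in lines]


# Toy binary "page": two blobs on rows 0-1, two blobs on rows 3-4.
page = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],
]
lines = group_into_lines(connected_components(page))
```

On this toy input, `lines` holds two text lines with two components each. Real smartphone captures would also need the pre-processing step described above, and handling for skew, touching lines, and diacritics, none of which this sketch attempts.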

Keywords


Segmentation, Recognition, Pre-processing, Smartphone, Recurrent neural network


