HMM Off-line Arabic Handwriting Recognition Using the Components of the Frame Binary Intensity, its Grayscale Transform and its Horizontal and Vertical Derivatives

H. El Moubtahij, A. Halli, K. Satori


Accurate recognition of handwritten Arabic text remains an open problem. An efficient recognition system must cope with several complications, for instance similarities between distinct character shapes, ligatures and, most importantly, the limitless variation in human handwriting. The objective is to offer an analytical methodology for offline recognition of handwritten Arabic that lends itself to fast implementation.
In this paper, we present an effective approach to the recognition of off-line Arabic handwritten text based on statistical features. The first stage of the recognition system decomposes the input image into text line images, pre-processes them, and then extracts a set of simple statistical features through a window sliding along the text line. The resulting feature vectors are fed into the Hidden Markov Model Toolkit (HTK). In the recognition stage, the concatenation of characters into words is modelled by simple lexical models. Each word is modelled by a stochastic finite-state automaton (SFSA), and the concatenation of words into phrases is modelled by an n-gram language model. The proposed system is applied to the “Arabic-Numbers” data corpus, which contains 1905 phrases and 47 words. These phrases were written by 5 different people.
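As an illustration of the sliding-window step, the following is a minimal sketch and not the implementation used in this work. It assumes a binarised text-line image given as a list of rows with ink pixels equal to 1, and it computes three illustrative per-frame values: ink density and the mean absolute horizontal and vertical pixel differences. The function name, the window width, the step and the choice of features are all hypothetical.

```python
def sliding_window_features(line_img, width=4, step=2):
    """Return one (ink, dx, dy) tuple per window position along a text line.

    line_img: list of rows, each a list of pixel values (ink = 1, paper = 0).
    width, step: window width and shift in pixels (illustrative defaults).
    """
    height = len(line_img)
    length = len(line_img[0])
    feats = []
    for x in range(0, length - width + 1, step):
        ink = dx = dy = 0.0
        for r in range(height):
            for c in range(x, x + width):
                ink += line_img[r][c]
                # Forward difference to the right neighbour (horizontal derivative).
                if c + 1 < length:
                    dx += abs(line_img[r][c + 1] - line_img[r][c])
                # Forward difference to the pixel below (vertical derivative).
                if r + 1 < height:
                    dy += abs(line_img[r + 1][c] - line_img[r][c])
        n = height * width  # number of pixels in one frame
        feats.append((ink / n, dx / n, dy / n))
    return feats
```

Each tuple plays the role of one observation vector. In a pipeline such as the one described above, the sequence of vectors for a text line would then be written out in the format expected by the HMM toolkit.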


Handwritten Arabic text, Hidden Markov Model Toolkit (HTK), stochastic finite-state automaton.

