Offline Arabic Handwritten Text Recognition System Using the HMM Toolkit (HTK) and a Stochastic Finite-State Automaton (SFSA)

H. El Moubtahij, A. Halli, K. Satori


The main goal of handwriting recognition systems is to convert an image of handwritten or printed text into digitally encoded text that a computer can interpret. To this end, researchers have developed techniques that give computers the ability to read text. In this paper, we present an Arabic handwritten text recognition system that takes a text line image as input.
Our system consists of several steps. The first is preprocessing, which aims to improve the quality of the line image. Next, we extract a set of local density and statistical features by sliding a window along the text line. Then, in the recognition step, we rely on Hidden Markov Models, using the HTK tools for both training and recognition. In this step, we use concatenation to form words from simple lexical models; each word is modelled as a stochastic finite-state automaton (SFSA).
Finally, we applied this approach to two data corpora, “Arabic-Numbers” and IFN/ENIT, to improve and evaluate the performance of our system.
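As an illustration of the sliding-window feature extraction mentioned above, the following is a minimal sketch only, not the feature set actually used in this work. It assumes a binarized text line image (1 for ink, 0 for background) and computes, for each window position, the ink density in a few horizontal bands. The function name `sliding_window_densities` and the parameters `width`, `step`, and `cells` are hypothetical choices made for this example.

```python
from typing import List


def sliding_window_densities(
    image: List[List[int]],
    width: int = 4,   # assumed window width in pixels
    step: int = 2,    # assumed horizontal shift between windows
    cells: int = 3,   # assumed number of horizontal bands per window
) -> List[List[float]]:
    """Return one density feature vector per window position.

    `image` is a binary text line image given as rows of 0 (background)
    and 1 (ink). Each window is split into `cells` horizontal bands, and
    the feature for a band is the fraction of ink pixels inside it.
    """
    height = len(image)
    length = len(image[0]) if height else 0
    # Row boundaries of the bands, using integer division.
    bounds = [i * height // cells for i in range(cells + 1)]

    frames = []
    for left in range(0, length - width + 1, step):
        frame = []
        for top, bottom in zip(bounds, bounds[1:]):
            ink = sum(
                image[r][c]
                for r in range(top, bottom)
                for c in range(left, left + width)
            )
            area = (bottom - top) * width
            frame.append(ink / area if area else 0.0)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    line = [
        [1, 1, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
    ]
    for vector in sliding_window_densities(line, width=2, step=2, cells=2):
        print(vector)
```

Each returned vector corresponds to one window position, so the sequence of vectors can serve as the observation sequence of an HMM. Overlapping windows (a step smaller than the width) are a common choice, but the values used here are placeholders.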


Hidden Markov Model Toolkit (HTK), feature extraction, stochastic finite-state automaton (SFSA).

