Arabic handwriting recognition using machine learning approaches
Ball, Gregory Raymond
While handwriting recognition for languages written in Latin script has received considerable attention, far less work has been done on the Arabic script. Arabic poses some unique challenges, such as a larger character set, the presence of dots and diacritics, and intra-word whitespace regions. Machine learning approaches have the potential to significantly improve state-of-the-art Arabic handwriting recognition results. This dissertation presents several such machine learning techniques, including writer adaptation and segmentation-free unconstrained text processing. We integrate these techniques into novel algorithms for general recognition, word spotting, and transcript mapping.

Writer adaptation, or specialization, is the adjustment of handwriting recognition algorithms to a specific writer's style of handwriting. Such adjustment yields significantly improved recognition rates over a generalized, writer-independent counterpart. Specialization is commonly used in online Latin script handwriting applications, such as those for tablet computers or PDAs. Some rudimentary offline Latin script adaptation methods have recently been proposed in the literature as well. Handwriting adaptation for the Arabic script, however, is unexplored. An iterative bootstrapping model is presented which adapts a writer-independent model to a writer-dependent model using a small number of words, achieving a large increase in recognition rate in the process. Furthermore, a confidence weighting method is described which generates better results by weighting words based on their length. Script features unique to Arabic are discussed, as well as how they are incorporated into the adaptation process. Even though Arabic has many more character classes than languages such as English, significant improvement is observed.

One issue common to Arabic recognition tasks is generating candidate word regions on a page. Attempting to definitively segment the document into such regions (automatic segmentation) can meet with some success, but the accuracy of such an algorithm is often a limiting factor in spotting performance. Another approach is to scan the page image directly without attempting to generate such a definite segmentation. Such segmentation-free approaches result in better recognition at a performance cost.

The algorithms discussed are tested using a database of truthed, page-length, handwritten Arabic documents. Where applicable, the literature-standard IFN/ENIT database is used for testing as well. We validate our approaches by exploring their implications for such tasks as word spotting (finding a query word or image, and its location, in a set of documents), transcript mapping (the automatic alignment of a handwritten document with its machine-readable transcript), and general unconstrained recognition. Novel algorithms for these tasks are also presented.

Specifically, contributions in this dissertation include novel descriptions of machine learning algorithms applied to Arabic handwriting recognition problems and quantification of the improvement generated by their usage. Examples of such contributions include writer adaptation, versatile search, and the advantages and trade-offs of processing such tasks as word spotting in a segmentation-free rather than a segmentation-based manner.
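The iterative bootstrapping and length-based confidence weighting mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: every name here (`length_weight`, `recognize`, `retrain`, `adapt`), the one-dimensional "features", the nearest-prototype recognizer, and the threshold and saturation values are assumptions chosen only to make the idea concrete.

```python
def length_weight(word, saturation=6):
    """Hypothetical weighting: longer words earn more trust, capped at 1.0."""
    return min(len(word), saturation) / saturation


def recognize(model, image):
    """Toy recognizer: pick the lexicon word whose prototype is nearest.

    `model` maps each word to a single float prototype and `image` is a
    single float feature; both are stand-ins for real feature vectors.
    Returns (label, score) with score in (0, 1].
    """
    label = min(model, key=lambda word: abs(model[word] - image))
    return label, 1.0 / (1.0 + abs(model[label] - image))


def retrain(model, accepted, rate=0.5):
    """Move each prototype toward the weighted mean of its accepted samples."""
    updated = dict(model)
    for label in model:
        samples = [(x, w) for x, lab, w in accepted if lab == label]
        if samples:
            total = sum(w for _, w in samples)
            mean = sum(x * w for x, w in samples) / total
            updated[label] = model[label] + rate * (mean - model[label])
    return updated


def adapt(model, word_images, rounds=3, threshold=0.3):
    """Bootstrap a writer-independent model toward one writer's samples.

    Each round labels the writer's unlabeled words with the current model,
    keeps only results whose length-weighted confidence clears the
    threshold, and retrains on those pseudo-labeled samples.
    """
    for _ in range(rounds):
        accepted = []
        for image in word_images:
            label, score = recognize(model, image)
            weight = score * length_weight(label)
            if weight >= threshold:
                accepted.append((image, label, weight))
        if not accepted:
            break  # nothing trustworthy enough to learn from
        model = retrain(model, accepted)
    return model


# Transliterated words stand in for Arabic lexicon entries.
generic = {"madrasa": 0.0, "fi": 10.0}
writer_samples = [1.0, 1.2, 0.8, 9.0]
adapted = adapt(generic, writer_samples)
```

In this toy run the long word's prototype drifts toward the writer's samples, while the short word never clears the threshold because its length weight is low, so its prototype is left unchanged. A real system would replace the float features and prototype model with an actual recognizer and its own confidence measure.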