Accents in Handwriting: A Hierarchical Bayesian Approach to Handwriting Analysis
The individuality of handwriting has been studied extensively in the handwriting analysis domain. An individual's handwriting is believed to be influenced by both genetic and cultural factors. Genetic factors include pen grip style, pen pressure, kinesthesia, and motor skills, whereas cultural factors include learning through imitation and multilingualism. Traditional approaches to handwriting analysis generally do not attempt to model or quantify these factors. They operate on the assumption that each individual's handwriting is unique, with no components shared among individuals. In this dissertation, we first provide evidence for the existence of shared influences in handwriting. We postulate that a handwriting sample can be represented as a distribution over a finite set of handwriting styles. We introduce the concept of accent in handwriting, defined as the influence that a person's native script has when that person learns to write in a different script. We then exploit this concept to improve on state-of-the-art results in several handwriting analysis problems. We present three distinct hierarchical Bayesian models to analyze and quantify the influences in handwriting. We demonstrate that handwriting samples are best represented as a mixture of cultural and genetic influences. In our models, each handwritten sample is first represented as a bag of features. This feature representation is modeled as a distribution over a finite set of handwriting styles, each of which is in turn represented as a distribution over features. Classification in the style space is then performed to identify the accent. In addition, we propose a generic hierarchical framework for handwriting analysis problems. The first step of the framework is accent identification, after which an accent-specific model is learned for the problem.
We have validated our approach on two data sets: (i) an in-house data set collected exclusively for the accents-in-handwriting task, and (ii) the UNIPEN data set, which has the necessary annotations for our purpose. We demonstrate the performance of the proposed hierarchical approach by comparing it with state-of-the-art approaches on various handwriting analysis problems. In particular, we show improved performance in both writer identification and handwriting recognition tasks. Finally, we present a novel handwritten CAPTCHA generation technique in which the idea of accents in handwriting enhances the robustness of the CAPTCHA generation process.
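As an illustration only, the two-step framework described above (accent identification in style space, followed by an accent-specific model) might be sketched as follows. This is not the dissertation's hierarchical Bayesian models: it substitutes a simple EM estimate of mixing weights over fixed styles and a nearest-centroid accent classifier. The feature names, `STYLES`, `ACCENT_CENTROIDS`, and all function names are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical toy styles: each style is a distribution over features.
STYLES = {
    "style_a": {"loop": 0.6, "slant": 0.3, "hook": 0.1},
    "style_b": {"loop": 0.1, "slant": 0.3, "hook": 0.6},
}

# Hypothetical accent prototypes expressed in style space.
ACCENT_CENTROIDS = {
    "accent_x": {"style_a": 0.8, "style_b": 0.2},
    "accent_y": {"style_a": 0.2, "style_b": 0.8},
}


def style_mixture(features, styles=STYLES, iters=50):
    """Estimate a sample's distribution over styles with EM (styles held fixed)."""
    counts = Counter(features)          # bag-of-features representation
    names = list(styles)
    theta = {s: 1.0 / len(names) for s in names}
    total = sum(counts.values())
    if total == 0:
        return theta                    # no evidence: stay uniform
    for _ in range(iters):
        expected = {s: 0.0 for s in names}
        for feat, n in counts.items():
            # E-step: responsibility of each style for this feature.
            weights = {s: theta[s] * styles[s].get(feat, 1e-9) for s in names}
            z = sum(weights.values())
            for s in names:
                expected[s] += n * weights[s] / z
        # M-step: renormalize expected counts into mixing weights.
        theta = {s: expected[s] / total for s in names}
    return theta


def identify_accent(theta, centroids=ACCENT_CENTROIDS):
    """Step 1: pick the accent whose centroid is nearest in style space."""
    def sq_dist(centroid):
        return sum((theta[s] - centroid[s]) ** 2 for s in theta)
    return min(centroids, key=lambda accent: sq_dist(centroids[accent]))


def analyze(features, accent_models):
    """Step 2: dispatch the sample to the model learned for its accent."""
    accent = identify_accent(style_mixture(features))
    return accent, accent_models[accent](features)
```

Here `accent_models` would map each accent label to a task-specific callable, such as a writer identifier or recognizer trained only on samples of that accent.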