Representation Learning and Data Augmentation in Forensic Comparison
The forensic comparison problem is to determine whether observed evidence arises from a known source by comparing the evidence with known samples side by side. The opinion of a forensic comparison system is characterized by the likelihood ratio (LR). The LR has previously been calculated using generative methods. However, because generative models are intractable in high-dimensional spaces, previous methods collapse the inputs to a scalar distance computed from human-engineered feature vectors. The drawbacks are a substantial loss of information and the imperfection of human-engineered features. To overcome these shortcomings, we propose a discriminative method based on automatically learned image representations and data augmentation.

Representation learning trains a system to learn semantic representations from raw data, such as raw image data. Forensic comparison requires the learned representation to be interpretable to some degree, to be robust, and to contribute to the comparison task. Representation learning can be conducted using either unsupervised or supervised learning. For unsupervised representation learning, we train a generative model: a variational autoencoder (VAE). For supervised representation learning, we train a discriminative model: a multitask Siamese convolutional neural network (SCNN). The learned representations are expected to be transferable to different but related problems. Data augmentation generates samples similar to existing samples to facilitate further learning and processing. It is often used as a regularization method that counters overfitting by increasing the amount of training data. In forensic comparison, samples from a particular source are scarce, and thus data augmentation is required. One way to perform data augmentation is through geometric transformations. Another is to use generative models such as the VAE to generate samples with greater semantic variation.

In this dissertation, we use handwritten “and” and “th” images as an example of the forensic comparison problem to demonstrate that, by combining unsupervised representation learning, supervised representation learning and data augmentation, we can implement a hybrid system that outperforms previous methods based on human-engineered features. We generalize the one-to-one forensic comparison problem to many-to-many comparisons and writer-independent one-to-many comparisons by comparing the distributions formed by learned image representations in the latent space. This is done by designing and extracting parametric statistical features from those distributions. The statistical features of the sample distributions are expected to capture both within-writer variability and cross-writer variability.

As a baseline, we compare the performance of human-engineered image features, such as the gradient, structural and concavity (GSC) micro features, with that of their automatically learned counterparts. Results show that the automatically learned features perform better. We also compare distribution comparison using a distance-based non-parametric method (the K-S test) with distribution comparison using parametric statistical features extracted from image feature vectors. Experiments show that the latter trains much faster than the former and yields better verification accuracy.
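
The two styles of distribution comparison contrasted above can be illustrated with a minimal sketch. It assumes each writer is represented by a list of scalar values from a single latent dimension. The function names and the particular features (differences of sample mean and standard deviation) are hypothetical choices made for illustration, not the features designed in the dissertation.

```python
import bisect
import statistics


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The gap can only change at observed sample values, so check each one.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def parametric_features(a, b):
    """Hypothetical fixed-length features: gaps in sample mean and std."""
    return [
        abs(statistics.fmean(a) - statistics.fmean(b)),
        abs(statistics.stdev(a) - statistics.stdev(b)),
    ]


# Illustrative values only; these are not data from the dissertation.
writer_a = [0.1, 0.2, 0.3, 0.4]
writer_b = [1.1, 1.2, 1.3, 1.4]

ks_same = ks_statistic(writer_a, writer_a)
ks_diff = ks_statistic(writer_a, writer_b)
features = parametric_features(writer_a, writer_b)
```

In this sketch the K-S statistic reduces each pair of sample sets to one distance, whereas the parametric features give a short fixed-length vector that a downstream classifier could consume directly.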