Multimodal feature learning
In recent years, many methods for learning from multimodal data that account for the diversity of different modalities have been proposed. These modalities may be obtained from multiple sources. For example, a person can be identified by face, fingerprint, signature, or iris, with information obtained from multiple sources, while an image can be represented by its color or texture features, which can be seen as different feature subsets of the image. In learning the correlations and transitions within multimodal data, we organize our approaches into two major aspects. 1) Learning on Static Multimodal Data. We propose novel static multimodal algorithms for description generation and disease subtype discovery. The proposed algorithms are formulated in a unified framework that models each individual modality and their high-level correlations. For textual data, we propose a greedy fusion algorithm as a critical component that generates a fixed-length feature vector for an input description of any length. Overall, the multimodal framework is trained as one deep network with different integrated components, and the different types of high-level features are simultaneously integrated by the joint component of the framework. 2) Learning on Dynamic Multimodal Data. We investigate several time-varying problems and propose novel stochastic learning algorithms to model nonlinear variances within static time frames and their transitions. For example, for a dynamic multimodal problem, we propose a real-time feature learning algorithm based on electroencephalogram (EEG) signals and synchronized facial video, and use these jointly learned features to train a classifier for real-time recognition. Our algorithm learns an EEG dictionary in an unsupervised manner and transforms a continuous sequential signal into an EEG "sentence" consisting of a sequence of EEG words. The EEG sentence is then jointly learned with video features to form the final feature representation.
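The abstract does not specify how the EEG dictionary is learned or how a signal becomes a "sentence" of EEG words; a minimal sketch of one plausible reading, assuming a k-means codebook over fixed-length windows and nearest-atom quantization (the window size, hop, and codebook size here are illustrative assumptions, not the thesis's actual parameters):

```python
import numpy as np

def learn_eeg_dictionary(windows, n_words=8, n_iter=20, seed=0):
    """Hypothetical unsupervised codebook: plain k-means over signal windows.
    Each resulting atom plays the role of one 'EEG word'."""
    rng = np.random.default_rng(seed)
    atoms = windows[rng.choice(len(windows), n_words, replace=False)].copy()
    for _ in range(n_iter):
        # assign every window to its nearest atom (squared Euclidean distance)
        dists = ((windows[:, None, :] - atoms[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for k in range(n_words):
            if (labels == k).any():
                atoms[k] = windows[labels == k].mean(0)
    return atoms

def to_eeg_sentence(signal, atoms, win=32, hop=16):
    """Slide a window over the continuous signal and emit the index of the
    nearest dictionary atom, yielding a sequence of 'EEG words'."""
    words = []
    for start in range(0, len(signal) - win + 1, hop):
        w = signal[start:start + win]
        words.append(int(((atoms - w) ** 2).sum(1).argmin()))
    return words
```

A downstream model can then treat the resulting word sequence like text, which is consistent with the sentence/word analogy used above; the actual joint learning with video features is not sketched here.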
Our proposed method overcomes the small-sample problem using the idea of convolution, and further addresses the overfitting and computational challenges via probability pooling. Building on the general principles underlying these algorithms, we conduct extensive evaluations on both synthetic and real-world datasets.
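The abstract does not define "probability pooling"; one common technique it may resemble is stochastic pooling, where each pooling region emits an activation sampled in proportion to its value rather than always the maximum. A minimal NumPy sketch under that assumption (the function name and 2x2 region size are illustrative, not taken from the thesis):

```python
import numpy as np

def probability_pool(x, pool=2, rng=None):
    """Sketch of stochastic (probability-weighted) pooling on a 2-D feature
    map: within each non-overlapping pool x pool region, sample one
    activation with probability proportional to its non-negative value."""
    rng = rng or np.random.default_rng()
    h, w = x.shape
    out = np.empty((h // pool, w // pool))
    for i in range(0, h - h % pool, pool):
        for j in range(0, w - w % pool, pool):
            region = np.clip(x[i:i + pool, j:j + pool].ravel(), 0.0, None)
            total = region.sum()
            if total == 0.0:
                out[i // pool, j // pool] = 0.0  # all-zero region
            else:
                idx = rng.choice(region.size, p=region / total)
                out[i // pool, j // pool] = region[idx]
    return out
```

Because the pooled value is sampled rather than fixed, repeated passes over the same feature map see slightly different summaries, which is one way such pooling can act as a regularizer against overfitting.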