ALGORITHMS FOR AUTOMATIC OBJECT DETECTION AND LANDMARK DETECTION IN MEDICAL IMAGES
Object detection and landmark detection are two fundamental research problems in many computer vision applications. In this dissertation we investigate several aspects of these problems and demonstrate our work through two applications: spine X-ray image analysis and automatic patient positioning for medical image scanners.

1) Spine X-ray image analysis: Evaluating spine biomechanical parameters, such as spine curvatures and vertebral body rotations, from spine medical images is an important step toward diagnosing spine diseases and improving clinical outcomes through corrective actions applied to the affected area as part of conservative treatment. However, manual evaluation is time-consuming, error-prone, and labor-intensive. To address these challenges, we propose several automatic solutions, which allow for faster and more appropriate clinical intervention and lead to better clinical outcomes.

First, we look into automatically detecting vertebral bodies using traditional sliding-window object detection algorithms based on hand-crafted features. The two major components of this algorithm are localization and classification. Sliding-window localization is popular among detection algorithms; however, because it scans the image repeatedly, computation per image is usually slow. To improve efficiency, we propose a novel multi-stage object detection algorithm that effectively limits the search space and decreases the computational cost. The classification step in traditional methods uses hand-crafted features, which have been shown to perform worse than deep-learning-based features in many modern applications. However, because data are scarce in the medical imaging field, it is often hard to transfer a deep learning network to medical imaging.
To overcome this issue, we propose a simple yet effective approach in which a neural network trained on natural images is used as a stand-alone feature extractor, and traditional classification methods are applied to the extracted features. Our experimental results show improvement over traditional methods based on hand-crafted features.

Second, to automatically detect landmarks in spine X-ray images, we propose a novel Convolutional Neural Network (CNN), Spine R-CNN, which robustly detects multiple landmarks for a variable number of objects in spine X-ray images. Previous methods detect either a single landmark, such as the centroid, or multiple landmarks for a fixed number of objects, and therefore tend to fail on real-life clinical images. To this end, we first use an object detector subnetwork to acquire the location of each visible object; then, within each object, a landmark localization subnetwork determines the locations of the landmark points. A novel grouping subnetwork is proposed to provide contextual information about each object and improve the initial landmark detections. The experiments demonstrate promising results on our large clinical lateral lumbar X-ray image dataset of 1082 patients.

2) Patient positioning for medical image scanners: Detecting and localizing patient body regions at the gantry of the medical scanner helps the patient positioning needed for applications such as epileptic seizures, patient modeling, and scanning workflow. Traditionally, patient positioning is achieved by manual observation or by using a 3D camera to build a human model. However, these methods are error-prone and computationally expensive. In this work we propose a robust automatic body region detection algorithm using a single 2D camera, which provides a real-time video stream taken at the gantry of the scanner.
The human body region detection problem is closely related to the landmark detection problem, where each landmark is a human body region. Existing methods either fail to estimate body regions robustly under clothing cover or severe occlusion, or lack a mechanism to incorporate temporal information. To overcome these limitations, we propose a deep spatiotemporal neural network that combines a single-frame convolutional network with a multi-frame recurrent neural network, followed by human anatomical structural inference, to robustly estimate patient body regions over time. Our experiments demonstrate that the proposed network provides competitive performance compared with single-frame algorithms on a large real-world clinical dataset with 40K images.
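
To illustrate why temporal information can help under occlusion, the following is a minimal sketch and not the network described above. It assumes a hypothetical per-frame detector that returns a position and a confidence for one body region, and it substitutes a hand-written confidence-gated recurrent update for the learned recurrent layer. The names Detection, fuse_over_time, and rate are illustrative assumptions, not part of the dissertation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical single-frame estimate for one body region."""
    x: float
    y: float
    confidence: float  # in [0, 1]; assumed low when the region is occluded


def fuse_over_time(frames: List[Detection], rate: float = 0.5) -> List[Tuple[float, float]]:
    """Blend per-frame estimates into a temporally smoothed track.

    Stand-in for a learned recurrent layer: the running state is updated as
    state = (1 - gate) * state + gate * observation, with
    gate = rate * confidence, so low-confidence (occluded) frames
    barely move the estimate.
    """
    track: List[Tuple[float, float]] = []
    state = None
    for det in frames:
        if state is None:
            # The first frame initializes the state directly.
            state = (det.x, det.y)
        else:
            gate = rate * det.confidence
            state = (
                (1.0 - gate) * state[0] + gate * det.x,
                (1.0 - gate) * state[1] + gate * det.y,
            )
        track.append(state)
    return track


if __name__ == "__main__":
    # Two confident frames, then an occluded frame with a spurious position.
    video = [
        Detection(10.0, 10.0, 1.0),
        Detection(10.0, 10.0, 1.0),
        Detection(100.0, 100.0, 0.0),
    ]
    for position in fuse_over_time(video):
        print(position)
```

In this sketch the occluded third frame has zero confidence, so the fused estimate stays at the last reliable position, whereas a purely single-frame estimate would jump to the spurious one. A learned recurrent network would replace the fixed gate with parameters trained from data.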