Several Issues in Evaluation of Diagnostic Tests and Complex Biomarkers
Assessing and improving diagnostic accuracy is a long-standing topic in medical research. This thesis addresses several problems in the evaluation of diagnostic tests and complex biomarkers. Knowledge of the true disease status, i.e., a gold standard (GS), is needed to evaluate the performance of different diagnostic tests and biomarkers. In practice, however, a true GS test often does not exist, and some GS tests are never performed in routine clinical practice. Latent class model (LCM) analysis has been widely used to assess diagnostic accuracy without requiring GS information. For dichotomous diagnostic data, we propose a fast Monte Carlo Expectation-Maximization (MCEM) algorithm for parameter and standard error estimation in LCM analysis under the assumption of conditional independence (CI). Because the CI assumption may not hold in practice, we also develop procedures for testing partial conditional dependence structures using the fast MCEM algorithm in LCM analysis. In many situations, the diagnostic decision is not limited to a binary choice. With three ordinal diagnostic categories, the most commonly used measure of overall diagnostic accuracy is the volume under the receiver operating characteristic (ROC) surface (VUS), which extends the area under the ROC curve (AUC), the corresponding measure for continuous test data with binary diagnostic outcomes. We propose two kernel-smoothing-based approaches for estimating the VUS. Different procedures for estimating the VUS are compared in terms of bias and root mean squared error in an extensive simulation study. We also address the problem of combining multiple biomarkers to improve diagnostic accuracy. We propose several parametric and nonparametric approaches for finding the optimal linear combination that maximizes the VUS with three ordinal diagnostic categories.
To examine how well the estimated linear combinations perform on potential future observations, we investigate a cross-validation approach for robust evaluation of linear combination methods. Because a GS might again be unavailable, we develop a procedure for comparing diagnostic accuracy with three ordinal diagnostic categories without a GS, assuming that the continuous diagnostic test data follow a multivariate normal distribution. We discuss several methods of interval estimation for the difference in paired accuracy indices and investigate the possible efficiency loss relative to methods that use a GS. Two real-world datasets are discussed in this thesis. One is a cervical neoplasia diagnosis dataset from the Gynecologic Oncology Group (GOG), and the other is an early-stage Alzheimer's disease diagnosis dataset from the Washington University Knight Alzheimer's Disease Research Center (Knight ADRC). Possible future work includes the following topics: (1) linear combination methods to improve sensitivity to early-stage onset; (2) weighting schemes for incorporating information from an imperfect GS; and (3) optimal bandwidth selection in kernel-smoothing-based approaches for estimating the VUS.
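
To make the VUS mentioned above concrete, the following is a minimal sketch of the standard empirical (Mann-Whitney type) VUS estimate for three ordered diagnostic categories. It is an illustration only and is not one of the kernel-smoothing-based estimators proposed in the thesis. The function name `empirical_vus`, its argument names, and the convention that ties contribute zero are assumptions made for this example.

```python
import numpy as np


def empirical_vus(x1, x2, x3):
    """Empirical VUS: fraction of triples (a, b, c), with a from class 1,
    b from class 2 and c from class 3, that satisfy a < b < c.

    Illustrative sketch only; ties are counted as incorrectly ordered
    (an assumption made here for simplicity).
    """
    x1 = np.sort(np.asarray(x1, dtype=float))
    x2 = np.asarray(x2, dtype=float)
    x3 = np.sort(np.asarray(x3, dtype=float))

    # For each class-2 value b: number of class-1 values strictly below b.
    below = np.searchsorted(x1, x2, side="left")
    # For each class-2 value b: number of class-3 values strictly above b.
    above = x3.size - np.searchsorted(x3, x2, side="right")

    # Each b contributes below * above correctly ordered triples.
    total_triples = x1.size * x2.size * x3.size
    return float(np.sum(below * above) / total_triples)


# Perfectly separated classes give the maximum value.
print(empirical_vus([1, 2], [3, 4], [5, 6]))  # 1.0
```

For a marker that carries no information about the three categories, the VUS is 1/6, the probability that three independent draws from the same continuous distribution fall in a given order; values closer to 1 indicate better discrimination.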