Statistical Assessment of TCGA Ovarian Cancer Sequencing Dataset for Prognostic Utility
Ovarian cancer is the most common cause of gynecological cancer death in the United States, and the vast majority of patients are diagnosed with advanced-stage tumors. Current practice consists of aggressive surgical resection of the tumors, followed by platinum-based chemotherapy. Studying the underlying genomic alterations in ovarian cancer holds great promise for personalized diagnostic, prognostic, and therapeutic strategies. The Cancer Genome Atlas (TCGA) is one of the most popular datasets used for cancer analysis and genomic studies. Although the Surveillance, Epidemiology, and End Results (SEER) Program provides a wide range of statistics for all cancer types in the United States based on epidemiological research, TCGA is widely used because its data are easy to access. The TCGA project has generated next-generation sequencing data for over 500 ovarian carcinoma patients. However, the potential clinical utility of this TCGA sequencing dataset in predicting the survival of ovarian cancer patients is still largely unknown. This study involves a comprehensive statistical assessment of the data composition, follow-up times, and survival curves for the TCGA ovarian cancer dataset. A parallel comparison with the corresponding SEER statistics was performed to assess whether the TCGA dataset is representative of the larger population. Survival statistics of the ovarian cancer subset of the TCGA dataset were matched against SEER 5-year survival rates by age, race/ethnicity, tumor stage, grade, and other risk factors. ECOG scores and RECIST criteria were used to evaluate the calculated survival rates. Power analysis was performed to evaluate the statistical power of using the TCGA ovarian sequencing dataset to build a genomic predictor of prognostic utility. The implications and limitations of this study are discussed.
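As an illustration of the kind of survival-curve calculation the abstract refers to, the following is a minimal sketch of a Kaplan-Meier estimate read off at 5 years. The abstract does not state which estimator or software the study used, so the choice of Kaplan-Meier, the function names `kaplan_meier` and `survival_at`, and the toy follow-up data are all assumptions made for illustration; none of the numbers come from TCGA or SEER.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    times  : follow-up time per patient (e.g. in years)
    events : True if the patient died at that time, False if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving this time.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the step function at time t (1.0 before the first death)."""
    s = 1.0
    for time, prob in curve:
        if time > t:
            break
        s = prob
    return s


# Hypothetical toy cohort (years of follow-up, death indicator); not real data.
toy_times = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0]
toy_events = [True, False, True, True, False, True]
five_year = survival_at(kaplan_meier(toy_times, toy_events), 5.0)
print(f"Estimated 5-year survival: {five_year:.3f}")
```

An estimate of this kind, computed within strata such as age group or tumor stage, is the sort of quantity that could be set beside published 5-year survival rates; how the study actually aligned its strata with SEER is not described in the abstract.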