Publication Date

Fall 11-14-2022


Differential gene expression analysis can uncover candidate biomarkers, therapeutic targets, and gene signatures. A practical question is how to reduce cost when a full sample is unaffordable. The case-cohort (CCH) study design combines the economy of a case-control study with the advantages of a cohort study. However, it has not yet appeared in the medical research literature involving high-throughput genomic data.

Because a score test does not require fitting the Cox proportional hazards (PH) model iteratively, it saves computing time and avoids potential convergence issues. We developed a score test under the CCH design to identify differentially expressed genes (DEGs) associated with survival outcomes. We provided the asymptotic distribution theory and inferential procedures for the test, and verified the validity of the inferential procedure in finite samples through simulation studies.
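
To illustrate why no iterative fitting is needed, the block below gives the textbook score statistic for a single covariate in a Cox PH model, evaluated at a zero coefficient on a full cohort. It is a sketch only, not the CCH-adapted test developed in this work. All notation is assumed for illustration: t_i is the observed time, \delta_i the event indicator, and x_i the expression of one gene for subject i.

```latex
U(0) = \sum_{i=1}^{n} \delta_i \bigl\{ x_i - \bar{x}(t_i) \bigr\},
\qquad
\bar{x}(t) = \frac{\sum_{j=1}^{n} Y_j(t)\, x_j}{\sum_{j=1}^{n} Y_j(t)},
\qquad
Y_j(t) = \mathbf{1}\{ t_j \ge t \}

I(0) = \sum_{i=1}^{n} \delta_i
\left\{ \frac{\sum_{j=1}^{n} Y_j(t_i)\, x_j^{2}}{\sum_{j=1}^{n} Y_j(t_i)} - \bar{x}(t_i)^{2} \right\},
\qquad
T = \frac{U(0)^{2}}{I(0)} \;\xrightarrow{d}\; \chi^{2}_{1} \quad \text{under } H_0
```

Both U(0) and I(0) are closed-form sums over risk sets, so each gene is tested in a single pass without maximizing the partial likelihood.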

A permutation-based score test for survival-related DEG analysis does not rely on the strong PH and distributional assumptions. However, it cannot be applied directly to data from a CCH study because a CCH sample is not a random sample. We developed a procedure that reconstructs a full cohort from a CCH sample and then performs the permutation-based score test on the reconstructed cohort to identify the DEGs associated with survival outcomes. We evaluated our testing procedures and compared them with existing approaches in terms of false discovery rate (FDR) and power, using simulation studies and real datasets from two cancer-related genomic studies.
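
As a rough illustration of the permutation idea, the following Python sketch runs a generic permutation-based score test for one gene on a full cohort. It does not reproduce the cohort-reconstruction step or the CCH-specific procedure described above, whose details are not given here. The function names `cox_score_stat` and `permutation_pvalue`, the default of 999 permutations, and the Breslow-style handling of ties are assumptions made for this example.

```python
import numpy as np


def cox_score_stat(time, event, x):
    """Score statistic U(0)^2 / I(0) for one covariate in a Cox PH model.

    Hypothetical helper: evaluates the partial-likelihood score and
    information at beta = 0, so no iterative model fit is required.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    x = np.asarray(x, dtype=float)

    score, info = 0.0, 0.0
    for i in np.flatnonzero(event):
        risk_x = x[time >= time[i]]      # covariate values in the risk set
        score += x[i] - risk_x.mean()    # observed minus risk-set mean
        info += risk_x.var()             # risk-set variance (ddof=0)
    return score * score / info if info > 0 else 0.0


def permutation_pvalue(time, event, x, n_perm=999, seed=0):
    """Permutation p-value: shuffle expression values across subjects.

    Each (time, event) pair stays intact, so only the link between
    expression and outcome is broken under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = cox_score_stat(time, event, x)
    exceed = sum(
        cox_score_stat(time, event, rng.permutation(x)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    # Toy data (made up): higher expression goes with earlier events.
    t = np.arange(1, 21)
    d = np.ones(20, dtype=int)
    gene = 21 - t
    print(cox_score_stat(t, d, gene))
    print(permutation_pvalue(t, d, gene, n_perm=199))
```

In a genome-wide analysis this p-value would be computed per gene and then passed to a multiple-testing correction to control the FDR; that step is omitted here.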

Degree Name


Level of Degree


Department Name

Mathematics & Statistics

First Committee Member (Chair)

Yan Lu

Second Committee Member

Huining Kang

Third Committee Member

Guoyi Zhang

Fourth Committee Member

Fletcher Christensen




Keywords

Differential gene analysis, CCH-based score test, CCH-based permutation test

Document Type