This report provides a comprehensive treatment of the time-to-graduation (TTG) problem, whose goal is to predict how long it takes a student to graduate from a particular institution. We develop a complete mathematical description of the TTG problem, including data synthesis models that reveal details of the problem that are often hidden or overlooked in the literature. The report explores two very different approaches to the problem, with detailed algorithmic descriptions and practical examples in which specific solution methods are applied to both synthetic data and real University of New Mexico (UNM) student data. The first approach is based on survival analysis methods, which are the dominant approach in the educational literature. While these methods might appear to be well matched to the TTG problem, their blind application often overlooks important aspects of the problem. One goal of this report is to identify some of these aspects and describe modifications that accommodate them. The second approach is based on semi-supervised learning, a type of machine learning that builds models using both labeled and unlabeled data. In particular, we introduce a semi-supervised likelihood function tailored to the TTG problem and apply the maximum likelihood (ML) method to build the model, deriving an Expectation-Maximization (EM) algorithm to carry out the optimization. Finally, applying these methods to UNM data reveals numerous important characteristics of the time to graduation for UNM students.
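As a rough illustration of the survival-analysis framing mentioned above, the following sketch computes a standard Kaplan-Meier estimate on a small made-up cohort. It is not the report's method or code. The function name `kaplan_meier`, the toy data, and the choice to treat students without an observed graduation as right-censored are all assumptions made for this example.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct observed graduation time.

    durations: terms enrolled until graduation or until the last observation.
    observed:  True if graduation was seen, False if the record is censored.
    S(t) is the estimated probability of NOT having graduated by term t.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # every student leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit update: fraction of at-risk students not graduating at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort (assumption): terms enrolled, four censored records.
durations = [8, 8, 9, 10, 10, 10, 12, 6, 7, 12]
observed = [True, True, True, True, True, False, True, False, False, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"term {t}: P(not yet graduated) = {s:.4f}")
```

In this sketch, a student who left without graduating is handled exactly like one who is still enrolled, since both are simply marked as censored. Whether that treatment is appropriate for TTG data is the kind of modeling question the report examines.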
Hush, Don R.; Tushar Ojha; and Wisam Al-Doroubi. "The Time-to-Graduation Problem (Survival Analysis for Education Outcomes)." (2021). https://digitalrepository.unm.edu/ece_rpts/54