Author

Yizho Jiang

Publication Date

2-9-2010

Abstract

Latent structure techniques have recently found extensive use in regression analysis for high-dimensional data. This thesis examines and expands two such methods, Partial Least Squares (PLS) regression and Supervised Principal Component Analysis (SPCA). We propose several new algorithms, including a quadratic spline PLS, a cubic spline PLS, two fractional polynomial PLS algorithms and two multivariate SPCA algorithms. These new algorithms were compared to several popular PLS algorithms using real and simulated datasets. Cross-validation was used to assess the goodness of fit and prediction accuracy of the various models. The strengths and weaknesses of each method were also discussed in terms of model stability, robustness and parsimony. The linear PLS and the multivariate SPCA methods were found to be the most robust among the methods considered, and usually produced models with good fit and prediction. Nonlinear PLS methods were generally more powerful in fitting nonlinear data, but they tended to over-fit, especially with small sample sizes. A forward stepwise predictor pre-screening procedure was proposed for multivariate SPCA, and our examples demonstrated its effectiveness in selecting a smaller number of predictors than the standard univariate testing procedure.

Degree Name

Statistics

Level of Degree

Doctoral

Department Name

Mathematics & Statistics

First Committee Member (Chair)

Edward John Bedrick

Second Committee Member

Michele Guindani

Third Committee Member

Gabriel Huerta

Fourth Committee Member

Huining Kang

Language

English

Keywords

Latent structure analysis, Regression analysis, Least squares.

Document Type

Dissertation
