Min A. Lee

Publication Date



The primary purpose of this thesis is to explain the effectiveness of advertisements by predicting the attention score from the Flow of Attention graph and other survey responses. I address two problems: (1) creating an algorithm to identify the peaks in the Flow of Attention graph, and (2) predicting the attention score from predictor variables derived from questionnaire responses and the Flow of Attention graph. The sample comprises 141 randomly selected advertisements provided by Ameritest, a marketing research firm. Problem 1 was addressed with two algorithms: the first was developed manually and is based on moving averages with a window size of 3, and the second is an edge detection function derived from other research. The manual moving average algorithm was more consistent with the reference peaks and the analysts' peaks, where agreement was measured with Cronbach's alpha. Problem 2 was addressed with a missing-data imputation procedure followed by a model selection procedure for the multiple regression model. Twenty of the twenty-three variables contained missing values, which were imputed by random regression imputation. The model selection methods applied to the imputed data were the LASSO and all possible subsets by AIC. To obtain a reliable and stable final model, the imputation was repeated one hundred times; across these repetitions, the LASSO gave a simpler and more stable result than all possible subsets by AIC. Based on the final results from these two methods, the attention score increased when the audience liked the commercial, felt entertained, perceived it as different from other commercials, and felt better about the company (or the brand). The number of peaks, a variable taken from the Flow of Attention graph, showed no significant effect on the attention score, since none of the selected models contained it.
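As an illustration of the moving-average idea mentioned for Problem 1, the following is a minimal sketch, not the algorithm used in the thesis. It assumes a centered moving average with a window size of 3 and treats a peak as a strict local maximum of the smoothed curve; the function names and the `attention` values are hypothetical.

```python
def moving_average(values, window=3):
    """Centered moving average; the endpoints average only the points available."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def find_peaks(values, window=3):
    """Return the indices where the smoothed curve is a strict local maximum."""
    s = moving_average(values, window)
    return [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] > s[i + 1]]


# Hypothetical attention values for ten consecutive frames of one commercial.
attention = [10, 25, 60, 35, 20, 25, 70, 80, 40, 10]
print(find_peaks(attention))  # indices of the detected peaks
```

Under this rule a flat top in the smoothed curve is not reported as a peak, so a real implementation would need an explicit policy for ties.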
The statistical analyses in this thesis show that LASSO model selection gives highly stable results across data sets produced by multiple random regression imputation. As future work, varying the number of imputations and trying other model selection methods could confirm that the model selection methods are compatible with missing data.
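The random regression imputation step can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the thesis: it uses a single predictor instead of a multiple regression, draws the noise from a normal distribution with the residual standard deviation, and uses hypothetical variable names and values.

```python
import random
import statistics


def random_regression_impute(x, y, rng):
    """Fill None entries of y with a regression prediction from x plus random noise."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in pairs) / sxx
    intercept = my - slope * mx
    # Residual standard deviation with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in pairs)
    sigma = (rss / (len(pairs) - 2)) ** 0.5
    return [
        yi if yi is not None else intercept + slope * xi + rng.gauss(0.0, sigma)
        for xi, yi in zip(x, y)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
likability = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # hypothetical, fully observed
entertained = [1.2, None, 2.9, 4.1, None, 6.2]   # hypothetical, with missing values

# Repeat the imputation one hundred times, as described in the abstract.
imputed_sets = [
    random_regression_impute(likability, entertained, rng) for _ in range(100)
]
```

Each of the hundred imputed data sets would then be passed to a model selection method such as the LASSO, and the stability of a method can be judged by how often the same variables are selected across the sets.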

Degree Name


Level of Degree


Department Name

Mathematics & Statistics

First Committee Member (Chair)

James Degnan

Second Committee Member

Yan Lu

Third Committee Member

Li Li

Project Sponsors





Keywords

Model selection techniques, Measure of agreement, Advertising data

Document Type