Computer Science ETDs

Publication Date



Achieving computer security requires both rigorous empirical measurement and models to understand cybersecurity phenomena and the effectiveness of defenses and interventions. To address the growing scale of cyber-insecurity, my approach to protecting users employs principled and rigorous measurements and models. In this dissertation, I examine four cybersecurity phenomena. I show that data-driven and abstract modeling can reveal surprising conclusions about longterm, persistent problems, like spam and malware, and growing threats like data-breaches and cyber conflict. I present two data-driven statistical models and two abstract models. Both of the data-driven models show that the presence of heavy-tailed distributions can make naive analysis of trends and interventions misleading. First, I examine ten years of publicly reported data breaches and find that there has been no increase in size or frequency. I also find that reported and perceived increases can be explained by the heavy-tailed nature of breaches. In the second data-driven model, I examine a large spam dataset, analyzing spam concentrations across Internet Service Providers. Again, I find that the heavy-tailed nature of spam concentrations complicates analysis. Using appropriate statistical methods, I identify unique risk factors with significant impact on local spam levels. I then use the model to estimate the effect of historical botnet takedowns and find they are frequently ineffective at reducing global spam concentrations and have highly variable local effects. Abstract models are an important tool when data are unavailable. Even without data, I evaluate both known and hypothesized interventions used by search providers to protect users from malicious websites. I present a Markov model of malware spread and study the effect of two potential interventions: blacklisting and depreferencing. I find that heavy-tailed traffic distributions obscure the effects of interventions, but with my abstract model, I showed that lowering search rankings is a viable alternative to blacklisting infected pages. Finally, I study how game-theoretic models can help clarify strategic decisions in cyber-conflict. I find that, in some circumstances, improving the attribution ability of adversaries may decrease the likelihood of escalating cyber conflict.




Security, Data-driven, Data-breach, Spam, cyberconflict, botnet takedown, drive by download, blacklisting, depreferencing, heavy-tails, lognormal, Internet Service Provider, Autonomous Systems Topology, Filtering, malware, cybersecurity, statistical models, Markov Model, Interventions

Document Type


Degree Name

Computer Science

Level of Degree


Department Name

Department of Computer Science

First Advisor

Forrest, Stephanie

First Committee Member (Chair)

Crandall, Jedidiah

Second Committee Member

Moore, Tyler

Third Committee Member

Hofmeyr, Steven