Publication Date

Summer 5-28-2017

Abstract

Online data contains a wealth of information, but as with most user-generated content, it is full of noise, fraud, and automated behavior. The prevalence of "junk" and fraudulent text affects users, businesses, and researchers alike. To make matters worse, there is a lack of ground truth data for these types of text, and the appearance of the text is constantly changing as fraudsters adapt to pressures from hosting sites. The goal of my dissertation is therefore to extract high-quality content from, and identify fraudulent and automated behavior in, large, complex social media datasets in the absence of ground truth data. Specifically, I design a collection of data inspection, filtering, fusion, mining, and exploration algorithms to automate data cleaning so that mining algorithms receive usable data, quantify the trustworthiness of business behavior on e-commerce sites, and efficiently identify automated accounts in large and constantly changing social networks. The main components of this work are noise removal, data fusion, multi-source feature generation, network exploration, and anomaly detection.

Language

English

Keywords

Bot detection, anomaly detection, unsupervised methods, spam, Twitter, review spam

Document Type

Dissertation

Degree Name

Computer Science

Level of Degree

Doctoral

Department Name

Department of Computer Science

First Committee Member (Chair)

Abdullah Mueen

Second Committee Member

Jedidiah Crandall

Third Committee Member

Shuang Luan

Fourth Committee Member

Michalis Faloutsos

Recommended Citation

Minnich, Amanda Jean. "Spam, Fraud, and Bots: Improving the Integrity of Online Social Media Data." (2017). https://digitalrepository.unm.edu/cs_etds/85

Included in

Artificial Intelligence and Robotics Commons

Computer Science ETDs

Spam, Fraud, and Bots: Improving the Integrity of Online Social Media Data
