Publication Date

9-1-2015

Abstract

Given the continuous growth of illicit activities on the Internet, there is a need for intelligent systems to identify malicious web pages. It has been shown that URL analysis is an effective tool for detecting phishing, malware, and other attacks. Previous studies have performed URL classification using a combination of lexical features, network traffic, hosting information, and other strategies. These approaches require time-intensive lookups, which introduce significant delay in real-time systems. This paper describes a lightweight approach for classifying malicious web pages using URL lexical analysis alone. The goal is to explore the upper bound of the classification accuracy of a purely lexical approach. Another aim is to develop an approach that could be used in a real-time system. These goals culminate in the development of a classification system based on lexical analysis of URLs. It correctly classifies URLs of malicious web pages with 99.1% accuracy, a 0.4% false positive rate, and an F1-Score of 98.7, and requires 0.62 milliseconds on average. This method substantially outperforms previously published algorithms on out-of-sample data.
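To illustrate what "URL lexical analysis alone" can mean in practice, the following is a minimal Python sketch that derives features from the URL string without any network lookups. The abstract does not list the thesis's actual feature set, so the function names (`lexical_features`, `char_ngrams`) and the specific features chosen here are assumptions for illustration only, not the author's method.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from the URL string alone.

    Hypothetical feature set: no DNS, WHOIS, or page-content lookups are used.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        # A raw IPv4 address in place of a domain name (simple pattern check).
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "has_at_symbol": "@" in url,
    }


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams, a common string-only representation."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    example = "http://192.168.0.1/login-update/index.php?id=42"
    print(lexical_features(example))
    print(char_ngrams(example, 3).most_common(5))
```

Because every feature is computed from the string itself, extraction of this kind avoids the lookup latency the abstract attributes to approaches based on network traffic or hosting information. The resulting feature dictionary could be passed to any supervised classifier.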

Keywords

Machine Learning, Malware Detection, Classification, Malicious Web Pages, Supervised Learning, Natural Language Processing

Document Type

Thesis

Language

English

Degree Name

Computer Engineering

Level of Degree

Masters

Department Name

Electrical and Computer Engineering

First Committee Member (Chair)

Jordan, Ramiro

Second Committee Member

Lamb, Chris

Recommended Citation

Darling, Michael. "A Lexical Approach for Classifying Malicious URLs." (2015). https://digitalrepository.unm.edu/ece_etds/63

Electrical and Computer Engineering ETDs

A Lexical Approach for Classifying Malicious URLs
