Publication Date
Fall 11-4-2022
Abstract
In today’s world, deep learning models are widely used in a variety of fields. Audio
applications include speech recognition, audio classification, and music information
retrieval. In this paper, we will focus on the classification of music genres using an
artificial neural network. The development of machine learning techniques for audio has
reduced reliance on traditional, more time-consuming signal processing
techniques. Starting with raw audio data, we will gain an understanding of what
audio is and its digital representation. Then, the focus will be on obtaining frequency
information from audio signals through the use of spectrograms. Transforming the
spectrograms into the perceptually relevant mel scale allows us to eventually extract
mel-frequency cepstral coefficients (MFCCs) from audio files. We will then make use
of our network architecture to process the MFCCs. A convolutional neural network,
our network of choice here, is trained to classify audio files into one of nine musical
genres with an accuracy of 89.1% using the GTZAN dataset, which is only about 4
percentage points below the state-of-the-art performance for this dataset.
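
The mel-scale step mentioned above can be illustrated with a minimal sketch. The abstract does not say which mel formula or software the thesis uses, so the conversion below (the widely used 2595 * log10(1 + f / 700) form) and the function names hz_to_mel, mel_to_hz, and mel_band_edges are assumptions chosen for illustration only.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> list[float]:
    """Return n_bands + 2 band-edge frequencies in Hz, evenly spaced in mels.

    Equal steps on the mel axis give narrow bands at low frequencies and
    wide bands at high frequencies, which is what makes the scale
    perceptually motivated.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


if __name__ == "__main__":
    # Hypothetical settings: 10 bands between 0 Hz and 8 kHz.
    for edge in mel_band_edges(0.0, 8000.0, 10):
        print(f"{edge:8.1f} Hz")
```

In a full pipeline, band edges like these would define the filters applied to each spectrogram frame before the log and discrete cosine transform steps that produce MFCCs.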
Degree Name
Mathematics
Level of Degree
Masters
Department Name
Mathematics & Statistics
First Committee Member (Chair)
Mohammad Motamed
Second Committee Member
Jehanzeb Chaudhary
Third Committee Member
Jacob Schroder
Language
English
Document Type
Thesis
Recommended Citation
Suud, Usame. "Music Genre Classification by Convolutional Neural Networks." (2022). https://digitalrepository.unm.edu/math_etds/193