
Electrical and Computer Engineering ETDs
Publication Date
Spring 4-15-2025
Abstract
This thesis describes the development of a speech recognition system that classifies Navajo (Diné) words using low-resource language (LRL) datasets. At present, there are no recognized high-quality open-source datasets for the Diné language suitable for training speech recognition models. A small, balanced dataset was therefore designed to train several models. To compensate for the scarcity of data, the audio recordings were augmented with time-stretching, amplitude variations, time shifts, small amounts of white Gaussian noise, and SpecAugmentation. The models included a Recurrent Neural Network (RNN), a Convolutional Neural Network (CNN), and a Long Short-Term Memory (LSTM) model, with a leave-one-out method applied to three of the ten speakers. The results compare nine optimization methods: SGD, Momentum, Nesterov, AdaGrad, RMSProp, Adam, Adamax, Nadam, and AdamW. The models demonstrated excellent classification performance for Navajo speech recognition in this low-resource setting. Our system can reliably recognize a small number of Navajo words spoken by new speakers on whom it has not previously been trained.
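As a rough illustration of three of the waveform augmentations named in the abstract (amplitude variation, time shift, and additive white Gaussian noise), the following is a minimal sketch using only the Python standard library. It is not the thesis code: the function names, the gain, the shift length, the 30 dB signal-to-noise ratio, and the 16 kHz test tone are all assumptions chosen for illustration. Time-stretching and SpecAugmentation are omitted because they need resampling or spectrogram tools.

```python
import math
import random


def scale_amplitude(signal, gain):
    """Multiply every sample by a constant gain (hypothetical helper)."""
    return [gain * s for s in signal]


def time_shift(signal, shift):
    """Shift by `shift` samples (positive = delay), zero-padding to keep the length."""
    n = len(signal)
    if shift >= 0:
        return [0.0] * min(shift, n) + signal[: max(n - shift, 0)]
    return signal[min(-shift, n):] + [0.0] * min(-shift, n)


def add_white_noise(signal, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio in dB."""
    power = sum(s * s for s in signal) / len(signal)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in signal]


# Stand-in for a recorded word: 0.1 s of a 440 Hz tone at an assumed 16 kHz rate.
rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate // 10)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
augmented = add_white_noise(time_shift(scale_amplitude(tone, 0.8), 160), 30, rng)
```

Each helper returns a new list of the same length as its input, so the augmentations can be chained in any order to produce several variants of one recording.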
Keywords
Diné, Navajo Speech Recognition, Low Resource Language Datasets
Document Type
Thesis
Language
English
Degree Name
Electrical Engineering
Level of Degree
Masters
Department Name
Electrical and Computer Engineering
First Committee Member (Chair)
Dr. Mario Pattichis
Second Committee Member
Dr. Xiang Sun
Third Committee Member
Dr. Melvatha Chee
Fourth Committee Member
Dr. Ali Bidram
Recommended Citation
Sutherland, Emery M. "Navajo Speech Recognition Using Low-Resource Language Models." (2025). https://digitalrepository.unm.edu/ece_etds/716