Presenter Information

Lukas Denk, University of New Mexico

Program

Linguistics

College

Arts and Sciences

Student Level

Doctoral

Location

Student Union Building, Ballroom C

Start Date

8-11-2021 11:00 AM

End Date

8-11-2021 1:00 PM

Abstract

Linguistic research and language instruction have benefitted from linguistic corpora (digital databases containing numerous texts written in one language). For low-resource languages, where written records are scarce, creating a corpus can be a challenge, but it has the benefit of supporting conservation and revitalization. My project consists of creating a corpus of Diné Bizaad (Navajo) based on existing, publicly available narratives written in the 1950s. I present examples from the corpus and discuss challenges faced in annotating the first 5000 words. One problem that pertains to Native American languages in general is that only a few speakers are trained in linguistic analysis. Learning how to disentangle the meaning in words is necessary for creating a grammatically annotated corpus, and it requires long training that is offered only at certain institutions. Another challenge results from the particular structure of the language: even for linguists, annotating Navajo words is not straightforward, since grammatical rules are word-specific. Unlike in English or Spanish, word components in Navajo carry both lexical and grammatical information, which means that every word has a different grammatical pattern. Resources like the Navajo Dictionary and the Analytical Lexicon help resolve ambiguities in annotation. At the same time, students and instructors of the Navajo Language Program at UNM have sufficient expertise to collaborate in this project. Using the free software FieldWorks Language Explorer and the repository Language Depot, the corpus can be accessed, downloaded and expanded. The goal of building this corpus is to facilitate empirical research on Navajo and to improve education by providing enough data to develop exercises and assignments.

Denk-Poster.pdf (1285 kB)
Lukas's Poster

Nov 8th, 11:00 AM Nov 8th, 1:00 PM

Creating a corpus of Navajo Historical Narratives - Prospects and Challenges

