
dc.contributor.author: DiCanio, Christian
dc.contributor.author: Bunnell, H. Timothy
dc.contributor.author: Amith, Jonathan
dc.contributor.author: Castillo Garcia, Rey
dc.contributor.author: Nam, Hosung
dc.contributor.author: Whalen, Douglas H.
dc.date.accessioned: 2016-01-12T18:00:33Z
dc.date.available: 2016-01-12T18:00:33Z
dc.date.issued: 2013
dc.identifier.citation: DiCanio, Christian, Nam, H., Whalen, D. H., Bunnell, H. T., Amith, J. D., and Castillo García, R. (2013) Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. Journal of the Acoustical Society of America, 134(3):2235-2246. [en_US]
dc.identifier.other: DOI:http://dx.doi.org/10.1121/1.4816491
dc.identifier.uri: http://hdl.handle.net/10477/41242
dc.description.abstract: While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. The transcription of large documentation corpora is, by itself, a tremendous feat. Yet, the process of segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a phone set with contextualization outperforms a more general one. The accuracy of two forced aligners trained on English (HMALIGN and P2FA) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement performance was relatively good, with accuracy at 70.9% within 30 ms for HMALIGN and 65.7% within 30 ms for P2FA. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in HMALIGN’s phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, using existing alignment systems was found to have potential for making phonetic analysis of small corpora more efficient, with more allophonic phone sets providing better agreement than general ones. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: endangered languages [en_US]
dc.subject: phonetic analysis [en_US]
dc.subject: segmentation [en_US]
dc.subject: automatic alignment [en_US]
dc.title: Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment [en_US]
dc.type: Article [en_US]

