Creating a database and query-tools for the TELL multi-speaker linguistic corpus
The Turkish Electronic Living Lexicon (TELL; http://socrates.berkeley.edu:7037) represents the first large-scale effort to collect transcribed recordings of partial morphological paradigms (more than 17,000) from several speakers of Turkish. The nature of this data presents a fundamental challenge: how to model the data and design the query tools so that interesting phonological patterns can be found both within paradigms and across speakers. Paradigms in the database are classified along two primary dimensions, speaker and lexeme, which together identify a particular utterance set. We discuss the basic structure of the TELL database and the query tools that have been devised to access it.
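As a rough illustration of the two-dimensional classification described above, the following minimal sketch indexes utterances by a (speaker, lexeme) pair. It is not the TELL schema or its query interface: the names `Utterance`, `ParadigmIndex`, `utterance_set`, and `across_speakers`, the field layout, the speaker labels, and the sample forms are all assumptions invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One transcribed form; all field names are hypothetical."""
    speaker: str        # speaker identifier (assumed label)
    lexeme: str         # lexeme the form belongs to
    slot: str           # paradigm slot, e.g. "nominative" (assumed label)
    transcription: str  # transcription of the recorded form


class ParadigmIndex:
    """Groups utterances into sets keyed by (speaker, lexeme)."""

    def __init__(self):
        self._sets = defaultdict(list)

    def add(self, utterance):
        key = (utterance.speaker, utterance.lexeme)
        self._sets[key].append(utterance)

    def utterance_set(self, speaker, lexeme):
        # Within-paradigm view: one speaker's forms of one lexeme.
        return list(self._sets.get((speaker, lexeme), []))

    def across_speakers(self, lexeme):
        # Cross-speaker view: every speaker's forms of one lexeme.
        return {
            spk: list(forms)
            for (spk, lex), forms in self._sets.items()
            if lex == lexeme
        }


# Placeholder rows for illustration only; not drawn from TELL.
index = ParadigmIndex()
index.add(Utterance("S1", "kitap", "nominative", "kitap"))
index.add(Utterance("S1", "kitap", "accusative", "kitabı"))
index.add(Utterance("S2", "kitap", "nominative", "kitap"))

for spk, forms in index.across_speakers("kitap").items():
    print(spk, [f.transcription for f in forms])
```

Keying on the pair keeps the within-paradigm lookup a single dictionary access, while the cross-speaker comparison is a filter over the keys. A real system of this size would more plausibly use a relational table with speaker and lexeme columns; that, too, is an assumption rather than something stated here.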