Seminar: To compress or not to compress? A finite state approach to Nen verbal morphology, Saliha Muradoglu, 14 Feb
Speaker: Saliha Muradoglu
When: 14 Feb 2020, 3.30pm-5pm
Where: HC Coombs Building, Seminar room E (Nadel Room), 3.214, ANU
The Transcription Acceleration Project (TAP) uses computational methods to speed up transcription for low-resource languages. The current pipeline performs speech-to-text at the orthographic level (ELPIS) or the phonemic level (Persephone). This project focuses on the morphological level.
In this talk, I will present preliminary work on a verbal morphological parser for Nen, a Papuan language. Nen verbal morphology is particularly complex, with a transitive verb taking up to 1,740 unique forms. The combinatoric power exhibited by Nen raises interesting choices for analysis. Here I focus on the resolution of decomposition, contrasting two finite state transducer (FST) models: a ‘chunking’ model and a fully decomposed model. ‘Chunking’ refers to collating several morphological segments into a single unit, whereas the fully decomposed model uses maximal morphological decomposition, following the grammatical description of Nen. The resulting architectures differ in size and structural clarity. While the chunking model is almost half the size of its fully decomposed counterpart, the fully decomposed model displays higher structural order. These results show that computational methods can not only aid in data processing but also yield linguistic insights.
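The contrast between the two models can be illustrated with a toy sketch in Python. It is not the parser presented in the talk and does not use an FST toolkit: the slot inventory and all morphemes below are hypothetical placeholders, not real Nen forms. It only shows that collating two suffix slots into one chunk leaves the set of surface forms unchanged while making the analysis coarser.

```python
from itertools import product

# Hypothetical placeholder morphemes; these are NOT real Nen forms.
PREFIXES = ["a-", "b-"]
STEMS = ["stem"]
SUFFIX_1 = ["-x", "-y"]
SUFFIX_2 = ["-p", "-q", "-r"]


def generate(slots):
    """Pick one item per slot; return a set of (surface form, analysis) pairs."""
    return {("".join(choice), tuple(choice)) for choice in product(*slots)}


# Fully decomposed: every morphological slot is kept as a separate segment.
decomposed_slots = [PREFIXES, STEMS, SUFFIX_1, SUFFIX_2]

# Chunked: the two suffix slots are collated into single precomputed segments.
suffix_chunks = [s1 + s2 for s1, s2 in product(SUFFIX_1, SUFFIX_2)]
chunked_slots = [PREFIXES, STEMS, suffix_chunks]

decomposed = generate(decomposed_slots)
chunked = generate(chunked_slots)

decomposed_forms = {form for form, _ in decomposed}
chunked_forms = {form for form, _ in chunked}

# Both models cover the same surface forms; only the segmentation differs.
print(decomposed_forms == chunked_forms)
print(len(decomposed_forms))
print(sorted(decomposed)[0])
print(sorted(chunked)[0])
```

In this toy setting the chunked analysis of a form has three segments and the decomposed analysis has four. The sketch says nothing about the relative size of the compiled transducers, which depends on the actual Nen paradigms and on how the FSTs are built.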