Meet the 2015 Appen interns
Appen is a global company providing high-quality speech and search technology services, and is an industry partner of the Centre. The Appen Internship Program introduced two CoEDL-affiliated students to industry best practice in language resource development. The interns were supported and mentored in their CoEDL project work by experienced Appen linguists, who gave them technical support and access to the tools and processes relevant to their projects.
Ben Purser learned the basics of regular expressions and text editing in Vim in order to prepare transcripts for phone alignment and to find ways of automating elements of the Sydney Speaks project. He then applied these skills to other areas, such as writing commands to reformat files for data analysis (e.g. converting ELAN text files to Excel). Ben learned how the phone alignment process works in Appen projects, including the necessary edits to audio files and the use of pronunciation lexicons as a comparative tool. He also explored Appen’s methods of speaker recruitment and screening, since the Sydney Speaks project aims to conduct future data collection in the Sydney area.
Sasha Wilmoth joined the Appen internship program to work with Felicity Meakins' longitudinal corpus of Gurindji Kriol – the largest annotated and searchable corpus of an Australian language. This ongoing internship has three main goals. The first is to fix the mor-coding so that the corpus is compatible with CLAN’s updated search function. The second is to develop various technical processes, such as improving the consistency of mor-coding, interfacing with Excel, and performing complex searches (for example, across multiple utterances). The third is to analyse optional ergativity in Gurindji Kriol, within the theoretical context of morphological complexity and language change.