UM-CoEDL Zoom Seminar: Corpus annotation for typological research in discourse and grammar, Stefan Schnell, 11 Sept

Outreach, University of Melbourne

Date: 3 September 2020

Seminar: Corpus annotation for typological research in discourse and grammar: the Multi-CAST initiative

Speaker: Stefan Schnell, University of Bamberg

When: 11 September 2020, 4pm AEST

Where: via Zoom - contact for Zoom link


In this talk I will outline the main ideas behind the multilingual corpus project Multi-CAST, which is designed for corpus-based typological research into the interaction between grammatical structure and discourse in and across diverse languages. I will first give a short overview of our current corpus and in-progress developments and then turn to the corpus annotation schemata GRAID and RefIND that form the distinctive backbone of this project. I will also discuss specific issues in morpho-syntax and reference (zero anaphors, various types of person form, argument-adjunct distinction, referent status) and related annotation practices.

In the second half of this talk I will show how analyses of GRAID and RefIND annotations can bear on research questions in the areas of discourse structure (referent introduction and tracking, referential choice), and the interaction of these with argument structure and semantic properties of arguments. Essentially, our research relates to two major areas, namely that of (production-oriented) discourse processing and information management and that of language variation and change, and I will briefly summarise some of our recent and current studies in these areas.

Multi-CAST is intended as a contribution to open (language) science. The corpus and related documentation and annotation manuals are available online. All corpus data are freely downloadable under a Creative Commons Attribution 4.0 International licence (CC BY 4.0), and are also available through the multicastR package for R.

