Course: Advanced statistics for linguists – tree-based and mixed-effects models in R

Times: 9am - 12:30pm

Dates: Thursday and Friday, 5 - 6 December 2019

Instructor: Dr Martin Schweinberger, The University of Queensland

Registration: Please register here for Summer School 2019



Quantitative analyses have become an ever more important aspect of linguistic research. However, guided, systematic, hands-on introductions to advanced statistical modelling that focus specifically on language data, and on issues that are particularly common in linguistics such as small data sets and nested data structures, are notably rare.

This workshop aims to address this gap by providing a hands-on guide to using tree-based and mixed-effects models. In addition to briefly recapitulating basic concepts of quantitative analysis, we will explore the theoretical underpinnings of tree-based models and mixed-effects models, and use practical examples to see how such models are implemented in R.

Assumed knowledge

Participants are expected to have some experience with R, but do not have to be expert users. In addition, participants should have some experience with, or a keen interest in, quantitative analyses.

Background knowledge

The course will not assume any knowledge of advanced statistical modelling, but participants may benefit from reading Discovering Statistics Using R by Andy Field, Jeremy Miles and Zoë Field (SAGE, 2014), particularly chapters 1-4, 7 and 14.


Dr Schweinberger would greatly appreciate it if participants who would like to offer their own data for use in the course send it to him at least one month in advance, so that he can check whether it is suitable as an example data set.


The course activities will combine:

  • presentation of information about statistical analyses, including: a guide to best practices in quantitative research; basic concepts of quantitative analyses; advantages and disadvantages of different models; testing model requirements and validating models; visualizing data and model outcomes
  • discussions about best practices in quantitative research and options if model requirements are violated (skewed data, small sample size) 
  • participant discussion of common issues arising when working with language data 
  • dissemination of materials and references relevant to statistical modelling and data analysis with R more generally