Geeky game or biggest linguistic experiment ever?

Linguistics (general), Technology

Date: 5 April 2017

Data from a popular online game is proving to be a treasure trove for linguists investigating what factors confuse people when it comes to language identification.

The results, published in PLOS ONE, suggest that participants are more confused by languages that are closer together geographically or that have similar phoneme inventories (the sets of sound units that distinguish one word from another).

The game, called The Great Language Game, is played by people wanting to test their skill at matching audio clips to the correct languages. It has already been played by more than 1 million people from over 80 countries. The researchers working on the data believe the game has essentially turned into the biggest linguistic experiment ever conducted.

Hedvig Skirgård, a PhD student at the ARC Centre of Excellence for the Dynamics of Language working on the research project ‘The Wellsprings of Linguistic Diversity’ at the ANU, and Dr Seán Roberts of the Max Planck Institute for Psycholinguistics are working with the game’s creator, Lars Yencken, to look at player behaviour and trends.

Skirgård says the game shows that confusion between languages is common, especially for languages that share a history of contact.

“This reflects the fact that the languages we observe in the world today are a product of cultural evolution; we may be tapping into shared history.”

She says the data confirms that languages with wide ‘global reach’, such as French and Spanish, are more readily recognised by players than languages with only regional reach, such as the African language Shona.

Players found recognised clusters of languages hard to tease apart and identify correctly, with the Slavic group being a good example. “On the whole, there is a lot of conflicting signal in the Slavic cluster, indicating that players confuse Slavic languages for each other,” says Skirgård. What was interesting, however, was that every Slavic language was more often confused with Russian than the other way around, suggesting that Russian is seen as the ‘prototypical’ Slavic language, or at least the best-known one.

Some of the confusion was between much more distantly related languages. For example, people confused Portuguese and Romanian or Burmese and Romanian. Another strange result was that Danish, Hungarian and Turkish are between five and six times more likely to be mistaken for Vietnamese than the other way around. “We didn’t expect these results,” says Skirgård, “but we found that people were probably listening for distinctive sounds, and perhaps they were hearing something in these languages that linguists have yet to discover.”
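
For readers curious how a "times more likely than the other way around" figure can be derived, the following is a minimal sketch in Python. It is not the authors' analysis code: the function names and all of the counts are hypothetical, invented purely for illustration, and the study itself may define confusion rates differently. The sketch assumes each record is a pair of (language of the clip, language the player chose).

```python
def confusion_rate(guesses, actual, guessed):
    """Share of clips in `actual` that players labelled as `guessed`."""
    total = sum(n for (clip_lang, _), n in guesses.items() if clip_lang == actual)
    return guesses.get((actual, guessed), 0) / total if total else 0.0


def asymmetry(guesses, lang_a, lang_b):
    """How many times more often lang_a is taken for lang_b than the reverse."""
    forward = confusion_rate(guesses, lang_a, lang_b)
    backward = confusion_rate(guesses, lang_b, lang_a)
    return forward / backward if backward else float("inf")


# Hypothetical counts, NOT the study's data.
# Keys are (language of the clip, language the player chose).
guesses = {
    ("Danish", "Danish"): 700,
    ("Danish", "Vietnamese"): 50,
    ("Danish", "Norwegian"): 250,
    ("Vietnamese", "Vietnamese"): 900,
    ("Vietnamese", "Danish"): 10,
    ("Vietnamese", "Thai"): 90,
}

ratio = asymmetry(guesses, "Danish", "Vietnamese")
print(f"Danish is taken for Vietnamese {ratio:.1f}x as often as the reverse")
```

Comparing rates rather than raw counts matters here, because some languages may simply have been played more often than others.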

Using players’ IP addresses, the researchers also found that Europeans were quite good at matching the audio clips to the correct language, with players from Luxembourg coming out on top. Less accurate were players from countries outside Europe such as Mexico, India, Venezuela, Costa Rica and China. Out of 82 countries, people playing in Australia came in 35th place, getting 71% of guesses correct, just ahead of New Zealanders.

Roberts says one of the most interesting findings was that people from different places made different mistakes. For example, players from the United States confused Yiddish with Hebrew. These are languages that sound quite different, but are likely to be associated with the same group of people in the US. However, players from Africa did not confuse these languages. Australians made judgements more similar to people from the United States than to people from the United Kingdom. “These types of findings show that the way we hear language can be shaped by our cultural knowledge, and our linguistic experience. Often, when we hear a language for the first time it sounds full of strange or impossible sounds. But we have to remember that the people who speak those languages also find the way we speak strange,” says Roberts.

“What I really like about this game is that at first you’re confronted with very strange sounds, but the game makes you pay attention to the details. After a while, you’re able to pick out lots of differences – not just in sounds but in pitch and rhythm. It’s a simple demonstration that the more you immerse yourself in another culture, the less alien it seems.”

The authors are also promoting a new game, an initiative of the Language in Interaction Consortium, designed to generate improved scientific data and programmed by Peter Withers. LingQuest, a data-generating smartphone app, promises to be just as addictive as The Great Language Game. What is unique about this new game is that it includes many more lesser-known languages, drawing on audio samples from language repositories such as PARADISEC.

For more information on this story, visit PLOS ONE and see the table below.

This table shows how often languages were guessed correctly, and which other languages they were confused for.

| Language | Percentage of correct guesses | Most confused for | Least confused for |
|---|---|---|---|
| Kannada | 39 | Punjabi | French |
| Fijian | 41.5 | Malayalam | German |
| Shona | 43.87 | Northern Ndebele | French |
| Dinka | 44.13 | Northern Ndebele | French |
| Hausa | 44.5 | Swahili | French |
| Tigrinya | 45.53 | Kurdish | Spanish |
| South Efate | 45.86 | Northern Ndebele | German |
| Dari | 46.38 | Farsi | French |
| Maltese | 47.98 | Kurdish | Spanish |
| Indonesian | 49.38 | Punjabi | French |
| Amharic | 49.5 | Kurdish | French |
| Maori | 49.59 | Indonesian | French |
| Malay | 50.52 | Kurdish | German |
| Sinhalese | 51.14 | Punjabi | French |
| Nepali | 51.49 | Punjabi | German |
| Bangla | 51.63 | Punjabi | French |
| Northern Sami | 52.14 | Finnish | French |
| Basque | 52.65 | Armenian | French |
| Samoan | 52.94 | Northern Ndebele | French |
| Tongan | 55 | Samoan | German |
| Farsi | 56.02 | Kurdish | Spanish |
| Tagalog | 56.63 | Indonesian | German |
| Tamil | 56.65 | Punjabi | French |
| Urdu | 56.86 | Punjabi | French |
| Malayalam | 57.48 | Punjabi | German |
| Scottish Gaelic | 57.65 | Hebrew | Spanish |
| Hindi | 58.14 | Punjabi | Spanish |
| Somali | 58.93 | Arabic | French |
| Northern Ndebele | 60.01 | Swahili | Spanish |
| Turkish | 60.67 | Kurdish | Spanish |
| Burmese | 61.58 | Central Tibetan | German |
| Punjabi | 62.32 | Hindi | German |
| Assyrian | 62.56 | Arabic | Spanish |
| Khmer | 62.74 | Vietnamese | Spanish |
| Hungarian | 63.05 | Estonian | Spanish |
| Latvian | 63.3 | Estonian | French |
| Albanian | 63.35 | Romanian | Japanese |
| Croatian | 63.86 | Slovak | French |
| Armenian | 64.05 | Turkish | Mandarin |
| Gujarati | 64.69 | Punjabi | German |
| Welsh | 65.4 | Scottish Gaelic | French |
| Kurdish | 66.21 | Turkish | Spanish |
| Estonian | 66.85 | Finnish | French |
| Swahili | 67.78 | Northern Ndebele | German |
| Macedonian | 68.94 | Slovenian | Japanese |
| Bosnian | 68.97 | Serbian | Japanese |
| Finnish | 69.55 | Estonian | Japanese |
| Portuguese | 69.91 | Slovak | Japanese |
| Icelandic | 70.23 | Norwegian | Japanese |
| Yiddish | 71.09 | Dutch | Japanese |
| Bulgarian | 71.28 | Ukrainian | Japanese |
| Greek | 71.65 | Portuguese | Mandarin |
| Dutch | 71.74 | Danish | Spanish |
| Danish | 71.77 | Norwegian | Spanish |
| Lao | 72.18 | Vietnamese | German |
| Serbian | 72.33 | Slovak | Japanese |
| Slovenian | 72.89 | Serbian | Japanese |
| Central Tibetan | 73.45 | Cantonese | Spanish |
| Swedish | 75.29 | Norwegian | Japanese |
| Romanian | 75.63 | Portuguese | Japanese |
| Norwegian | 76.05 | Swedish | Spanish |
| Slovak | 77.1 | Czech | Mandarin |
| Polish | 77.21 | Czech | Japanese |
| Hebrew | 77.27 | Yiddish | Spanish |
| Czech | 78.43 | Ukrainian | Japanese |
| Ukrainian | 81.22 | Russian | Japanese |
| Thai | 81.49 | Vietnamese | Spanish |
| Arabic | 82.76 | Farsi | Spanish |
| Cantonese | 83.15 | Mandarin | French |
| Vietnamese | 84.17 | Thai | French |
| Mandarin | 85.94 | Cantonese | Spanish |
| Japanese | 85.97 | Korean | German |
| Korean | 86.76 | Japanese | Spanish |
| Russian | 87.27 | Ukrainian | Korean |
| Italian | 88.88 | Portuguese | German |
| Spanish | 89.16 | Portuguese | German |
| German | 91.26 | Dutch | Spanish |
| French | 93.63 | Northern Ndebele | Spanish |

Media: To organise interviews with Hedvig Skirgård (Australia), Seán Roberts (United Kingdom), or Lars Yencken (Stockholm), please contact Leanne Scott at the ARC Centre of Excellence for the Dynamics of Language on +61 437 839 216 or email leanne.scott@anu.edu.au.
