The EvoLang Causal Graph Challenge

A replicated typo - Tue, 03/27/2018 - 10:25

This year at EvoLang, I’m releasing CHIELD: The Causal Hypotheses in Evolutionary Linguistics Database.  It’s a collection of theories about the evolution of language, expressed as causal graphs.  The aim of CHIELD is to build a comprehensive overview of evolutionary approaches to language.  Hopefully it’ll help us find competing and supporting evidence, link hypotheses together into bigger theories and generally help make our ideas more transparent. You can access CHIELD right now, but hang around for details of the challenges.

The first thing that CHIELD can help express is the (sometimes unexpected) causal complexity of theories.  For example, Dunbar (2004) suggests that gossip replaced physical grooming in humans to support increasingly complicated social interactions in larger groups.  However, the whole theory is actually composed of 29 links, involving predation risk, endorphins and resource density:

The graph above might seem very complicated, but it was actually constructed just by going through the text of Dunbar (2004) and recording each claim about variables that were causally linked.  By dividing the theory into individual links it becomes easier to think about each part.

Second, CHIELD also helps find other theories that intersect with this one through variables like theory of mind, population size or the problem of freeriders, so you can also use CHIELD to explore multiple documents at once.  For example, here are all the connections that link population size and morphological complexity (9 papers so far in the database):

The first thing to notice is that there are multiple hypotheses about how population size and morphological complexity are linked.  We can also see at a glance that there are different types of evidence for each link.  Some are supported from multiple studies and methods, while others are currently just hypotheses without direct evidence.
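The idea behind CHIELD – record each claim as an individual cause-effect link, then trace chains between variables across papers – can be sketched with a plain edge list. The example below is a hypothetical mini-version: the variable names, evidence labels and links are illustrative only, not actual CHIELD records or its schema.

```python
from collections import deque

# Each link: (cause, effect, evidence type).
# Illustrative entries only -- not real CHIELD data.
links = [
    ("population size", "contact with non-natives", "hypothesis"),
    ("contact with non-natives", "adult learning", "experiment"),
    ("adult learning", "morphological complexity", "model"),
    ("population size", "shared context", "hypothesis"),
    ("shared context", "morphological complexity", "statistics"),
]

def causal_paths(links, source, target):
    """Breadth-first search for all acyclic causal chains from source to target."""
    graph = {}
    for cause, effect, _ in links:
        graph.setdefault(cause, []).append(effect)
    paths, queue = [], deque([[source]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            paths.append(path)
            continue
        for nxt in graph.get(path[-1], []):
            if nxt not in path:  # avoid revisiting a variable
                queue.append(path + [nxt])
    return paths

for p in causal_paths(links, "population size", "morphological complexity"):
    print(" -> ".join(p))
```

Even this toy edge list yields two competing causal chains between population size and morphological complexity, which is exactly the kind of overlap the database is meant to surface.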

However, CHIELD won’t work without your help!  CHIELD has built-in tools for you – yes YOU – to contribute.  You can edit data, discuss problems and add your own hypotheses.  It’s far from perfect and of course there will be disagreements.  But hopefully it will lead to productive discussions and a more cohesive field.

Which brings us to the challenges …

The EvoLang Causal Graph challenge: Contribute your own hypotheses

You can add data to CHIELD using the web interface.  The challenge is to draw your EvoLang paper as a causal graph.  It’s fun!  The first two papers to be contributed will become part of my poster at EvoLang.

Here are some tips:

  • Break down your hypothesis into individual causal links.
  • Try to use existing variable names, so that your hypothesis connects to other work.  You can find a list of variables here, or the web interface will suggest some.  But don’t be afraid to add new variables.
  • Try to add direct quotes from the paper to the “Notes” field to support the link.
  • If your paper is already included, do you agree about the interpretation? If not, you can raise an issue or edit the data yourself.

More help is available here.  Click here to add data now!  Your data will become available on CHIELD, and your name will be added to the list of contributors.

Bonus Challenge: Contribute 5 papers, become a co-author!

I’ll be writing an article about the database and some initial findings for the Journal of Language Evolution.  If you contribute 5 papers or more, then you’ll be added as a co-author.  As an incentive to contribute further, co-authors will be ordered by the number of papers they contribute.  This offer is open to anyone studying evolutionary linguistics, not just people presenting at EvoLang.  You should check first whether the paper you want to add has already been included.

Bonus Challenge: Contribute some code, become a co-author!

CHIELD is open source.  The GitHub repository for CHIELD has some outstanding issues. If you contribute some programming to address them, you’ll become a co-author on the journal article.

Robust, Causal, and Incremental Approaches to Investigating Linguistic Adaptation

A replicated typo - Thu, 03/01/2018 - 09:48

We live in an age where we have more data on more languages than ever before, and more data from other domains to link it with. This should make it easier to test hypotheses involving adaptation, and also to spot new patterns that might be explained by adaptation.  For example, the proposed link between climate and tone languages could never have been investigated without massive global databases.  However, there is not much discussion of the overall approach to research in this area.

This week I published a paper in a special issue on the Adaptive Value of Languages, outlining a maximum robustness approach to these problems.  I then apply this approach to the debate about the link between tone and climate.

In a nutshell, I suggest that research should be:

  • Robust: Instead of aiming for the single most valid test of a hypothesis, we should consider as many sources of data and as many processes as possible.  Agreement between them supports a theory, but differences can also highlight which parts of a theory are weak.

  • Causal: Researchers should be more explicit about the causal effects in their hypotheses.  Formal tools from causal graph theory can help formulate tests, recognise weaknesses and avoid talking past each other.

  • Incremental: Realistically, a single paper can’t be the final word on a topic, and shouldn’t aim to be.  Statistical studies of large-scale, cross-cultural data are very complicated, and we should expect establishing causality to take many small steps.

I apply these ideas to the debate about tone and climate.  Caleb Everett also published a paper in this issue showing that speakers in drier regions use vowels less frequently in their basic vocabulary. I test whether the original link with tone and the new link with vowels hold up when using different data sources and different statistical frameworks.  The correlation with tone is not robust, while the correlation with vowels seems more promising.

I then suggest some ideas for alternative methodological approaches to this theory that could be tested.  For example:

  • An iterated artificial learning experiment
  • A phonetic study of vowel systems
  • A historical case-study of 5 Bantu languages
  • A corpus study of tone use in Cantonese and conversational repair in Mandarin
  • A corpus study of Larry King’s speech


Resister: A sci-fi sequel about cultural evolution and academic funding

A replicated typo - Sun, 02/25/2018 - 17:27

In 2016, Casey Hattrey combined literary genres that had long been kept far apart from each other: science fiction, academic funding applications and cultural evolution theory. Space Funding Crisis I: Persister was a story that tried to “put the fun in academic funding application and the itch in hyper-niche”. It was criticised as “unrealistic and too centered on academics to be believable” and “not a very good book”. Dan Dediu’s advice was “better not even start reading it,” and Fiona Jordan’s review was literally a four-letter word. Still, that hasn’t stopped Hattrey from writing the sequel that the title of the first book tried to warn us about.

The badly conceived artwork for Resister

Space Funding Crisis II: Resister continues to follow the career of space linguist Karen Arianne. Just when she thought she’d gotten out of academia, the shadowy Central Academic Funding Council Administration pulls her back in for one more job. Or at least a part-time post-doc. Her mission: solve the mystery of the great convergence. Over thousands of years of space-faring, human linguistic diversity has exploded, but suddenly people have started speaking the same language. What could have caused this sinister twist? Who are the Panini Press? And what exactly is research insurance? Arianne’s latest adventure sees her struggle against ‘splainer bots, the conference mafia and her own inability to think about the future.

To say that this was the “difficult second book” would give too much credit to the first.  Hattrey seems to have learned nothing about writing or science since the last time they ventured into the weird world of self-published online novels. The characters have no distinct voices, the plot doesn’t make much sense, and there are eye-watering levels of exposition.  The appendix even includes an R script that supports some of the book’s predictions, and it too is badly written.  Some of the apparently over-the-top futuristic ideas, like insurance for research hypotheses, are actually a bit behind existing proposals such as using prediction markets to assess replicability.

If there is a theme between the poorly formatted pages, then it’s emergence: complex patterns arising from simple rules. Arianne has a kind of spiritual belief in just reacting, Braitenberg-like, to the here-and-now rather than planning ahead. Apparently Hattrey intends this as a criticism of the pressures of early-career academic life.  But it never really materialises out of the bland dialogue and the insistence on putting lasers everywhere.

Still, where else are you going to find a book that makes fun of the slow science movement, generative linguistics and theories linking the emergence of tone systems to the climate?

Resister is available for free in various formats, including kindle, iPad and nook. The prequel, Persister, is also available (epub, kindle, iPad, nook).


CfP: Experimental approaches to iconicity in language

A replicated typo - Thu, 02/22/2018 - 15:30

Submissions are being sought for a special issue of Language and Cognition on Experimental approaches to iconicity in language. We welcome submissions related to any aspect of the many forms and functions of iconicity in natural language (see below). Papers may feature new experimental findings, or may present novel theoretical syntheses of experimental work on iconicity in language. Manuscripts should be a maximum of 8,000 words, with shorter submissions preferred.

Many researchers in language and cognition now recognize that iconicity – resemblance between form and meaning – is a fundamental feature of human languages, spoken and signed alike (Nuckolls, 1999; Taub, 2001; Perniss, Thompson, & Vigliocco, 2010; Dingemanse et al., 2015; Perry, Perlman, & Lupyan, 2015; Ortega, 2017). Iconicity is found across all levels of linguistic structure, spanning discourse, grammar, morphology, lexicon, phonology and phonetics, and even orthography. It is found in the prosody of speech and sign and in the gestures that accompany linguistic behaviour.

While experimental research on iconicity in speech has long favoured the study of pseudowords like bouba and kiki, a growing body of work shows that iconicity plays an active role in a number of basic language processes, cutting across cognition, development, and cultural and biological evolution. The special issue aims to feature some of the most exciting new experimental research on the many forms, functions, and timescales of iconicity in human language.

Special issue editors
Marcus Perlman, University of Birmingham
Pamela Perniss, University of Brighton
Mark Dingemanse, Max Planck Institute for Psycholinguistics

Dingemanse, M., Blasi, D.E., Lupyan, G., Christiansen, M.H., & Monaghan, P. (2015). Arbitrariness, iconicity, and systematicity in language. Trends in Cognitive Sciences, 19, 603-615.
Nuckolls, J.B. (1999). The case for sound symbolism. Annual Review of Anthropology, 28, 255-282.
Ortega, G. (2017). Iconicity and sign lexical acquisition: A review. Frontiers in Psychology, 8. https://doi.org/10.3389/fpsyg.2017.01280.
Perniss, P., Thompson, R.L., & Vigliocco, G. (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology, 1, 227.
Perry, L.K., Perlman, M. & Lupyan, G. (2015). Iconicity in English and Spanish and its relation to lexical category and age of acquisition. PLoS ONE, 10, e0137147.
Taub, S. (2001). Language from the body: Iconicity and metaphor in American Sign Language. Cambridge: Cambridge University Press.

How to submit. If you would like to contribute, please email us an 800-1000-word abstract by 1st April, 2018. Abstracts should be sent to Marcus Perlman (m.perlman@bham.ac.uk). We will return a decision on your abstract by 15th April, and first submissions will be due on 15th August. Manuscripts will be submitted through the Language and Cognition submission interface. We aim to put out the complete issue by the beginning of 2019. Notably, submissions that move through the process faster can appear online first.

CfP: Measuring Language Complexity at EvoLang

A replicated typo - Wed, 02/14/2018 - 20:15

This is a guest post from Aleksandrs Berdicevskis about the workshop Measuring Language Complexity.

A lot of evolutionary talks and papers nowadays touch upon language complexity (at least nine papers did so at Evolang 2016). One reason is probably that complexity is a very convenient testbed for hypotheses that posit causal links between linguistic structure and extra-linguistic factors. Do factors such as population size, social network structure, or the proportion of non-native speakers shape language change, making certain structures (for instance, morphologically simpler ones) more evolutionarily advantageous and thus more likely? Or don’t they? If they do, how exactly?

Recently, quite a lot has been published on this topic, including attempts at rigorous quantitative tests of the existing hypotheses. One problem all such attempts face is that complexity can be understood in many different ways, and operationalized in yet more. Unsurprisingly, the outcome of a quantitative study depends on which measure you choose! Unfortunately, there is currently little consensus about how the measures themselves can be evaluated and compared.
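The point that results hinge on the chosen operationalization is easy to demonstrate. The sketch below uses two deliberately simple, illustrative corpus-based measures (not ones prescribed by the shared task): type-token ratio and unigram entropy. On the same pair of toy samples they rank the "more complex" sample differently.

```python
import math
from collections import Counter

def type_token_ratio(words):
    """Lexical diversity: distinct word forms per token."""
    return len(set(words)) / len(words)

def unigram_entropy(words):
    """Shannon entropy (in bits) of the word-frequency distribution."""
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy samples, purely for illustration.
sample_a = "the cat saw the cat and the dog".split()
sample_b = "cats see dogs daily".split()

# Sample A scores lower on type-token ratio but higher on entropy,
# so the two measures disagree about which sample is "more complex".
print(type_token_ratio(sample_a), type_token_ratio(sample_b))
print(unigram_entropy(sample_a), unigram_entropy(sample_b))
```

Real operationalizations of morphological or structural complexity are far more sophisticated, but the same problem applies: without an agreed way to evaluate the measures themselves, the choice of measure can drive the conclusion.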

To overcome this, we are organizing a shared task, “Measuring Language Complexity”, a satellite event of Evolang 2018 taking place in Torun on April 15. Shared tasks are widely used in computational linguistics, and we strongly believe they can prove useful in evolutionary linguistics too. The task is to measure the linguistic complexity of a predefined set of 37 language varieties belonging to 7 families, and then to discuss the results, and their mutual agreement or disagreement, at the workshop. See the detailed CfP and other details here.

So far, the interest from the evolutionary community has been rather weak. But there is still time! We extended the deadline until February 28 and are looking forward to receiving your submissions!
