15 November 2019

Speaking Rate and Information Revisited

Welcome back. Some people speak faster than others, right? But as the study I blogged about a couple of years ago found, regardless of how fast people speak, they convey about the same amount of information in a given period of time (Speaking Rate and Information).

To reach that conclusion, the Brown University researcher analyzed some 2,400 two-sided telephone conversations among 543 speakers and interviews with 40 speakers. He estimated information rate from two linguistic criteria, lexical (dictionary definition) and structural (syntax). The speakers were from across the U.S., and all conversations were in English.

Being worldly wise and intrigued by languages and linguistics, you of course wonder: Do speakers of other languages also convey the same amount of information in a given period of time? 

Take Spanish. Even if you don't speak Spanish, you’ve probably heard it spoken. Do Spanish speakers convey information at the same rate as English speakers?

Language Information Rates

Well, even if you don’t wonder, a team of researchers affiliated with France’s University of Lyon, the University of Hong Kong, New Zealand’s University of Canterbury and South Korea’s Ajou University set out to learn the answer.

They gave 170 native speakers of 17 different languages (10 speakers per language) 15 semantically similar texts to read in their native language (Basque, Cantonese, Catalan, English, Finnish, French, German, Hungarian, Italian, Japanese, Korean, Mandarin, Serbian, Spanish, Thai, Turkish and Vietnamese). The speakers were instructed to familiarize themselves with the texts, then read them aloud at a comfortable pace with good pronunciation while they were recorded.

Through quantitative analysis, the researchers found that the speech rate (syllables per second) and the average information density of the syllables uttered differed considerably from language to language. Yet when the researchers combined the two properties, the information rates balanced out. Similar amounts of information were conveyed in a given period of time (about 39 bits/second, plus or minus 5 bits/second).

Languages such as Spanish had higher speech rates and lower information densities; Asian languages such as Vietnamese had slower speech rates and higher information densities.
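For readers who like to see the arithmetic, here is a minimal sketch of how the two properties combine, assuming information rate is simply speech rate multiplied by information density. The function name and the two sets of numbers are made up for illustration; they are round values chosen to show the trade-off, not data from the study.

```python
def information_rate(speech_rate, information_density):
    """Return bits per second, given syllables per second and bits per syllable."""
    return speech_rate * information_density


# Hypothetical round numbers for illustration only; NOT values from the study.
# Each pair is (syllables per second, bits per syllable).
examples = {
    "fast, low-density language": (8.0, 5.0),
    "slow, high-density language": (5.0, 8.0),
}

rates = {
    name: information_rate(sr, density)
    for name, (sr, density) in examples.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} bits/second")
```

With these invented numbers, both languages land on the same information rate even though one is spoken much faster than the other, which is the kind of balance the researchers reported.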

The graphed data are the average information density (ID) and corresponding speech rates (SR) for languages noted at top. (There is one value of ID per language and as many values of SR as texts read by individual speakers.) The relationship between SR and ID is represented by the yellow straight line (linear regression) and the black curved line (locally estimated scatterplot smoothing regression). Both show SR decreases with increasing ID (from advances.sciencemag.org/content/5/9/eaaw2594).

Wrap Up

The researchers’ goal was to characterize the baseline by analyzing controlled speech instead of speech in more casual, unpredictable settings. They expect, however, that the strength of their findings would decrease along a continuum from very carefully pronounced content to very informal interactions. For the latter, understanding is heavily reliant on contextual and pragmatic factors rather than the linguistic information itself.

I’m way out of my league, but a significant change of information rate with casual conversation doesn’t seem to jibe with the earlier study of English speakers, which did not control speech. People were found to converse within relatively narrow bounds of communication. The speakers either spoke quickly or provided high information content, but not both, possibly to avoid providing too much or too little information in a given period of time.

That seems reasonable for other languages, at least for most speakers. Anyway, thanks for stopping by.

Multi-language information rate study in Science Advances journal: advances.sciencemag.org/content/5/9/eaaw2594
Articles on study on EurekAlert! and Discover websites:

