9 votes

Community size matters when people create a new language

1 comment

  1. imperialismus
    Language complexity is a very difficult and controversial subject. Up until a hundred years ago, it was generally accepted that some languages were inherently more complex, with the ideals being the classical languages of higher learning, Latin and Greek. However, this attitude was heavily influenced by prevailing racist theories that divided humanity into "primitive" and "elevated" races. For most of the latter half of the 20th century, a contrary attitude prevailed: all languages were equally complex, because they were all equally capable of communicating anything their speech communities might want to say or think about. It's an observable fact that many languages that look "simple" in one respect turn out to be highly complex in others. There is no consensus general measure of linguistic complexity; at best, we can try to quantify one aspect, such as morphology, but then we might find additional complexity in other areas, such as syntax or phonology or semantics.

    In the past two decades, this attitude has been challenged as being an overly PC reaction to previous racist ideas. John McWhorter has been one of the most ardent critics of the idea that all languages are equally complex, drawing on his work on creoles. However, his ideas are not universally accepted, and it remains all too easy to fall into prejudice when speaking of "simple" and "complex" languages, since that often implies simple and complex thought. Efforts have been made to formalize linguistic complexity, and probably the most rigorous such attempts have been rooted in information theory, such as Kolmogorov complexity. But Kolmogorov complexity is a highly abstract concept (roughly, the length of the shortest program that can reproduce a given sequence, which is related to the limits of lossless compression). It doesn't line up well at all with intuitive ideas about complexity, such as that Latin has a more complex grammar because it has more cases than English. Also note that a formal "grammar" will usually deal with phonology, morphology, syntax, semantics and pragmatics, but when laymen speak about "grammar" they mean specifically morphology (inflections, cases, verb forms, and such). There's no generally accepted method of synthesizing and quantifying all these disparate areas of language into some kind of overall "complexity score".
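
    To make the compression link a bit more concrete, here's a rough Python sketch. True Kolmogorov complexity can't be computed, so this uses the zlib-compressed size of a text as a stand-in, which only gives an upper bound. The function names and the two sample strings are made up for illustration; they don't come from the article or from any particular study.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 text.

    This is only an upper-bound proxy: a cleverer compressor could do better,
    and the true shortest description is not computable.
    """
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower means more redundancy."""
    raw = len(text.encode("utf-8"))
    return compressed_size(text) / raw if raw else 0.0


# Hypothetical samples: a highly regular string and a pseudo-random one
# of the same length (fixed seed so the run is repeatable).
repetitive = "ab" * 500
rng = random.Random(0)
varied = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz ") for _ in range(1000))

print(compressed_size(repetitive), compressed_size(varied))
```

    The regular string compresses to far fewer bytes than the pseudo-random one, which is all this kind of measure sees. It says nothing about whether a case system or a tone system is "harder", and that's the mismatch with intuitive notions of complexity.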

    John McWhorter likes to say that there is no known mechanism that could explain a hypothesis that less complexity in one area of language inevitably balances out with more complexity in another. The counter to this would be that all languages are required to efficiently convey the breadth and depth of human thought. Although there is no guarantee that there will be some function OverallComplexity(L) that is conserved across linguistic evolution, it's an observable fact that as previous, useful distinctions erode, new strategies need to be developed to convey the same meaning. See, for instance, the redevelopment of a second-person plural in English (y'all, you guys, youse, etc). And without an agreed-upon definition of OverallComplexity(L), it's kind of hard to measure anyway.

    Another problem is that this idea that a more widely spoken language is less complex ignores a different issue: Language is not unitary. English spoken in Northern Scotland is not equivalent to English spoken in Quebec or South Africa or India. The common subset of these disparate varieties of the same language will naturally shrink, but complexity re-emerges when we examine specific, local varieties.

    English, for instance, has a "simple" morphology, yet a devilishly complex syntax. When you lose precision in one area of language, such as morphology, you compensate with additional precision in another, such as word order. Many languages with more complex morphology, such as Latin or Russian, display a higher degree of variability of word order than English, because some of the information carried by word order in English is conveyed by inflection in these languages. English also has an above-average number of phonetic vowels and two cross-linguistically uncommon consonants (the th sounds; many foreign speakers, and even some native speakers, have "tree" or "free" for "three"). It has a more complex relation between spelling and phonetics than many other languages. And, like I said, it has a deceptively complex syntax.

    Another example would be Chinese. Mandarin is the largest language in the world in terms of native speakers, and Chinese is one of the largest language families. Chinese has a morphology that is even simpler than that of English, but it also has a complex tone system (four tones in Mandarin, six in Cantonese, seven or eight in some regional varieties), and it has one of the most complex writing systems in the world. These facts do not jibe well with the idea that as a language expands its influence, it becomes simpler.

    This experiment is quite far removed from the actual lived experience of language. Fortunately, we do have examples of ex nihilo language creation observed in the wild: sign languages. And experience shows that they go from nothing to the full complexity of a language in about three generations.

    Whenever someone makes a claim that a language is unable to express something that is seen as universal, such as recursion, it generates considerable controversy. See the Pirahã debate.

    All in all, this article could have done a better job of explaining the issues surrounding measures of linguistic complexity.

    7 votes