Dr. Harald Baayen from the University of Tübingen visited BYU to share his work in implicit morphology, encouraging his colleagues to find better ways to predict language meaning.
PROVO, Utah (Jan. 22, 2016)—“John kicked the bucket.” Say that to any native English speaker and they’ll get your meaning: John died. The idiom is widely understood, even though none of the individual words bears any connection to death. Comprehension requires prior exposure to the idiom in context rather than knowledge of the individual words’ definitions. Today, not only will your listeners understand your meaning, but so will the programs linguists use to study language and map meaning.
Of course, languages are enormous, with varied meanings that wrap around and contradict one another, so linguists rely heavily on cues to indicate where to move next. Dr. Harald Baayen of the University of Tübingen is one linguist working on better ways to sort through the thousands of cues – each with thousands of possible outcomes – linguists encounter when working with big data.
“Can you rig up a system that starts to give proper predictions depending on support for the different meanings, given very simple aspects of the words?” Baayen asked a group of BYU linguistics professors and students. He was referring to the standard approaches in linguistics, in which words and morphemes are the atomic building blocks of meaning.
But Baayen believes the current practice is too limiting. He explained, “If you take this setup as your axiom of language, then you deny yourself access to lots of regularities that concern sub-word and sub-morphemic features in the language that pattern together across words and morphemes.” Once we start looking at sub-word features in sentences, he argues, there is sufficient systematicity to straightforwardly predict that “kick the bucket” means “die” but “kick the ball” means that a ball is being kicked.
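To make “sub-word features” concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the cues are overlapping three-letter sequences with “#” marking word boundaries; the article does not specify which features Baayen's system uses, and the function name is hypothetical.

```python
def letter_trigrams(phrase):
    """Return the set of overlapping three-character cues in a phrase.

    Assumption for illustration: '#' stands in for word boundaries.
    """
    padded = "#" + phrase.lower().replace(" ", "#") + "#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


bucket = letter_trigrams("kick the bucket")
ball = letter_trigrams("kick the ball")

shared = bucket & ball        # cues that cannot tell the two phrases apart
only_bucket = bucket - ball   # cues available to signal the idiom
only_ball = ball - bucket     # cues available to signal the literal reading

print(sorted(shared))
print(sorted(only_bucket))
print(sorted(only_ball))
```

The two phrases share every cue drawn from “kick the” and differ only in the cues drawn from the final word. A learning system would have to work out from experience which of those cues, in which combinations, support which meaning.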
But keeping track of sub-word co-occurrences is not all of the story: The details of error-driven learning are crucial for getting it right. Baayen used Pavlov’s dog to illustrate his point: Whenever Pavlov fed his dog, he would ring a bell. Eventually, the dog came to associate the ringing bell with being fed, to the point that Pavlov could ring a bell and the dog would start drooling. The experiment was later taken a step further: the researcher flashed a light while ringing the bell, theorizing that eventually the light alone would be enough to cause the dog to drool. The prediction, however, proved false. The dog had already learned the bell as a perfect cue for food, so the light added no new information and the dog learned almost nothing about it.
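The bell-and-light result can be reproduced with a few lines of Python. The sketch below assumes the Rescorla-Wagner update rule, a standard textbook formalization of error-driven learning; the article does not say which model Baayen presented, and the learning rate and trial counts are arbitrary illustrative values.

```python
def rescorla_wagner(trials, rate=0.2, max_strength=1.0):
    """Return a dict of cue -> association strength after all trials.

    Each trial is a pair (cues_present, food_delivered).
    """
    weights = {}
    for cues, food in trials:
        # The prediction is the summed strength of every cue present.
        prediction = sum(weights.get(cue, 0.0) for cue in cues)
        target = max_strength if food else 0.0
        error = target - prediction  # learning is driven by this surprise
        for cue in cues:
            weights[cue] = weights.get(cue, 0.0) + rate * error
    return weights


bell_only = [({"bell"}, True)] * 50            # phase 1: bell, then food
bell_and_light = [({"bell", "light"}, True)] * 50  # phase 2: both, then food

blocked = rescorla_wagner(bell_only + bell_and_light)
control = rescorla_wagner(bell_and_light)

print(blocked)  # the light stays near zero because the bell left no error
print(control)  # without phase 1, bell and light split the credit
```

When the bell is learned first, the prediction error in the second phase is already close to zero, so the light gains almost no strength. When both cues are introduced together from the start, each ends up with roughly half of the association.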
Baayen explained, “One of the consequences of learning theory is that as experience with language accumulates over time, both for individual speakers and for societies, speaking and understanding become more complex and require more processing time.” When we find ourselves struggling to retrieve words or names as we get older, this does not indicate cognitive decline; it is evidence of what Baayen calls the Ecclesiastes principle: “For in much wisdom is much grief: and he that increaseth knowledge increaseth sorrow” (Ecclesiastes 1:18). In other words, name-finding problems arise because we know too much.
—Samuel Wright (B.A. American Studies ’16)