Why does linguistic structure exist?

Why does linguistic structure exist? The common-sense view of language is that it is an adaptation for communication, yet the crazy nested structures we see in language (phrases inside of phrases) don't seem necessary to communicate well: just look at Morse code! We could define an alternative, nested Morse code that had to be parsed with a context-free grammar to decode the message, but this seems not only unnecessary but wasteful. The Chomskyans offer an alternative view: they hold that language is not about communication but about thought, and attribute the nested structure to the inherently hierarchical and recursive nature of thought.

In this post, I'm going to discuss an alternative resolution to this paradox that I've been playing around with. It started developing while I was writing a paper with Sharon Goldwater (which will appear soon in the Journal of Memory and Language). Briefly, several people have been exploring the proposal that human speech is adapted for efficient communication, where efficiency is defined by information theory. Information theory sets certain limits on how quickly information can be transferred, and part of our motivation for the paper was the realization that, for natural language, there is a conflict between incremental speech (producing and comprehending speech one word or unit at a time) and information-theoretic efficiency.

The details of this realization were beyond the scope of that paper, but in this post I want to describe this conflict and explore some future directions that capitalize on it directly.

Language model fail

I'm just back in Scotland after two weeks in the US, and was getting on a bus into the city center. I stopped and asked the bus driver the ticket price, and heard "2.10." I said "OK" and got my wallet out to pay. He looked at me like I was an idiot and said forcefully, "sit down!" Confused, I went and sat down. After we arrived I asked if I should pay, and he looked at me like I was an idiot (again) and said that the bus is paid for by two other, bigger bus companies as a connection, and is free to riders. My expectation of a price was so strong that I must have heard "sit down" as "2.10." Possibly I was also not quite acclimated back to the language variety on this side of the pond.

Quick update

I haven't forgotten about this blog! In the last post, I promised to describe basic computational modelling approaches and what they can tell us about cognition. I've drafted that post a few times, but haven't been satisfied with it. So I'm going to adopt a different strategy and spend more than one post on each computational modelling approach. I want to present background on each approach that is technical enough to show what's going on but not so technical as to dominate readers' time. This involves drawing lots of pictures, which is nice but slow. Then, I will talk about a specific application of the approach. So the posts are coming! And currently half-baked (or one-fifth-baked?) on my hard drive :p

First post

I'm currently working on a PhD in Informatics at the University of Edinburgh. I'm interested in human languages, especially from a computational perspective. A lot of posts here will talk about language. I'm hoping for this to be a place for me to step away from the details of my work and work through some of the more "big picture" aspects of computational psycholinguistics and computational cognitive science.

I'm also a free and open source software enthusiast, so there will probably be some posts about cool tricks I've learned. I use Arch Linux on my laptop and desktop, and CyanogenMod 9 on my tablet.

I expect to stay away from politics unless I have something really great and novel to say.

Well, that's all for now.