Tuesday, July 17, 2007

NL = SPL (cont)

What are nouns? The grade school answer is a "person, place or thing". However, deeper analysis shows that nouns are quite a bit harder to pin down than you might expect. There are several ways to categorize nouns. For our purposes, the distinction between concrete and abstract nouns is the most important. The boundary between the two is fuzzy, but abstract nouns are clearly qualitatively different in cognitive processing even if they are grammatically similar. In what follows I am only considering concrete nouns.

Programmers model nouns as objects (bundles of property values and functions for manipulating them). That's all well and good for programming, but it falls short as a basis for intelligence and understanding. I see nouns as points or, more often, regions in a many-dimensional space. How many dimensions? It depends, but probably thousands. However, most nouns are not atomic, so even with all those dimensions a single region of semantic space won't do. Most nouns are composed of other nouns. For example:

car(wheels(tire, rims), engine(bunch of stuff...), chassis(...), ...)

The mind is very fluid at traversing these has-a hierarchies, so it is likely that there is some first-class neural machinery for dealing with has-a relationships. This is in some contrast to is-a relationships, which are a bit more slippery and thus are more likely a product of reasoning than direct encoding. But I digress.
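To make the has-a idea concrete, here is a minimal sketch of the car example as a tree that can be walked. The Noun class and its has_a and path_to methods are names I made up for illustration, and the engine and chassis are left without parts because their contents are elided above.

```python
from dataclasses import dataclass, field


@dataclass
class Noun:
    """A noun plus the nouns it is composed of (hypothetical structure)."""
    name: str
    parts: list = field(default_factory=list)

    def has_a(self, part_name):
        """True if part_name appears anywhere below this noun."""
        return any(p.name == part_name or p.has_a(part_name) for p in self.parts)

    def path_to(self, part_name):
        """Return the chain of has-a links leading to part_name, or None."""
        if self.name == part_name:
            return [self.name]
        for p in self.parts:
            sub = p.path_to(part_name)
            if sub is not None:
                return [self.name] + sub
        return None


# Mirrors car(wheels(tire, rims), engine(...), chassis(...), ...).
# The elided parts are deliberately left empty rather than guessed.
car = Noun("car", [
    Noun("wheels", [Noun("tire"), Noun("rims")]),
    Noun("engine"),
    Noun("chassis"),
])

print(car.has_a("tire"))
print(car.path_to("tire"))
```

Asking for the path to "tire" walks car, then wheels, then tire, which is the kind of traversal the mind seems to do effortlessly.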

The point is that nouns can be modeled as a collection of points in a multidimensional space. I call these entities Semantic Vectors (and, yes, I am playing fast and loose with the mathematical meaning of vector). This is quite different from the models one traditionally finds in AI. In particular, it is qualitatively different from Newell and Simon's physical symbol system hypothesis, which is the foundation of most work in AI (although, in the sense that all computation is symbol processing - lambda calculus and all that - they are the same).
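As a rough illustration, a region can be written as a set of intervals, one per dimension it constrains, with every other dimension left unspecified. This is only a sketch under my own assumptions: the dimension names (length_m, wheel_count, hue_deg), the numbers, and the contains helper are all invented for the example and come from no real model.

```python
def contains(region, point):
    """True if point lies inside region on every dimension the region constrains.

    region: dimension name -> (low, high) interval
    point:  dimension name -> value
    Dimensions missing from region are unspecified and accept any value.
    """
    return all(
        dim in point and low <= point[dim] <= high
        for dim, (low, high) in region.items()
    )


# Hypothetical region for a generic <car>: hue is not constrained at all.
car_region = {"length_m": (3.0, 6.0), "wheel_count": (4, 4)}

# A composite noun is then a collection of regions, one per component noun.
car_vector = {
    "car": car_region,
    "wheels": {"diameter_m": (0.4, 0.9)},
}

my_car = {"length_m": 4.5, "wheel_count": 4, "hue_deg": 0.0}
bicycle = {"length_m": 1.8, "wheel_count": 2}

print(contains(car_vector["car"], my_car))
print(contains(car_vector["car"], bicycle))
```

A real system would need thousands of dimensions and fuzzier boundaries than hard intervals, but the shape of the idea is the same.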

Now what about adjectives, verbs and adverbs? These are functions. That is, they are functions which operate on semantic vectors to yield new semantic vectors. "A red car drove to New York" is a program in a high-level language. When it is compiled it starts with a semantic vector for a generic <car>, in particular, one whose values in the hue and position dimensions are unspecified. It applies the function red to that vector to yield a new vector <red car>. It then constructs a vector for a well-known place, <New York>. It executes the function drive(...), which takes these vectors as arguments and produces more vectors, most obviously one that represents a car in the region defined by New York. Less obviously, this program created a vector for a person, a kind of default value for the function drive in the context of a vehicle. So, after execution of the program, the system would conclude that there was at least one person who started out wherever the car was and ended up in New York. An efficient system would only run such a simulation at a coarse degree of fidelity until the task at hand demanded otherwise. So, for example, gas consumption and wheel wear would not be modeled unless the system was asked questions in that regard. If it were, those questions would similarly be executed and provide the additional vectors and functions that should enter into the simulation to yield an answer.
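Here is a toy sketch of that sentence run as a program. Vectors are plain dictionaries with None standing for an unspecified dimension. The key names (kind, hue, position, origin) and the exact signatures of red and drive are my own assumptions for illustration, and the simulation is deliberately coarse: no gas, no wheel wear.

```python
def red(vector):
    """Adjective: return a new vector with the hue dimension filled in."""
    return {**vector, "hue": "red"}


def drive(vehicle, destination):
    """Verb: move the vehicle to the destination.

    Also produces a default person, since driving a vehicle implies a driver
    who starts wherever the vehicle was and ends up at the destination.
    """
    moved = {**vehicle, "position": destination["name"]}
    driver = {
        "kind": "person",
        "origin": vehicle["position"],  # unknown here, so it stays None
        "position": destination["name"],
    }
    return moved, driver


# Generic <car>: hue and position are unspecified.
car = {"kind": "car", "hue": None, "position": None}
new_york = {"kind": "place", "name": "New York"}

# "A red car drove to New York"
red_car = red(car)
arrived_car, driver = drive(red_car, new_york)

print(arrived_car)
print(driver)
```

Each function returns a new vector instead of changing its input, so the generic <car> is still around, unmodified, for the next sentence that needs it.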

This is obviously a very terse description of a model for language processing. A blog is clearly not the forum for conveying huge volumes of substance! However, I hope it gives you a sense of what I have in mind when I claim Natural Languages are Semantic Programming Languages. This is a topic I certainly will revisit quite a bit in future posts.
