Monday, June 11, 2007

The role of Archetypes in Semantic modeling

In a previous post I introduced the notion of Semantic Vectors. These are vectors (in the sense of the mathematical notion of a Vector Space) that can be used to model knowledge about the world. It is not yet clear to me how vectors, in and of themselves, can model much of what needs to be modeled in a knowledge-based system (at least without complicating the notion of a vector space so much that it only vaguely resembles its mathematical counterpart). This post is about one aspect of this challenge that I have begun working on in earnest. I have some hope that this challenge can be met by the model.

Imagine, if you will, a rock. If my notion of a semantic vector space has any value at all, it should be able to model knowledge about a rock. Presumably a rock would be modeled as a vector with explicit dimensions such as mass, density, hardness, etc. At the moment, it is not my intent to propose a specific set of dimensions sufficient to model something like a rock, so this is only meant to give you a rough idea of the vector concept.
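To make the rough idea slightly more concrete, here is a minimal Python sketch of a rock as a point in a small vector space. The dimension names, units, and values are assumptions made up for illustration; the post does not commit to any particular set. The helper names (make_vector, add, scale, distance) are likewise hypothetical.

```python
import math

# Hypothetical dimensions and units, chosen only for illustration.
DIMENSIONS = ("mass_kg", "density_g_cm3", "hardness_mohs")

def make_vector(**values):
    """Build a vector as a tuple of floats, ordered by DIMENSIONS."""
    return tuple(float(values[name]) for name in DIMENSIONS)

def add(a, b):
    """Component-wise vector addition."""
    return tuple(x + y for x, y in zip(a, b))

def scale(k, a):
    """Multiply every component of a by the scalar k."""
    return tuple(k * x for x in a)

def distance(a, b):
    """Plain Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two particular rocks that differ only in hardness.
rock23 = make_vector(mass_kg=2.0, density_g_cm3=2.5, hardness_mohs=6.0)
rock24 = make_vector(mass_kg=2.0, density_g_cm3=2.5, hardness_mohs=3.0)

print(distance(rock23, rock24))  # 3.0
```

Note that an unscaled Euclidean distance mixes units (kilograms with Mohs hardness), so a real semantic space would presumably need some normalization of its dimensions. That question is left open here.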

When I asked you to imagine a rock, which particular rock did you imagine?

Was it this one?

Or this one?

Chances are you had a much vaguer idea in mind. Although the idea you had was vague, it was probably not the idea of "Mount Everest" or "the tiniest pebble", even though these have something rocky about them.

A system that purports to model knowledge of specific things must also model knowledge of general things. In fact, most truly intelligent behavior manifests itself as the fluid way we humans can deal with the general.

I use the term archetype to denote what must exist in a semantic model for it to effectively deal with generality.

At the moment, I will not be very specific about what archetypes are but rather I will talk about what they must do.

An archetype must place constraints on what must be the case for an x to be an instance of an archetype X. In other words, if you offer rock23 as an instance of archetype ROCK, there should be a well-defined matching process that determines whether this is the case.

An archetype must allow you to instantiate an instance of itself. Thus the ROCK archetype acts as a kind of factory for particular rocks that the system is able to conceive.

An archetype must specify which semantic dimensions are immutable, which are somewhat constrained, and which are totally free. For instance, a rock is rigid, so although rocks can come in many shapes, once a particular rock is instantiated it will not typically distort without breaking into smaller pieces (let's ignore what might happen under extreme pressure or temperature for the moment). In contrast, the archetype for rock would not constrain where the rock can be located. I can imagine few places that you can put a rock where it would cease to be a rock (again, let's ignore places like inside a volcano or a black hole, for now).
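One naive way to picture this requirement is a side table that tags each dimension with its flexibility. The sketch below is only that: the names Flex, ROCK_FLEX, ROCK_RANGES, and change_allowed are hypothetical, the mass range is an invented number, and treating mass as the "somewhat constrained" dimension is an assumption, since the post gives examples only for the immutable (shape) and free (location) cases.

```python
from enum import Enum

class Flex(Enum):
    IMMUTABLE = "immutable"      # fixed once the instance exists
    CONSTRAINED = "constrained"  # may change, but only within a range
    FREE = "free"                # may take any value

# Hypothetical flexibility table for the ROCK archetype.
ROCK_FLEX = {
    "shape": Flex.IMMUTABLE,
    "mass_kg": Flex.CONSTRAINED,
    "location": Flex.FREE,
}

# Hypothetical allowed range for each constrained dimension.
ROCK_RANGES = {"mass_kg": (0.01, 1000.0)}

def change_allowed(dimension, old, new):
    """Return True if an instance may move from old to new on this dimension."""
    flex = ROCK_FLEX[dimension]
    if flex is Flex.FREE:
        return True
    if flex is Flex.IMMUTABLE:
        return new == old
    low, high = ROCK_RANGES[dimension]
    return low <= new <= high

print(change_allowed("location", (0, 0), (5, 5)))  # True
print(change_allowed("shape", "round", "flat"))    # False
print(change_allowed("mass_kg", 2.0, 5000.0))      # False
```

A table like this sits outside the vector itself, so it is an add-on to the vector model rather than something the vectors express on their own.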

An archetype must model probabilities, at least in a relative sort of way. For example, there should be a notion that a perfectly uniform fire-engine-red rock is less likely than a grayish-blackish-greenish rock with tiny silverish specks.

Archetypes also overlap. A BOULDER archetype overlaps a ROCK archetype, and the system should know that a rock becomes a boulder by the application of the adjective BIG.

An intelligent entity must be able to reason about particular things and general classes of things. It would be rather odd and awkward, in my opinion, if the system had distinctly different ways to deal with specific things and general things. It would be far better if the system had a single universal representation for both. Certainly, the fluid way in which humans can switch back and forth between the general and the specific lends credence to the existence of a uniform representational system. If a knowledge representation proposal (like my semantic vector concept) fails to deliver these characteristics, then it should be viewed as implausible.

I am only just beginning to think in earnest about how the vector model can deal with archetypes. I have some hope but nothing that I am willing to commit to at the moment. Presently I am working with the idea that an archetype is nothing more than a set of pairs consisting of a vector and a weight. The vector provides an exemplar of an element of the archetype and the weight provides some information as to its likelihood. The nice thing about vectors is that, given two of them, a new vector can be produced that lies in the middle. Hence the vector model provides a way of fleshing out a sparsely populated archetype. Further, membership in the archetype can be tested using the distance metric of the semantic space.
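The working idea above can be sketched in a few lines of Python. This is a rough illustration under several assumptions that the post does not make: the class name Archetype, the fixed membership radius, Euclidean distance as the metric, averaging the two parent weights for a midpoint exemplar, and weighted random choice as the way to instantiate. The three-component vectors are arbitrary numbers.

```python
import math
import random

def distance(a, b):
    """Euclidean distance, standing in for the space's metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def midpoint(a, b):
    """The vector that lies halfway between a and b."""
    return tuple((x + y) / 2 for x, y in zip(a, b))

class Archetype:
    def __init__(self, name, exemplars, radius):
        self.name = name
        self.exemplars = list(exemplars)  # list of (vector, weight) pairs
        self.radius = radius              # hypothetical membership threshold

    def matches(self, vector):
        """Membership test: is the vector close enough to any exemplar?"""
        return any(distance(vector, ex) <= self.radius
                   for ex, _ in self.exemplars)

    def flesh_out(self):
        """Add the midpoint of every pair of current exemplars.
        Averaging the parent weights is an assumption of this sketch."""
        added = []
        for i, (va, wa) in enumerate(self.exemplars):
            for vb, wb in self.exemplars[i + 1:]:
                added.append((midpoint(va, vb), (wa + wb) / 2))
        self.exemplars.extend(added)

    def instantiate(self, rng):
        """Act as a factory: pick an exemplar, favoring higher weights."""
        vectors = [v for v, _ in self.exemplars]
        weights = [w for _, w in self.exemplars]
        return rng.choices(vectors, weights=weights, k=1)[0]

ROCK = Archetype("ROCK",
                 [((1.0, 2.5, 5.0), 0.75), ((3.0, 3.5, 7.0), 0.25)],
                 radius=0.5)
candidate = (2.0, 3.0, 6.0)

before = ROCK.matches(candidate)  # False: 1.5 away from both exemplars
ROCK.flesh_out()                  # adds the midpoint (2.0, 3.0, 6.0)
after = ROCK.matches(candidate)   # True: now sits on an exemplar
instance = ROCK.instantiate(random.Random(0))
```

Filling in midpoints implicitly assumes that everything lying between two exemplars also belongs to the archetype, which may not hold for every archetype.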

The limitation of this approach has to do with the notion of the flexibility of various dimensions that I mentioned above. It would seem that the vector model, in and of itself, does not have an obvious way to represent such constraints. Perhaps this simply means that the model must be expanded, but expansion always leads to complexity, and a semantic modeler should always prefer economy. There is some hope for a solution here. The basic idea is to provide a means by which constraints can be implied by the vectors themselves, but elaboration of this idea will have to wait for a future post.
