Toward Making a Universally Intelligent Machine
S.Roof 09-27-02
Much of my thought of late has centered on the idea of artificial intelligence. What would the requirements of a universally intelligent machine be? Alan Turing came up with the idea that we now call the Universal Turing Machine (UTM). Such machines are capable of performing any computation and are all computationally equivalent to one another, regardless of mechanism, efficiency, and input/output methods. I would like to add to this the idea of a UIM, or Universally Intelligent Machine.
UTMs have one huge problem - the Halting Problem. No general procedure can decide, for every program and every input, whether the computation will ever halt or come to a conclusion. UTMs are very common affairs which we are all familiar with. Our PCs are all UTMs (up to the limits of their finite memory), and the halting problem should be familiar as well. Who has not had a program go into limbo? Associated with the halting problem is the problem of how to deal with errors, crashes, lockups, etc. This is a very pertinent problem these days with real-time, mission-critical systems that rely on sophisticated computations. The inevitable fix in all such situations is to rely on skilled human operators to oversee and intervene when necessary. What we really need for the next revolution in computing is universally intelligent machines that not only do complex computations but are also able to gracefully deal with contingencies, errors, and unknown inputs.
Here are some of the abilities I think a UIM should have:
1) Free will - It must be able to halt and desist at will and upon command. There must be a built-in command override for external guidance so that the machine never becomes closed off from outside direction.
2) Awareness - It must be able to assess its own internal states and apply corrective inputs if necessary.
3) Universality - It must be able to deal with any type of input whatever with some degree of intelligence. It must be able to generalize across differing domains of computation. The ability to work with unknown inputs without prior knowledge of their format or representative content is essential to the ability to generalize.
4) Parallel Functionality - It must be able to co-ordinate parallel inputs and internal processes and exhibit time-binding capability. Sequential processing has been shown capable of doing whatever a parallel computation is capable of given enough time, but therein lies the catch. The extra time required may be long enough for real-world inputs to change before the computation halts. The actual number of parallel processes needed may be negotiable within differing contexts. Massively parallel systems may not be needed except in certain situations.
5) Memory - It must be able to form a historical context of its own inputs and operations.
6) Projection - It must be able to speculate, make predictions, and reason with limited or insufficient input.
7) Graceful Failure - Errors, mistakes etc. should be allowed to occur gracefully without bringing the whole system down.
8) Scalability - A UIM should be able to easily expand or modify its abilities and resources in a way that enables adaptation to speed, efficiency, and/or size constraints. It should be able to tailor itself to deal with any class of problem as needed.
The first ability above - Free Will - is very important. It is the single most important difference between a UTM and a UIM. Robotics researchers tend to use the term autonomous to avoid the psychological associations, but I think it's time to embrace the psychological. Emotive potentials could inform and direct the exercise of free choice and allow for human control from outside. The halting problem could then be controlled by simple heuristics, and command pathways could be balanced against proclivities to engage in play or free choice behaviour. Such a system might even be interactive the way a child is, i.e., it tends to do what you say, although you might have to yell at it from time to time.
In other papers I presented the idea of using a simple random or complexity rule set to autonomously generate free choices when asked to do so. Awareness, or monitoring of internal states and emotive potentials, would be required to judge when such access is needed to override some computation and bring it to a halt.
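The monitoring idea above can be illustrated with a small sketch. It is not an implementation of the rule sets from my other papers; it only shows one simple heuristic (a step budget) and one command pathway (an external halt request) supervising a computation whose running time is hard to predict. The names supervised_run, halt_requested, and free_choice are hypothetical and chosen for illustration, and the Collatz iteration is just a stand-in for "some computation".

```python
import random
from typing import Callable, Iterator, Optional, Sequence, Tuple


def collatz(n: int) -> Iterator[int]:
    """Stand-in computation: its step count is hard to predict from n."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n


def supervised_run(
    steps: Iterator[int],
    max_steps: int,
    halt_requested: Callable[[], bool] = lambda: False,
) -> Tuple[str, Optional[int], int]:
    """Advance a computation one step at a time under a monitor.

    Before every step the monitor checks the external command pathway
    and then its own step budget (a simple heuristic, not a solution to
    the halting problem). Returns (status, last_value, steps_taken).
    """
    last: Optional[int] = None
    taken = 0
    while True:
        if halt_requested():
            return "halted_on_command", last, taken
        if taken >= max_steps:
            return "halted_by_budget", last, taken
        try:
            last = next(steps)
        except StopIteration:
            return "finished", last, taken
        taken += 1


def free_choice(options: Sequence[str], rng: random.Random) -> str:
    """Assumed 'simple random rule': pick what to do next after a halt."""
    return rng.choice(options)


if __name__ == "__main__":
    print(supervised_run(collatz(27), max_steps=1000))
    print(supervised_run(collatz(27), max_steps=50))
    print(free_choice(["retry", "simplify", "ask operator"], random.Random(0)))
```

The budget does not decide whether the computation would have halted; it only guarantees that the supervisor itself always returns, which is the property a UIM would need.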
Universality is a much more difficult problem. The rest of this paper will discuss this issue. The remaining abilities will be reserved for future discussions.
Computers are notoriously brittle if they don't get the right kind of inputs. You can't just willy-nilly expose a computer to raw inputs. Certain assumptions have to be made about the data coming in, its source, and its format. Movements of the mouse have fixed connotations totally different from those of the keyboard, for example. The coding system and byte boundaries have to be known. A single bit error can crash the entire system. All of this adds up to UTMs that are basically dumb deterministic machines. An intelligent machine has to do something fundamentally different.
I have shown with my work on generalized logic functions that functionality places googolplexity requirements on representational space. Direct computational lookup simply is not feasible except in the very simplest of cases. This means that functionality has to be truncated somehow if universality is to ever be feasible at all. Neural nets and similar approaches seek generalization via distributive computations, but in all cases the inputs still have to be strictly formatted. Indeed it is common when using these types of computations to massage the data prior to input. Even then, intensive training is required before such a system can classify inputs or linearly separate a particular problem space. To make things worse, such systems require extensive structural tinkering to adapt the system to particular problems.
A radically different approach appears to be needed. I think compression algorithms give a possible direction to look at. The goal of compression is merely to reduce the size of some chunk of information. We are not so interested in size here, but it does make one perk up and take notice when recalling what was said above about the representational requirements of functionality. If compression can relieve the representation problem a bit, that would be a boon, but of most interest here is how compression algorithms do their magic.
Compression always has a model for its magic. Various models are run-length encoding, Huffman frequency analysis, dictionary lookup, quantization, exponential squashing, etc. What is common to each method is that it exploits certain regularities in the data, which enables a computational reduction. A reducible computation replaces the expanded data set with a smaller set of codes. The resulting set or file then includes a string of reduced codes plus the meta-data required to reconstruct the original from them. Compression meta-data can thus be seen as data converted into information - information about the original data set. A simple measure of randomness is commonly expressed by the degree to which a particular data set can be compressed. Perfectly random data can't be compressed at all. This measure of randomness is, of course, always dependent on the particular compression model being used.
In spite of the fact that compression is always dependent on a model and hence is not general or universal, might we still not extract this meta-data as information about the data? It would exist as information with a context - the context being the particular extraction model or filter used. One should also point out here that these compression models make very few assumptions about the data itself. They only make primitive assumptions about the lowest level of input, i.e., byte boundaries, big-endian vs. little-endian values, etc. The resultant meta-data can be seen as similar to a pattern recognition system that identifies patterns by the filters they resonate to.
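As a rough illustration of compressibility as a model-dependent measure of randomness, here is a minimal sketch using the DEFLATE model from Python's standard zlib module. The function name compression_ratio and the two sample inputs are assumptions made for the example; a different compression model would give different numbers for the same data.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size under the zlib model.

    Values near 0 mean the model found a lot of regularity; values near
    (or slightly above) 1 mean the data looks random to this model.
    """
    if not data:
        return 1.0
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    repetitive = b"abc" * 3000  # highly regular input
    rng = random.Random(0)      # seeded so the run is repeatable
    noisy = bytes(rng.getrandbits(8) for _ in range(9000))  # pseudo-random input

    print("repetitive:", round(compression_ratio(repetitive), 3))
    print("noisy:     ", round(compression_ratio(noisy), 3))
```

Note that the "noisy" input comes from a deterministic pseudo-random generator, so it is not random in any absolute sense; it merely contains no regularity that the zlib model can exploit, which is exactly the model dependence described above.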
The actual data does indeed influence the compression ratios etc., and different models are primarily designed for different types of data, but I think one can still conclude that a mixed bag of compressive extraction techniques might be able to draw at least some informative conclusions about any unknown data set. An intelligent system that could piece together an impressive array of such meta-data profiles might be a first step toward achieving the ability to handle all kinds of input at will. Such a system could literally build a model of the data and ultimately a model of its total experiential world.
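A "mixed bag" of extraction techniques might look something like the following sketch, which runs several unrelated models over the same unknown bytes and collects what each one reports. The choice of models (zlib, bz2, and lzma from the Python standard library, plus a run count and a byte-histogram entropy) and the names profile, run_count, and byte_entropy are assumptions for illustration only; the only thing assumed about the input is that it arrives as a sequence of bytes.

```python
import bz2
import lzma
import math
import zlib
from collections import Counter
from typing import Dict


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    total = len(data)
    counts = Counter(data)
    return sum(c / total * math.log2(total / c) for c in counts.values())


def run_count(data: bytes) -> int:
    """Number of runs of identical bytes (the length of a run-length code)."""
    return sum(1 for i in range(len(data)) if i == 0 or data[i] != data[i - 1])


def profile(data: bytes) -> Dict[str, float]:
    """Collect per-model meta-data about an unknown, non-empty byte string."""
    if not data:
        raise ValueError("profile() needs at least one byte")
    n = len(data)
    return {
        "zlib_ratio": len(zlib.compress(data, 9)) / n,  # dictionary + Huffman
        "bz2_ratio": len(bz2.compress(data)) / n,       # block sorting
        "lzma_ratio": len(lzma.compress(data)) / n,     # large-window dictionary
        "runs_per_byte": run_count(data) / n,           # run-length model
        "entropy_bits": byte_entropy(data),             # frequency model
    }


if __name__ == "__main__":
    samples = {
        "zeros": b"\x00" * 4096,
        "text": b"the quick brown fox " * 200,
    }
    for name, data in samples.items():
        print(name, {k: round(v, 3) for k, v in profile(data).items()})
```

Each entry only has meaning in the context of the model that produced it, but taken together the entries form a crude signature that distinguishes, say, constant padding from repetitive text without knowing either format in advance.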
Such a system could also model its own internal states. At some point compressive meta-data extraction and modeling could be linked to symbolic manipulation as an associative indexing ability. Memories could be encoded as reduced contextual codes and imagination could be implemented as randomized expansions.
I have been harping on the fundamental significance of compression for years but have as yet not aroused much interest in this idea. Maybe this brings it all a little more into focus. Let me know what you think.
By the way, I judge interest in my ideas not by the compliments, but by responsive segues into possible implementations, applications, and related areas of interest.