Copyright © 1997, Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

This article was published in IEEE Intelligent Systems magazine, July/August 1997

Creatures: An Exercise in Creation

By Stephen Grand

Take one ordinary laboratory rat. Slice it in half and watch. The two parts may squirm for a while, but soon they’ll stop moving forever. Why is this? The answer, of course, is that some of the bits that the top half needs are now disconnected in the bottom half and the rest are lying in a sticky pool on the bench. The original rat was a tightly integrated network of multiple sub-systems, and all those parts were needed for the creature to live. There is no such thing as half an organism.

So why do we create neural networks that have no chemistry? Why simulate genetics without a digestive system? There is no such thing as half an organism, and yet most attempts to generate intelligent or life-like agents are based essentially on a single mechanism. Granted, a few people are working on evolvable neural networks, but the "genetics" is seen as a means to an end, rather than a part of an integrated, heterogeneous system. Neural networks themselves are generally homogeneous entities, too—one kind of neuron, interconnected in one or at most a few different ways and performing a single task. Yet we know that our brains are not like that at all: they are divided into regions that perform many different tasks and contain populations of very different cells. What is more, artificial neural networks generally employ only direct synaptic connections for their signal paths, despite our awareness that real brains are swimming in neurotransmitters that have diffuse and plural functions.

There are very good reasons for all this, and I mean no insult to researchers who work on such single-mode approaches. For a start, real living systems are so messy that biologists have enormous trouble understanding them. Researchers using artificial analogs of such natural processes understandably want to simplify the object of their attention, so that they can learn the underlying rules that make it work. Secondly, creating a homogeneous neural network to perform a single task is hard enough, without needing to build multiple, interacting systems whose overall behavior is extremely complex and hard to debug.

Such reductionism is perhaps compulsory for a scientist, but for an engineer there is a serious risk that a homogeneous approach will fail to deliver the emergent richness that comes from the interactions of heterogeneous complexes. The First Law of Biology is that "Nature is lazy"; she does the minimum necessary to achieve any given effect. The fact that organisms are combinations of many different processes and structures suggests that most of those systems must be necessary, and we should heed this in our attempts to mimic living behavior. Taking a holistic approach and attempting to create such a "whole" organism may sound like a brave or even foolhardy task. However, let’s explore the possibilities and see what could be done. How does one go about creating a complete living thing, exactly?

Igor, hand me that wrench

Let’s begin with a brain. We need our creature to perform a number of "mental" tasks, such as controlling its focus of attention, processing sensory data, selecting a course of action and sequencing the steps needed to carry out that action. Let’s not be too ambitious: let’s assume we’re dealing with a "virtual" creature—a softbot. That means we can skip all that tricky visual pattern recognition and so forth, and provide fairly abstracted sensory data to the brain. Let’s use conventional algorithms to provide rich but abstracted visual, auditory, tactile and other general environmental and internal state data—perhaps a hundred or so separate data streams in total. Let’s also cheat with the sequencing of actions, and allow the individual muscle movements and environmental interactions to be sequenced by means of "action scripts", rather than attempt to develop a neural system of linked oscillators or some such means of motor control.

That leaves us with focus of attention and action selection. Our neural network brain thus has to perform two overall jobs (probably comprising several distinct sub-tasks), and so must be a heterogeneous structure. It also has to be capable of some kind of reinforcement learning which, because we are trying to create a believable agent, needs to be unsupervised and realistic. Our agent needs to be capable of learning appropriate actions in a complex sensory environment. Such a system is asynchronous, and so there is no obvious relationship between the timing of any reinforcement and the actions that were actually responsible for it. This is a messy, noisy virtual world, and so we can’t expect to compute reliable error terms for credit assignment, either. Finally, we want our creature to learn by its mistakes, but in a realistic way. We don’t want to see it making entirely random decisions in the face of novel situations—it must be capable of forming generalizations from earlier, related situations. Oh, and I almost forgot: let’s force ourselves to be purists and demand that our neural structures are biologically plausible, specifically in the sense that they contain no top-down constructs. The individual neurons must be capable of processing all their functions autonomously, without any part of the system needing to know the state of the whole network (there are good, practical reasons for this, which I’ll come to later). It doesn’t look like we can pull a ready-made ANN topology off the shelf to do all this, so it seems we’re going to have to design our own from scratch.

Nerves of Silicon

Since we need the network to perform multiple functions and sub-functions, we’d better start by designing some rich and highly configurable neurons to act as a flexible toolkit from which to construct the various parts of our brain. We know that real brains are dynamically very lively, almost chaotic in fact. However, we need to design a system whose behavior we can predict with reasonable confidence. So, let’s introduce some damping into the system at the earliest possible stage, so that the network isn’t liable to lock up or thrash as a result of positive feedback or an insufficient dynamic range. Suppose we design our neuron to act like chewing gum—stretch it and let go (electrically speaking), and the neuron will relax back to its rest state, rapidly at first and then more slowly. The further it is disturbed, the faster it tries to relax, and the harder it becomes to push it any further from equilibrium. Not only will this damp the system but it also allows our neurons to act as integrators, sensitive to the frequency as well as the amplitude of the input signals (Figure 1).
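
To make this concrete, here is a minimal sketch (in Python, which is not what Creatures is written in; the names and the relaxation constant are all my inventions) of such a leaky, integrating neuron:

    # A hypothetical "chewing gum" neuron: a leaky integrator whose restoring
    # pull is proportional to its displacement, so the further it is stretched
    # from rest, the faster it tries to relax back again.
    class Neuron:
        def __init__(self, rest=0.0, relax=0.25):
            self.rest = rest      # equilibrium ("rest") state
            self.relax = relax    # fraction of the displacement recovered per tick
            self.state = rest

        def tick(self, inputs):
            self.state += sum(inputs)                            # deflect the state
            self.state -= self.relax * (self.state - self.rest)  # ...and relax
            return self.state

Because the state decays between inputs rather than resetting, a rapid train of small signals piles up where a slow one dies away, which is what makes the cell sensitive to input frequency as well as amplitude.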

We can pass the current state value through a threshold, to provide an output signal in the time-honored way. Deflecting the state value in the first place will usually mean summing the inputs to the cell and adding that value to the state. However, we might want to do fancier things than that, so let’s make the neuron’s state function configurable, so that we can set up comparators, gates and other structures if we need them. Since we are in the business of creating a "whole organism", we are going to have to define our neural structures using simulated genetics. Therefore, let’s specify those state functions by inventing some clever expression syntax that can be genetically encoded and rapidly interpreted, and that is relatively un-brittle and immune to syntax errors caused by mutations.
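
One plausible way to make such a syntax immune to mutation (a sketch of the idea, not the actual Creatures encoding) is to let every possible gene value decode to some legal operation, for example by treating each value as an index into a fixed table:

    # Hypothetical mutation-proof state functions: each gene byte indexes a
    # fixed table of operations, so any random byte string still "parses".
    OPS = [
        lambda acc, x: acc + x,                  # 0: add an input
        lambda acc, x: acc - x,                  # 1: subtract an input
        lambda acc, x: max(acc, x),              # 2: comparator
        lambda acc, x: acc if x > 0 else 0.0,    # 3: gate: conduct only while x is high
    ]

    def state_function(opcodes, inputs, acc=0.0):
        for code, x in zip(opcodes, inputs):
            acc = OPS[code % len(OPS)](acc, x)   # modulo makes every byte legal
        return acc

    print(state_function([0, 0, 3], [0.5, 0.3, 1.0]))  # integrate two inputs, gated: ~0.8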

Finally, we need to worry about the structure of our synapses. These are going to have to be rather more complex than the simple Hebbian weights used in most ANNs. For a start, we can assume that synapses and even neurons are going to be in short supply, and we are going to have to manage them as a limited resource. Dendrites are thus going to have to be capable of migrating and forming new connections, and old, unwanted connections will have to atrophy and disappear, to free up resources. So let’s give each synapse a "strength" value, and supply some rules about how strength increases or decreases under different circumstances. Luckily, we can use the same genetically programmed expression syntax that we defined for the state function.
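
As a rough sketch of that resource management (the rules and constants below are my inventions for illustration): a synapse strengthens when used, atrophies when idle, and on dying frees its dendrite to migrate to a new source cell.

    import random

    # Hypothetical self-managing synapse: strength is a measure of usefulness,
    # and a synapse whose strength hits zero migrates rather than being wasted.
    class Synapse:
        def __init__(self, source, strength=0.1):
            self.source = source
            self.strength = strength

        def tick(self, used, candidate_sources):
            if used:
                self.strength = min(1.0, self.strength + 0.1)   # reinforce useful wiring
            else:
                self.strength -= 0.01                           # atrophy when idle
            if self.strength <= 0.0:                            # the connection has died:
                self.source = random.choice(candidate_sources)  # the dendrite migrates
                self.strength = 0.1                             # weak, provisional link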

We’ll use Hebbian weights to modulate input signals, but we need to be a bit clever here, too. Suppose our creature puts its hand into a hole, and that hole happens to contain a crab, which nips the creature’s fingers. How much should it hurt? If the reinforcement is too strong, the creature will never put its hand into a hole again, even if no other holes contain crabs. If it is too weak, the forces which were acting to recommend that action in the first place (hunger, perhaps) are likely still to be operating, and the creature is likely to take the action again. How can we create statistically valid learning, and yet ensure that the network does not repeat actions in a most un-lifelike way? Well, let’s employ the chewing gum strategy again, and define a system that can learn strongly from a single reinforcement episode, but quickly forget most, but not all of that experience (Figure 2). The "short-term memory" will thus prevent embarrassing repetition, while the "long-term memory" enables the system to learn from the probability of reinforcement as well as its intensity.
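
A minimal sketch of such a two-timescale weight (the decay and consolidation constants are invented):

    # Hypothetical weight with a short-term trace that learns strongly from a
    # single reinforcement episode but quickly decays, leaving behind only a
    # small, slowly accumulated long-term change.
    class Weight:
        def __init__(self):
            self.long_term = 0.0
            self.short_term = 0.0

        def reinforce(self, signal):
            self.short_term += signal         # "never again!" -- for a while
            self.long_term += 0.05 * signal   # a little is consolidated for good

        def tick(self):
            self.short_term *= 0.9            # forget most, but not all

        def value(self):
            return self.long_term + self.short_term

The short-term trace stops immediate repetition; what survives into the long-term part ends up tracking both how often and how strongly an action has been reinforced.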

Building a brain

So there we are. We have a flexible and dynamically rich definition for a neuron, and we can distribute those neurons in clusters (let’s call them "lobes") which interconnect with each other and perform the necessary functions of our brain. How is this brain going to work? The attention director is easy: we know that we can set up neurons to act as integrators, so let’s build a lobe of cells configured this way and send the appropriate cell a nudge every time an object of that type makes a sound or moves. We can simply detect the most active neuron to direct our creature’s attention to the object currently making the most "fuss". The sensory system will then feed the brain richer data about this specific object, and the action sequencer will treat it as the object to be acted on.
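
A sketch of that attention lobe (one integrator per object type, winner takes all; the decay constant is invented):

    # Hypothetical attention director: nudge a cell whenever an object of its
    # type moves or makes a sound; the most active cell marks the object
    # currently making the most "fuss".
    class AttentionLobe:
        def __init__(self, n_object_types, decay=0.9):
            self.decay = decay
            self.activity = [0.0] * n_object_types

        def nudge(self, object_type, amount=1.0):
            self.activity[object_type] += amount

        def tick(self):
            self.activity = [a * self.decay for a in self.activity]
            return max(range(len(self.activity)), key=self.activity.__getitem__)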

But what of the learned action selection mechanism? Explaining this is a bit like describing a jet engine. I can state how a jet engine works in a single sentence, but you only have to look at all the pipes and valves in a real engine to see that simple ideas often need complex implementations! In this article, I only have room to give the brief version [1]:

Imagine a "perfectly knowledgeable" creature, i.e. one that knows exactly what is best to do in each perceptibly different situation. Such a system has quite simple requirements: a) it must be able to discriminate between those sensory situations, and b) each such "sensory schema" must "recommend" the most appropriate action to take. Of course, there is no such thing as a perfectly knowledgeable creature (such a thing would be boring in this instance anyway, since the ability to learn from mistakes is a crucial characteristic of living organisms). Therefore, a practical creature must be able to form or modulate the connections between a perception and an action, in such a way as to discover and remember the most appropriate action for each perception (a memory of "relationships"). What’s more, a practical creature with a fair range of sensory equipment can potentially find itself in a vast number of perceptibly different situations—far more than a practical ‘brain’ would be capable of representing. Therefore it must be able to store only those situations that it actually finds itself in during its lifetime, or perhaps only the ones which have turned out to be significant in some way (a memory of "events").

These tasks can be performed by two brain lobes (Figure 3). The first one has a large number of cells (let’s say six hundred), each of which only fires when all of its inputs are conducting. These cells can form and re-form their connections with the sensory lobe to store memories of significant events. The second lobe has only a small number of integrator cells, one for each action the creature can take. However, these cells are highly dendritic, because they need to form connections with the large number of outputs from the previous lobe. Again, dendrites are a scarce resource, and so they need to be able to migrate and maintain only the most useful connections. Any given sensory input pattern will cause one or more of the first type of cell to fire. These outputs will contribute to or inhibit firing in one or more action cells, depending on how beneficial or harmful a given action has proven to be, in the past, in that situation. Learning in such a system thus requires weight adjustments to these latter connections, in response to reinforcement episodes, plus synaptic atrophy and migration to maintain only those memories that prove significant.
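
Stripped of dendrite migration, thresholds and everything else that makes it actually work, the shape of the scheme might be sketched like this (every detail below is a simplification of mine):

    # Hypothetical two-lobe action selector.  A "concept" cell fires only when
    # every sensory input it is wired to is active; firing concepts then add
    # their learned, signed recommendations to each action cell.
    def select_action(sensed, concepts, weights, n_actions):
        support = [0.0] * n_actions
        for c, inputs in enumerate(concepts):       # inputs this concept cell watches
            if all(i in sensed for i in inputs):    # fires only if all inputs conduct
                for a in range(n_actions):
                    support[a] += weights[c][a]     # + = beneficial, - = harmful before
        return max(range(n_actions), key=support.__getitem__)

    # e.g. two concept cells over named stimuli, three possible actions:
    concepts = [("approaching",), ("approaching", "truck")]
    weights = [[0.5, 0.0, 0.0], [-1.0, 0.8, 0.0]]
    print(select_action({"approaching", "truck"}, concepts, weights, 3))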

Of course, learning by one’s mistakes is not enough in itself: "Oops, maybe stepping off a cliff wasn’t such a good idea, I’ll remember that next time" isn’t a very useful thought as one plummets earthwards! It is important that in novel situations the network is capable of generalizing from past, related situations. Happily, this is a natural consequence of the "sensory schema" model described above. There is no way for the creature to know, until it has had later confirmatory experiences, which of the many active sensory inputs are relevant at any moment. For example: am I in pain because I saw a moving truck and I stepped towards it and it is Wednesday, or because I stepped forward and it is Wednesday, or just because I saw a truck, or what? For that reason, all permutations of the current inputs must be represented simultaneously in the perception lobe. Now, two sensory schemata can be considered similar if they share one or more inputs. For example, "I see something approaching, it is a truck" is related to the simpler situation "I see something approaching", which may be remembered from a previous experience, such as "I see something approaching, it is an attractive blonde". In the past, the creature may have learned that a good thing to do to approaching blondes is to kiss them. It thus also believes that a good thing to do to approaching things in general is kiss them. Therefore, when it first meets an approaching truck, it has a rational (if rather unfortunate!) idea about how best to respond to this novel situation. Of course, when the inevitable happens, the pain of that experience will teach the creature that approaching trucks are not to be kissed, while approaching blondes are still eminently kissable, and the simple fact that something is approaching is not a good indicator either way! Such a mechanism, despite these occasional over-generalizations, is much more realistic than one in which novel situations elicit only random actions, and it will most often lead to the creature doing something sensible in the absence of reliable knowledge.

I’ve glossed over a lot of important detail here, but this brain model seems fine for our purposes. However, what about this business of supplying reinforcement for good and bad choices of action? How are we going to implement that? The system has to learn unsupervised, and there are no "right answers" to compute error terms from, so somehow the events themselves must inform the creature whether an action or sequence of actions was wise or unwise. Simple pain and pleasure don’t seem to be enough, as real life is much more multi-dimensional than that. Suppose we implement reinforcement by measuring the effect of external events on a set of "drives". For example, any action that increases a creature’s "pain" drive is bad, and must be punished, while anything that decreases its "hunger" drive is good, and should be rewarded. This allows us to have a rich and intricate interplay between events and the learning process. For example, suppose a creature that chooses to bounce a ball finds that the experience reduces its "boredom" greatly, whilst increasing its "hotness" and "tiredness" drives somewhat. If the creature was bored before bouncing the ball, the net result of those drive changes will be a reward. On the other hand, if it wasn’t bored, then no further boredom reduction can occur, and the net result of the action is a rise in tiredness and hotness, which results in punishment. Thus our creature can learn to distinguish between good and bad circumstances in which to play ball.
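
A sketch of that arithmetic (one plausible formulation; the clipping rule and the drive values are invented):

    # Hypothetical drive-based reinforcement: the reward for an action is the
    # net reduction it achieved across all drives, so the same action can be
    # rewarded or punished depending on the creature's current state.
    def reinforcement(drives, effects):
        reward = 0.0
        for drive, delta in effects.items():
            old = drives[drive]
            new = min(1.0, max(0.0, old + delta))   # drives saturate at their bounds
            reward += old - new                     # any fall is good, any rise is bad
            drives[drive] = new
        return reward

    bored = {"boredom": 0.8, "hotness": 0.1, "tiredness": 0.2}
    ball = {"boredom": -0.5, "hotness": +0.1, "tiredness": +0.1}
    print(reinforcement(bored, ball))   # ~ +0.3: playing while bored is rewarded
    print(reinforcement(bored, ball))   # a second game in a row pays off less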

Chemical Soup

Now, you may be wondering when I’m going to get round to this business of a multi-systemic approach to AI. This is that moment! Suppose we represent each of those drive levels by the concentration of a simulated "biochemical". Furthermore, suppose we implement the punishment and reward mechanism by means of "chemical reactions" between drive chemicals and drive reducers or increasers. Suppose the result of those reactions is a rise in the levels of a punishment chemical or a reward chemical. Isn’t that just applying fancy terminology to a straightforward piece of arithmetic?

Well, no. The concept of a "chemical model" is surprisingly potent. Networks of chemical reactions are similar in many ways to networks of neurons. However, a change of metaphor implies a change in characteristics. Chemical reactions are slower than electrical signals; they are also diffuse conductors of information, rather than directed ones. Finally, the metaphor allows us to consider new dynamical constructs, such as "catalysis". So, let’s add a simple chemical modeling system to our agent. Modeling detailed protein behavior is perhaps a bit too complex and hard to predict, so let’s define a model in which chemicals have no intrinsic properties, but their interactions are defined by "reaction" objects: A+B→C+D, where B, C and D are optional. This allows us to simulate substitution, fission and fusion reactions. It also allows catalysis (A+B→A+C). Let’s make the dynamics fairly realistic by making the instantaneous reaction rate dependent on the concentrations of the reactants. While there is a lot of A and B around, the reaction will happen quickly, but as A and B get used up, the rate will slow down.
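
A minimal sketch of such a reaction object (the rate law and constants are invented; the real model’s dynamics may differ):

    # Hypothetical chemistry: chemicals are nothing but concentrations, and all
    # behavior lives in reaction objects of the form A + B -> C + D.
    class Reaction:
        def __init__(self, reactants, products, rate):
            self.reactants, self.products, self.rate = reactants, products, rate

        def tick(self, conc):
            amount = self.rate
            for r in self.reactants:
                amount *= conc[r]        # rate falls as the reactants get used up
            for r in self.reactants:
                conc[r] -= amount
            for p in self.products:
                conc[p] += amount

    conc = {"starch": 1.0, "glucose": 0.0}
    digest = Reaction(["starch"], ["glucose"], rate=0.05)
    for _ in range(5):
        digest.tick(conc)                # fast at first, then slower and slower

Catalysis falls out for free: putting A on both sides of A+B→A+C means A is consumed and immediately re-created, so it speeds the reaction without being used up.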

How are we going to interface this chemical system to the rest of the simulation? This is where the model gets clever. Let’s invent some chemoemitter objects, which secrete chemicals in response to certain activity, and chemoreceptor objects, which cause events in response to chemical levels. We can make these objects quite complex, giving them thresholds, attenuators, nominal values and suchlike. Most importantly, we can give them the ability to bind onto arbitrary bytes in our code. A chemoemitter thus emits chemicals in proportion to the value currently at that address in memory, while receptors change the contents of the byte they’re attached to in response to the level of a given chemical. Because these bytes are at arbitrary addresses, they can represent anything we like. For example they might be the addresses of parameters or buffers inside neurons! By attaching a "punishment"-sensitive chemoreceptor to an appropriate locus on a group of synapses, we can thus generate the necessary signals for reinforcement. Receptors and emitters attached to synaptic Strength parameters might similarly allow us to set up chemically driven feedback loops for controlling synapse atrophy. If we decide to model some form of arousal level ("sleep", perhaps) in a lobe of neurons, we can do it by simply attaching chemoreceptors to their Threshold parameters and defining the necessary chemical reactions!
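
Python has no raw byte addresses, so in this sketch (names and parameters invented) a "locus" is an object/attribute pair standing in for an arbitrary address:

    # Hypothetical receptor and emitter objects that bridge chemistry and the
    # rest of the simulation by reading from, or writing to, arbitrary loci.
    class ChemoEmitter:
        def __init__(self, locus, chemical, gain=0.01):
            self.obj, self.attr = locus
            self.chemical, self.gain = chemical, gain

        def tick(self, conc):
            # secrete in proportion to whatever value the locus currently holds
            conc[self.chemical] = (conc.get(self.chemical, 0.0)
                                   + self.gain * getattr(self.obj, self.attr))

    class ChemoReceptor:
        def __init__(self, locus, chemical, nominal=0.0, gain=1.0):
            self.obj, self.attr = locus
            self.chemical, self.nominal, self.gain = chemical, nominal, gain

        def tick(self, conc):
            # drive the bound parameter up and down with the chemical level
            setattr(self.obj, self.attr,
                    self.nominal + self.gain * conc.get(self.chemical, 0.0))

In this scheme, "sleep" would be nothing more than a ChemoReceptor((neuron, "threshold"), "sleep_chemical") attached across a lobe, plus whatever reactions make that chemical rise and fall.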

Drive modulation need no longer be simple and direct. For example we can modulate the Hunger drive through a complete chemical model of the digestive system. Let’s make the ingestion of a food object release "starch" into the "bloodstream" of our agents. We can define a reaction that converts starch slowly into glucose, and another pair of reactions that reversibly store unused glucose as glycogen, forming a long-term energy store. By attaching a chemoemitter to a byte containing data about the number of muscle movements the creature has made in any one period, and making this emit an "enzyme" which converts glucose into CO2 and water, we can model respiration. Attaching a glycogen receptor to the appropriate byte in the code can cause our creature to "die" if its stored energy declines to zero!
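
Reusing the hypothetical Reaction class and invented rates from the sketch above, the whole digestive chain is just a list of reactions:

    # Hypothetical digestion: starch -> glucose <-> glycogen, plus respiration
    # catalysed by an exertion "enzyme" emitted from the movement counter.
    conc = {"starch": 0.0, "glucose": 0.0, "glycogen": 0.5,
            "enzyme": 0.0, "CO2": 0.0, "water": 0.0}

    chemistry = [
        Reaction(["starch"], ["glucose"], rate=0.05),      # slow digestion
        Reaction(["glucose"], ["glycogen"], rate=0.02),    # bank surplus energy...
        Reaction(["glycogen"], ["glucose"], rate=0.01),    # ...and draw it back out
        Reaction(["enzyme", "glucose"], ["enzyme", "CO2", "water"], rate=0.1),
    ]

    def eat(starch_in_food):
        conc["starch"] += starch_in_food       # ingestion releases starch

    def tick(muscle_movements):
        conc["enzyme"] = 0.01 * muscle_movements   # emitter bound to the movement count
        for reaction in chemistry:
            reaction.tick(conc)
        if conc["glycogen"] < 0.001:               # receptor on the energy-store locus
            print("the creature starves")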

Having successfully given our agents a digestive system, let’s go on to give them a reproductive system too, as we will soon be adding genetics, and we’d like our creatures to breed. Simply define testosterone, estrogen, progesterone and so on. Attach an estrogen receptor back-to-back with an estrogen emitter to make an oscillator and we’ve got a fertility cycle for our females. A chemoreceptor sensitive to estrogen levels can be used to control egg production, and the whole reaction network can be connected to a sex drive chemical to allow mating behavior to be integrated with the brain.

Maybe this is getting a little carried away, but what about an immune system? If we have bacteria in our virtual world, we can coat them with different "antigens" and define some "antibody" reactions for our creatures. Again, receptors and emitters provide the interface between chemistry and physiology. As the creature builds up an immunity to an infection, the bacteria succumb to "poisoning" by the antibodies. Of course the bacteria can also emit "toxins", because our creature’s reaction network is now so complex that there are many ways in which a "foreign" chemical can disturb the system and cause symptoms of "illness". We can even model histamine production and make our creatures sneeze, passing on infections to those nearby. Oh, and let’s allow the bacterial population to evolve over time, adding a little co-evolutionary spice to the proceedings!

Given the disturbed reactions caused by these toxins, it’s not difficult to imagine corrective "medicines", which help to re-balance the chemistry or suppress symptoms. And let’s also simulate "stress", by emitting adrenaline in response to high drive levels, which interferes with the efficiency of other parts of the system, perhaps even the brain. I could go on and on, adding further analogs of real physiological systems. However, let’s stop there and leave something for evolution to discover later.

Mendel’s Marvellous Mechanism

Rather than hard-code these structures permanently into the program, let’s define a genetic system to configure it all for us. Since every object, be it neuron, chemoreceptor or reaction, is completely autonomous, carrying out its own functionality without regard for the part it plays in the whole system, we can construct and configure these objects simply by defining a class of "gene" for each kind of structure. A chemoreceptor gene thus specifies all the parameters needed to construct a receptor object and describes what locus to attach it to. We’ll give each gene a header, in which we can set flags to control whether the gene applies to males, females or both, and at what stage in the creature’s life cycle (which can be chemically controlled, of course) the gene switches on. Genes that switch on later in life can replace or supplement pre-existing structures, according to the type of gene. For example, new reactions can be expressed at "puberty" to switch on a creature’s reproductive system.
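
A sketch of what such a gene might carry (the field names are invented; the real header holds more flags than this):

    from dataclasses import dataclass, field

    # Hypothetical gene: a header saying who expresses it and when, plus the
    # parameters needed to construct one autonomous object.
    @dataclass
    class Gene:
        kind: str         # "neuron", "chemoreceptor", "reaction", ...
        sex: str          # header flag: "male", "female" or "both"
        switch_on: str    # header flag: life stage at which the gene expresses
        params: dict = field(default_factory=dict)

    def express(genome, sex, life_stage, builders):
        # builders maps each kind of gene to a constructor for that object class
        for gene in genome:
            if gene.sex in (sex, "both") and gene.switch_on == life_stage:
                builders[gene.kind](**gene.params)   # one gene, one object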

We can assemble a string of these genes into a single "chromosome", which completely defines the neural and chemical (and morphological and postural…) structures of our creature. When creatures mate, these chromosomes can be crossed over to produce offspring that inherit their complete definition from their parents. Occasional random "mutations" can add extra variation to the gene pool, and "cutting errors" can be allowed, which lead to dropped or (more importantly) duplicated copies of genes. By such duplications and subsequent mutation, new chemical or even complete neural structures can arise. Notice that these genes do not code for behavior, but for deep structure—the behavior is an emergent consequence of this structure. Our creatures can thus truly evolve.
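
And a sketch of breeding itself (one-point crossover; the error probabilities are invented, and mutate stands for whatever per-gene mutation operator the genome uses):

    import random

    # Hypothetical breeding: crossover plus point mutations and "cutting
    # errors" that drop or, more importantly, duplicate whole genes.
    def breed(mum, dad, mutate, p_mutate=0.01, p_cut=0.005):
        cut = random.randrange(1, min(len(mum), len(dad)))
        child = mum[:cut] + dad[cut:]                # crossed-over chromosome
        offspring = []
        for gene in child:
            if random.random() < p_cut:              # cutting error:
                if random.random() < 0.5:
                    continue                         # the gene is dropped...
                offspring.append(gene)               # ...or an extra copy is kept,
            offspring.append(                        # raw material for evolution
                mutate(gene) if random.random() < p_mutate else gene)
        return offspring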

Proof of the Pudding

So there we have a design for an autonomous, intelligent creature, capable of learning by its mistakes in a complex, noisy world. It needs to learn to consume food in order to survive. It has a complex network of drives and needs. It has an immune system and a reproductive system. It can breed with other creatures (as long as their reproductive chemistry is compatible, i.e. they are of the same species), and when it does so, it is capable of open-ended evolution, which can result in new neural and physiological structures that its designer probably couldn’t even conceive of. Oh, and I almost forgot: by attaching a simple text parser to the creature, and passing nouns into the attention directing lobe, while verbs go through the episodic memory and action selection lobes, these creatures can be made to understand (and learn) a simple spoken language.

So, what do you think? Is this idle speculation? Is it too ambitious, given present computer power and the state of knowledge, to attempt to create such a creature? Would it work if we did? Well, in case you hadn’t already realized it, I should explain that what I have been describing does already exist. It is a real project, a finished commercial application, and yes, it does work. The system I have described is part of a computer entertainment product called "Creatures", the development of which has occupied my time (and latterly that of a team of programmers and artists) for the past four years. Creatures is a program that allows people to keep small communities of little, furry, virtual animals as "pets" on their home computer, in much the same way as they might keep hamsters or fish. They can teach their creatures words and talk to them (they talk back), smack or tickle them (to reward or punish them directly), bring them objects to eat or play with, encourage them to travel on foot and in vehicles around their virtual world, and generally help them learn how to look after themselves. If they get ill, the user can care for them and try to diagnose and treat the disease. Eventually, the creatures get old and weak, and finally die. However, if they live to puberty, the user can encourage them to breed and produce offspring. He or she can then swap those offspring with other Creatures enthusiasts over the World Wide Web, via the thirty or so hobbyist Web sites devoted to the product. There are probably a million or so of these creatures in existence at any one time, and the gene pool is increasing all the time.

Creatures was launched in Europe and Australia in November 1996, and comes to the USA in August this year. However, I didn’t write this article in order to advertise my company’s product, but to make some points: Firstly, although the "space of all possible machines" may contain many regions in which intelligence can be found, we don’t know where most of those regions lie. We do, however, know that many biological machines are intelligent, so copying Biology is clearly a rational thing to do in our search for artificial intelligence. Secondly, there is no such thing as half an organism. Real, living systems are heterogeneous complexes of feedback loops made from nerves, chemical reactions and genes. We should take notice of that fact and recognize that it may be highly significant. In particular, neural networks should incorporate diffuse as well as directed signal paths. Thirdly, I would like to assert that such fully rounded artificial organisms might have practical benefits. It is not immediately obvious why an intelligent video recorder, say, should need an immune system, but if it had one, I’m sure we could think of a use for it. Real organisms are robust, self-healing and adaptable; perhaps fully rounded synthetic organisms will share those characteristics. Lastly, I hope I have demonstrated that such "grand syntheses" are possible today and that, with care and understanding, very complex systems can be constructed and made to work in real time on fairly lowly computer systems.

Once upon a time, all machines were propelled or controlled by fully rounded, living organisms. Ploughs were pulled by horses, which, unlike modern tractors, could steer themselves, refuel themselves and even reproduce themselves. Automation has replaced these subtle creatures with strong but stupid, inflexible slaves. I, for one, look forward to a day when we can "put the life back into technology"—not quasi-intelligent, monolithic expert systems, but fully-rounded, thinking, caring, even reproducing organisms that are more than the sum of their parts. Creatures is just a toy, but I hope it counts as a small step towards that goal.

I’ll leave you with a Creatures anecdote: Once, one of our program testers placed a whole row of Creatures eggs next to the incubator machine, planning to hatch them later. In the meantime, he traveled off to a distant part of the virtual world to check on another creature. When he returned a while later, he discovered the incubator room filled with babbling babies, and no remaining eggs. It transpired that an adult creature had entered the room while our tester was away, and discovered that if you pick up an egg and drop it into the incubator, you get a new friend to play with! More than the sum of their parts indeed!

References

[1] S. Grand, D. Cliff, and A. Malhotra, "Creatures: Artificial Life Autonomous Software Agents for Home Entertainment," Proceedings of the First International Conference on Autonomous Agents, ACM Press, New York, 1997, pp. 22–29.

 