Scientific Retreat
Jordan Pollack, Ph.D.
Associate Professor of Computer Science
Volen Center for Complex Systems
Brandeis University
Waltham, Massachusetts
March 27, 1997

Co-Evolutionary Learning

 

While machine learning techniques could in theory achieve intelligence in artificial agents, in practice the setup for learning has required enormous amounts of programming, as much as or more than would be involved in producing a direct solution. If the sophistication of algorithms necessary for achieving cognition has been underestimated, then the critical basic research question becomes one of automatic development and maintenance of very complex software structures.

There are now many models of machine learning, which are driven both by metaphor and performance criteria. These include models which are inspired by psychology, by neuroscience, or by evolution, and whose performance is measured by modeling experimental data or by competence on specific tasks.

Unfortunately, as each learning method has matured, we perceive two essential and self-limiting research dynamics:

  1. The algorithms are "improved" through incremental modification based on performance over a set of tasks or benchmarks.
  2. The field "selects" the tasks or benchmarks which best "fit" the method, thereby showing off its performance while not explicitly displaying its inductive bias.

Thus, there is a fundamental problem in machine learning which was noticed first by Doug Lenat: a system can learn only that which it almost already knows. A learner converts knowledge perceivable in its environment into knowledge expressible in its internal structure. The words "perceivable" and "expressible" when applied to a human or animal (even to invertebrates!) describe robust systems, but for computer programs, perceivable means "arranged in a precise syntactic form, parsed and ready for input" and expressible means "within a small search space over constrained parameters of the model class."

Because of the need to carefully specify the input form and model class, every ML method converges before achieving the kind of autonomous learning necessary for embedding into agents that face a novel and changing world. There is a new opportunity for breaking through this inductive bias paradox -- "Co-Evolution" -- which involves adaptive learning agents within adaptive environments. In co-evolutionary learning, improvement by the agents on the current instance of a task provokes increased challenges in the task environment, leading to systems which can continuously develop. Our research is focused on the principles by which systems can undergo sustained growth in their abilities, rather than on systems which succeed at a given task because of the programmer's skill in developing the inductive bias of the learning algorithm or in carefully representing the learning environment.
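The feedback loop described above, in which agent improvement provokes a harder task, can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the agent's ability and the task's difficulty are reduced to single numbers (`skill`, `difficulty`), and the function name `coevolve` is hypothetical. It is a toy, not a model of any system discussed here.

```python
import random

def coevolve(steps=200, seed=0):
    """Toy co-evolution: a learner adapts to a task, and the task hardens in response."""
    rng = random.Random(seed)
    skill = 0.0        # hypothetical scalar stand-in for the agent's ability
    difficulty = 1.0   # hypothetical scalar stand-in for the current task instance
    history = []
    for _ in range(steps):
        # Agent adaptation: propose a random variation and keep it only
        # if it is an improvement over the current skill.
        candidate = skill + rng.gauss(0.0, 0.5)
        if candidate > skill:
            skill = candidate
        # Environment adaptation: once the agent solves the current instance,
        # the environment responds with a harder one.
        if skill >= difficulty:
            difficulty = skill + 1.0
        history.append((skill, difficulty))
    return history

if __name__ == "__main__":
    final_skill, final_difficulty = coevolve()[-1]
    print(f"final skill {final_skill:.2f}, final difficulty {final_difficulty:.2f}")
```

Because the task is re-posed each time it is solved, the learner never reaches a fixed target at which it could converge; with a fixed `difficulty`, the same learner would have nothing left to adapt to once it reached 1.0.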

There are several existing feasibility demonstrations of continuous development, which fall under the rubric of "arms races" and "co-evolutionary feedforward loops," but there are only a few key pieces of work to date that help us understand the potential of open-ended learning. Thomas Ray's TIERRA ecosystem of artificial assembly language programs made the first strong claims, but they are difficult to evaluate. Axelrod and Lindgren's work on adaptive Prisoner's Dilemma ecologies shows the right kinds of long-term dynamics, but there is not enough strategic content in the continuous Prisoner's Dilemma game to build complex programs with. Hillis's work on co-evolving sorting networks and difficult sequences pointed out the idea of relative fitness providing diversity, as well as several interesting directions in the exploitation of SIMD machines. There is also another body of work on co-evolutionary learning, especially on pursuit-evasion, or predator/prey, games. However, the best exemplars to date are Tesauro's work on self-learning in backgammon (Tesauro, 1992), which we were able to replicate with simple hill-climbing (Pollack, Blair & Land, 1996), and Sims' recent work on co-evolving the bodies and brains of simulated robots (Sims, 1994).
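The idea of hill-climbing under relative fitness, where a candidate is judged only by how it fares against the current best player rather than by an absolute score, can be sketched as follows. This is a toy under stated assumptions and not the method of any of the works cited above: `play` is a hypothetical stand-in for a real game, players are short weight vectors, and the acceptance rule (win a majority of a few games) is chosen only for simplicity.

```python
import math
import random

def play(a, b, rng):
    """Hypothetical stand-in for a game: returns True if player a beats player b.
    The player with the larger weight sum wins more often, but not always."""
    edge = sum(a) - sum(b)
    p_a_wins = 1.0 / (1.0 + math.exp(-edge))
    return rng.random() < p_a_wins

def hill_climb(generations=300, games=5, noise=0.1, seed=0):
    """Champion-versus-challenger hill-climbing driven only by game outcomes."""
    rng = random.Random(seed)
    champion = [0.0] * 4  # assumed toy representation of a player
    for _ in range(generations):
        # The challenger is a randomly perturbed copy of the champion.
        challenger = [w + rng.gauss(0.0, noise) for w in champion]
        # Relative fitness: the only evaluation is play against the champion.
        wins = sum(play(challenger, champion, rng) for _ in range(games))
        if wins > games // 2:
            champion = challenger
    return champion

if __name__ == "__main__":
    print(hill_climb())
```

Note that no external teacher or fixed benchmark appears anywhere in the loop; the current champion is the whole learning environment, and it changes whenever it is beaten.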

Karl Sims developed a computer graphics simulator of the physics of robots composed of rectangular solids and simple joints, and evolved animated creatures with complex behavior. Sims' virtual robots are clear evidence that under the right simulated conditions, we can automatically develop complex functional forms from simple initial conditions. We are currently building on this work with a simulator for Lego blocks, where the results of virtual evolutionary simulations can be converted into physical reality (Funes & Pollack, 1997).

Funes, P., & Pollack, J. (1997). Evolution of buildable objects. European Conference on Artificial Life. MIT Press.

Pollack, J., Blair, A., & Land, M. (1996). Coevolutionary learning of backgammon. Artificial Life V. MIT Press.

Sims, K. (1994). Evolving 3D morphology and behavior by competition. In Proceedings of the 4th Artificial Life Conference. MIT Press.

Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8:257-277.