Roguelike AI, article 3B: Evolving state machine AI's.


Last time, I introduced genetic algorithms and gave a brief
guide to troubleshooting them, then presented some of the
issues involved in using a genetic algorithm for evolving
stateless AI's of the type covered in article 1.  This
article is about evolving state machine AI's of the type
introduced in article 2, and a little bit about the
limitations of genetic algorithms.

A state machine AI is, fundamentally, a state transition
table.  You can think of it as a 2-dimensional grid where
inputs are mapped against states and each mapping gives a
transition to another state.
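
For instance, a bare-bones transition table could look like this
in Python (the state and input names are invented for
illustration; nothing in these articles fixes them):

  # A transition table as a grid: (state, input) -> next state.
  TRANSITIONS = {
      ("sleeping",  "heard_noise"): "searching",
      ("sleeping",  "took_damage"): "fighting",
      ("searching", "saw_player"):  "fighting",
      ("searching", "gave_up"):     "sleeping",
      ("fighting",  "low_health"):  "fleeing",
      ("fleeing",   "feels_safe"):  "searching",
  }

  def next_state(state, signal):
      # Pairs with no entry leave the state unchanged.
      return TRANSITIONS.get((state, signal), state)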

The "intelligence" of state machine AI's is bound up in
three things: The transition table itself, the intelligence
of the stateless AI that's used in each state, and the
decisions about when to give what input.

In my earlier article on state machine AI's, I focused on
the state transition table: It's central to the idea itself.
I limited the input signals that could happen in a given
state by giving the monster different sensory capabilities,
or a different "observe" function to call in each state.
However, within the limits of what could be observed, I was
hand-coding when the system decided to issue a state
transition signal.

Now, when you're doing things by hand, you have a clear idea
of what you want the monster to do, and the state machine
(the stateless AI's, the transition table, and the decisions
about input) follows from that.  But if you're evolving
monster behavior instead of hand-coding it, you'll have to
take another look at how to decide when to issue a
state-machine input.

You have to look at the inputs available from the dungeon
and from the monster's perceptive power at the moment, and
you have to pick which of several signals to issue....  Wait
a minute.  Isn't that basically the same thing you were
doing with a stateless AI?  It is, isn't it?  With a
stateless AI, you built a tree of decision points and input
parameters, with the leaf nodes leading to actions.
Deciding which state machine inputs to turn on is exactly
the same except that the leaf nodes are signals instead of
actions. So you can use the same structure and method now,
building a tree of decision points where the action is
picking one or more state machine input signals to turn
on.
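
In sketch form (Python; the tests and signal names are made up,
and for brevity this picks a single signal rather than a set):

  # A decision tree whose leaves are state-machine input signals
  # rather than actions.  Internal nodes hold a test over the
  # monster's observations; leaves are plain signal names.
  class Node:
      def __init__(self, test, if_true, if_false):
          self.test, self.if_true, self.if_false = test, if_true, if_false

  def pick_signal(node, observed):
      # Walk the tree until we reach a leaf.
      while isinstance(node, Node):
          node = node.if_true if node.test(observed) else node.if_false
      return node

  # Example: 'observed' is whatever the monster can sense right now.
  tree = Node(lambda o: o["hp"] < o["max_hp"] // 4,
              "low_health",
              Node(lambda o: o["player_visible"], "saw_player", "gave_up"))

  signal = pick_signal(tree, {"hp": 3, "max_hp": 20, "player_visible": False})
  # signal == "low_health"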

We've already got a good method for combining "genetic"
material from decision trees; we just use branch swapping,
the same technique we use for the stateless-AI of each
state.
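
One workable version of branch swapping, reusing the Node class
from the sketch above:

  import copy
  import random

  def subtrees(node):
      # Every position in the tree, leaves included.
      yield node
      if isinstance(node, Node):
          yield from subtrees(node.if_true)
          yield from subtrees(node.if_false)

  def branch_swap(parent_a, parent_b):
      # Child = a copy of parent A with one randomly chosen branch
      # replaced by a copy of a random branch of parent B.
      child = copy.deepcopy(parent_a)
      donor = copy.deepcopy(random.choice(list(subtrees(parent_b))))
      if not isinstance(child, Node):
          return donor    # parent A was a bare leaf; adopt the donor
      node = random.choice([n for n in subtrees(child)
                            if isinstance(n, Node)])
      if random.random() < 0.5:
          node.if_true = donor
      else:
          node.if_false = donor
      return child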

But what about the structure they'll be controlling?  That
pretty much has to co-evolve with the controls.  What
remains is the transition table itself.

But tables are easy to evolve in a genetic algorithm; this
is just an array, so we set up a fixed-size block in the
genome for it.  Each state invokes a stateless-AI with a
given set of parameters: probabilities, thresholds, and
default inputs.

In the state transition table we have the following: For
each state, there is a designated stateless-AI to use, and a
set of parameters for it designating its behaviors and tests
from the set available to the monster.  Each state also
contains a set of tests (again from those available to the
monster) to use as parameters for the input-picker.  For
each state, for each input, there is a datum naming a state
and a function that gives a probability of transition.  I'd
recommend that these functions be limited: you could use a
constant, the output of one of the monster's available
tests, a simple function of a test and a constant, or a
simple function of two tests.
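
Put into code, one row of the table might be encoded like this
(a Python sketch; the field names and the exact set of function
kinds are my own guesses at a reasonable encoding):

  from dataclasses import dataclass

  @dataclass
  class TransitionFn:
      # A deliberately limited probability function: a constant,
      # the output of one test, or a simple function of two operands.
      kind: str      # "const", "test", "test_op_const", or "test_op_test"
      a: float       # the constant, or the index of the first test
      b: float = 0.0 # the second constant or test index, if any
      op: str = "*"  # "+" or "*", for the two-operand kinds

  @dataclass
  class StateGene:
      stateless_ai: int   # which stateless AI this state runs
      ai_params: list     # its probabilities, thresholds, defaults
      picker_tests: list  # tests handed to the input-picker
      transitions: list   # one (next_state, TransitionFn) per input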

And, based on these analyses and assumptions, we're ready to
make a gene map, a combination function, and a mutator for
our state machine AI.

To make a gene map, we're going to have to make some
decisions in advance about the morphology of the state
machine.  For example, we can say a priori that it has sixteen
inputs and sixteen states.  That means our genome needs one
input-picker, 16 stateless-AI-with-parameter sets, 16 sets
of parameters for the input-picker, and 256 transitions with
transition probability functions (see why I advocated
keeping them small and limited?).

So, assuming we develop the stateless-AI's in a separate
step, and allow a thousand nodes (each about two words) for
the input decision tree, a hundred words per state for
stateless-AI parameters, and four coded words per transition
function, we get a "genome" of about 8 thousand words, or 32
kilobytes per individual. The combination function will use
branch swapping for the tree, crossover within each
stateless-AI parameter set, and crossover within the table.
On the principle of storing things whose influence on
fitness depends on each other close together, the table will
be in row-major order with the stateless-AI parameter set in
the middle of each row (so that a state's state transitions
remain as close to its behavior definition as possible).
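
A sketch of that layout, with the numbers above baked in (the
tuples just name what lives where; a real genome would pack
coded words):

  N_STATES = 16
  N_INPUTS = 16

  def layout_row(state):
      # One row of the table: half of the state's transitions, then
      # its stateless-AI parameter block, then the other half, so a
      # state's transitions stay close to its behavior definition
      # when crossover cuts the genome.
      half = N_INPUTS // 2
      return ([("transition", state, i) for i in range(half)]
              + [("ai_params", state)]
              + [("transition", state, i) for i in range(half, N_INPUTS)])

  gene_map = [("input_picker_tree",)]
  for s in range(N_STATES):
      gene_map.extend(layout_row(s))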

32 kilobytes is a long genome, so you won't be able to
support a huge population on a given machine.  But you still
need a reasonably sized population (at least 50 to 100
individuals) or it just won't work.  In order to work with
genomes this
size and a small population, you need to use very low
mutation rates and a very low degree of elitism.  And one of
two other things.  Either you need a whole lot of patience
(reconciliation to runtimes of a few weeks or longer) or you
need a cluster.
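
To put rough numbers on that advice (these are illustrative
starting points, not tested values):

  GA_PARAMS = {
      "population":    100,     # 50 is about the floor
      "mutation_rate": 0.0005,  # per-word chance; keep this very low
      "elite_count":   1,       # very low elitism
  }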


Remember the fitness test we used with the stateless AI's?
Basically, it amounted to this: considering the moves that
each of the candidate AI's would make in a given situation,
which virtual monster would have wound up better off?  With
state-machine AI's there is one additional complication; we
have to know what state the monster is initially in.  But
given that additional information as part of the
"situation", it becomes the same problem as evolving
stateless AI's.
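
As a sketch (the function names and signatures here are
assumptions of the sketch, not a fixed interface):

  def fitness(candidate, situations, goodness, project):
      # Sum, over many sampled situations, the "goodness" of the
      # world after the move the candidate AI chooses.  Each
      # situation now carries the monster's starting state along
      # with the state of the dungeon.
      total = 0.0
      for world, start_state in situations:
          move = candidate.choose_move(world, start_state)
          total += goodness(project(world, move))
      return total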

The "goodness" function for the situation can be the same
one you wrote when you were evaluating stateless AI's; just
take every possible thing you can think of, decide how
"good" or "bad" it is according to this monster's goals in
life, and combine them to get a total score.  The
difficulty, as usual, is projecting ahead.
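
In code, such a goodness function is just a weighted sum, along
these lines (every feature and weight here is a placeholder for
whatever your monster actually cares about):

  def goodness(world):
      # Everything the monster cares about, weighted and summed.
      score  = 2.0 * world.monster_hp
      score += 10.0 * world.damage_dealt_to_player
      score -= 1.0 * world.distance_to_player  # this one likes to be close
      score -= 100.0 if not world.monster_alive else 0.0
      return score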

If the "goodness" function is very good - if it takes
everything your monster cares about into account and
measures its situation realistically, then you don't have to
project it more than one turn into the future, and you can
do that without necessarily simulating the player.  The
"best" state-machine AI, in that case, is the one that most
often produces the highest "goodness" score available from
any move.

This brings up an important point; what we've been trying to
achieve so far is stateless and state-machine AI's, because
they are fast and unambiguous ways to decide what to do.
There may be scores, or hundreds of actions available to a
given monster at a given moment, and we've been trying to
avoid the need to "try and test" each and every one and see
what kind of goodness score we get from doing it.  That kind
of test is acceptable when _evolving_ a state machine, where
we "grade" a state machine by how closely, or how often, its
chosen move matches the best-scoring one; but so far, we've
been assuming that it's more CPU horsepower than we want to
spend on each and every monster at runtime.  The
whole point of the GA is to use the expensive function
(projection plus measuring goodness of the situation) to
evolve a cheap-to-compute function that gives us results as
good.

If on the other hand you decide that it is acceptable to
spend the compute time at runtime, and you don't want to sit
around for weeks while a genetic algorithm converges, then
by all means throw out the state machine and substitute the
function that projects the results of actions and decides
which resulting situation is best.  This will make the AI in
your game a lot "heavier" in terms of CPU, but for most
"reasonable" goodness functions it can still be managed. And
in fact it's the beginning of a reasonable approach for
modeling the player, so we'll get to it in article 4.
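
That substitute function is nearly a one-liner; a sketch, with
the same assumed project and goodness helpers as before:

  def best_move(world, legal_moves, goodness, project):
      # Try every legal move, project the result one turn ahead,
      # and keep whichever move scores best.
      return max(legal_moves, key=lambda m: goodness(project(world, m)))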

The limitations of genetic algorithms are several.  First,
they take a whole lot of computing power to arrive at a
conclusion (remember, "patience or a cluster").  They're
finicky and often mysterious, and multiple runs are usually
needed as the earlier runs point out problems with the way
you set up the problem or with the parameters you're running
the GA itself with (this is a motif you see a lot of with AI
code).  They require you to set up the gene map and say how
it will be interpreted, which forces you to make decisions
about, for example, the number of inputs and the number of
states, whether or not you have a very clear idea what the
results of those decisions will be.

GA's produce systems that are usually incomprehensible and
always at least nine or ten times more complicated than they
need to be; google for "Emergence of Introns" in genetic
algorithms if you want the long version of the story, but
the short version is that there's got to be a bunch of
apparently-useless complication in there before the
evolutionary process really starts to work.  We can expect
introns (complex structures that serve no useful purpose in
the end result, but which make evolution via our mutation
and combination functions easier and less risky) in our
decision trees and in our stateless-AI's, and if we don't
hand-optimize them, which is substantial work, we can expect
the result to take about three times as long to run as it
needs to.  We can expect introns in our state transition
table too; look for sets of redundant states that clearly
serve the same purpose and produce the same behavior, and
that transition wildly among one another.  The evolved
machine will still be ten times as fast as the
try-all-possibilities-and-score-each-result method, but if
you hand-tweak it to get rid of introns, you can make it
both faster and more comprehensible.

And anyway, that, finally, is getting close to all we can do
with state machines and stateless AI's.

Trying to project a "goodness" function more than one turn
into the future, of course, brings us up against the same
hard rock that we ran into when we were developing
stateless-AI's; in order to do this well, we are going to
need something that simulates the player.  And that is
hard.




** A note about clusters and small-scale distributed computing **

When I say "cluster" I don't really mean one of those
amazing special-built clusters and distributed computing
infrastructures, although one of them could certainly be
used to advantage.  Instead, I would advocate building a
tiny console program to run the Genetic Algorithm as fast as
possible, and, while running it, once a minute or so, submit
an HTTP request containing a few genomes, and get an HTTP
response containing a few other genomes.  The server runs a
simple script that responds to each request with a random
set of a few genomes it's received in the last hundred
requests it handled. Exchanging a few genomes now and then
is really all that the system needs to keep the populations
connected and allow examples of new genes and complexes
developed elsewhere to drift between them; it's analogous to
an "island chain" where a few individuals drift between
islands occasionally.
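
A minimal version of that exchange, using only Python's
standard library (the port, the one-genome-per-line wire
format, and the sample size of three are all assumptions of
the sketch; genomes are assumed to travel as encoded strings):

  import random
  from collections import deque
  from http.server import BaseHTTPRequestHandler, HTTPServer

  recent = deque(maxlen=100)  # genomes from roughly the last hundred requests

  class Exchange(BaseHTTPRequestHandler):
      def do_POST(self):
          # Take in a few genomes, reply with a few random recent ones.
          body = self.rfile.read(int(self.headers["Content-Length"]))
          incoming = body.decode().splitlines()
          pool = list(recent)
          reply = "\n".join(random.sample(pool, min(3, len(pool)))).encode()
          recent.extend(incoming)
          self.send_response(200)
          self.send_header("Content-Length", str(len(reply)))
          self.end_headers()
          self.wfile.write(reply)

  # Server:  HTTPServer(("", 8080), Exchange).serve_forever()

  # The matching client call, made once a minute from the GA loop:
  import urllib.request

  def exchange(genomes, url="http://localhost:8080/"):
      data = "\n".join(genomes).encode()
      with urllib.request.urlopen(url, data=data, timeout=10) as resp:
          return resp.read().decode().splitlines()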

Once you've got your GA program and your server script, find
a LAN where you can set it up and run it.  You can do it at
work if you get permission first; you can either ask your
coworkers to run your little console program and plug in a
laptop to act as the server, or you can take over the office
LAN while the office workers are away for Christmas
vacation, or whatever.  Don't do this without permission at
a job you want to keep; a good relationship with your
coworkers and your boss is needed.  Or, you can just hook up
all the
computers in your house and wait for it to work.


Ray Dillinger