Roguelike Intelligence - Intrinsic Information and State Machine AIs

Roguelike AI, part 2A:

State machine AI for more complex monsters.

Stateless AIs as described in part 1, although they make your life as an
implementor fairly simple, also make your monsters fairly simple.  Stateless
AI's, generally, become easy for the players to predict, and thus detract from
the challenge of the game.

Using state machine AIs, you can challenge the player with more interesting
monsters.  State means that the decision routines of each monster have
recourse to stored information which is both intrinsic (internal to the
decision code for a particular monster) and mutable (changeable by the
decision code in a particular monster).

We have already seen some examples of state in monsters, but these do not
represent state in monster AI's: The "typical AI" presented in article 1B used
information about the monster's level of damage to make its charge-or-flee
decision each round.  Similarly, the decision routine can-attack-player in the
"archer AI" was presumed to check whether the monster still had arrows
available.  In both cases, these are extrinsic information to the monster
AI. These refer to the state of the model of the physical body of the monster
and its dungeon surroundings and equipment; these are things that the AI
itself can't change arbitrarily.

(Note, archers are easier when you just presume they have an infinite supply
of arrows; then you don't have to keep track of this information.  Of course,
then players learn to "farm" archers for an infinite number of arrows...). 

Finally, such things as "morale" and "too-close-to-player" and
"too-far-from-player" are considered to be extrinsic as long as the monster AI
doesn't change these values.  

You can get some interesting behavior in stateless AI's by creatively using
extrinsic information. But state is fairly easy to add, and with a state
machine you can create almost any degree of complex behavior in monsters. 

There are two basic places to introduce state to your monster decision code: I
call them "learned information" and "tactical state".  Learned information is
just that: something the monster has learned in the course of its brief life
and dungeon encounters.  Tactical state is high-level "what was I doing" or
"what is my current goal" information that is used to choose between several
different stateless AI's depending on the current plan.

As an example of learned information, consider an AI for a goblin which sets a
memory variable if that goblin ever sees the player kill another goblin or
stronger monster with a single attack, or sees the player kill more than four
goblins in less than eight rounds.  Before the goblin sees that, it will treat
the player as just another @-hooman, which goblins regard as good sources of
food.  But after it sees that, it will regard the player as a threat to be
avoided, attacked at range, trapped, or ganged up on.  The goblin AI need
never check this variable explicitly; instead, it just calls a routine named
"observe" every round which updates information that its other routines use,
and then it's AI, which otherwise looks just like a stateless AI, is calling
functions like "player-is-too-powerful" which refer to things that "observe"
has recorded.

You can implement "player-is-too-powerful" for a stateless AI too; in that
case it will probably read the character's level, weapon damage, or kill list
in order to make its assessment.  But this is information the goblin shouldn't
have access to; giving its AI recourse to it seems to make the critter
omniscient in some ways.  Of course, this omniscience is no stranger than that
of monsters who don't have line-of-sight or any particular detection abilities
bunching up on the other side of a wall trying to get closer to the player
character, but the omnscient goblin presents the player no way to influence
it.  

What you want is for the player to learn something about cowardly monsters who
attack in packs; if you kill a few of them, quickly, where all the others can
see, then the others will run away or change tactics.  Now knowing this, the
player has a choice to make about how he wants to manage his goblin encounter.
He can switch to a less-damaging weapon, pause a lot in combat, fight a
running battle while giving ground, and (slowly) slaughter the entire goblin
tribe without them twigging to the fact that he's toying with them; or he can
just slice a few of them in half with his artifact sword and let the rest run
away.

If the goblins are omniscient in their player-is-too-powerful assessment, the
player doesn't get this tactical decision to make, and the gameplay gets one
element less interesting.

As an example of tactical planning state, consider a dragon who is encountered
asleep. His tactical state is "SLEEPING," and that means that the only things
he can do in a given round are "sleep" and "wake up".  Now, the instant he
wakes up, his tactical state becomes "enraged" if there's treasure missing,
"hungry" or "curious" otherwise. Depending on which state he enters, he will
use different decision code in subsequent rounds.

You'd write this part of the dragon AI as something like this:


        DRAGON AI (partial)

   State:  Obs: Action:          input L    input M    input H   NULL

   SLEEPING L   asleep-ai        WAKING:1              HUNGRY:1
   WAKING   C   none                        ENRAGED:1  HUNGRY:1  CURIOUS:1
   ENRAGED  C   typical-ai, p1                                   WAKING:1
   HUNGRY   C   typical-ai, p2              ENRAGED:1
   CURIOUS  A   curious-ai                  ENRAGED:1  HUNGRY:1  BORED:0.1
   

The inputs I've labeled L, M, H, and NULL mean, respectively, loud noise,
missing treasure, hungry, and none of the above.  Each state, except for the
WAKING state, is associated with "Action:", a stateless AI that chooses the
action for monsters in that state.  The WAKING state is associated with
"none", which means that, instead of running a stateless AI and performing the
resulting action the system immediately checks further inputs and shifts to a
different state.  Finally, the column labeled Obs: is for Observation, which
defines a creature's alertness in general.  The SLEEPING AI has observation L,
which means it observes loud noises; most of the other states have observation
C, which means they observe anything casually visible.  The Curious state has
observation A, which means all; it observes carefully and intentionally, as
though the dragon had performed a "look" command.

Note that the typical-AI is engaged for both "enraged" and "hungry" behavior,
with different sets of parameters.  This provides two different sets of
similar behavior, but with different details.  This is where the state machine
sets the information that was formerly regarded as extrinsic to the AI for the
stateless AI's. 

The numbers that follow each state name in the table entries are probabilities
of transition. The probabilities of transition are almost all 1, which makes
it nearly deterministic.  We can have more than one input true at a time; For
example, a sleeping monster could wake up to both hunger and loud noise, and
then would he be waking, or hungry? As matters stand, we check inputs from
left to right and therefore he'll be WAKING.

But look what happens when our critter is CURIOUS and gets no input.  Next
round, he has a 90% chance of being CURIOUS and a 10% chance of being BORED.
So this is, actually, an NFSA.  

pseudocode for operating the DFSA looks something like this, although I have
simplified it by pretending I don't have to bother with parameter lists for
the stateless AI's:


acted = 0;
while (!acted)
{
    observe(statemachine[obs][currentstate]);
    shifted = 0;
    for (inputs=FIRSTINPUT; inputs < LASTINPUT && !shifted; inputs++)
    {
       if (input_is_true(input))
       {  /* note that what's stored in the statemachine is an expression, 
             not necessarily just a number. getshiftprob substitute in 
             values from the monster's extrinsic info and solves the expr.*/
          probshift = getshiftprob 
                 (statemachine[input][currentstate].probshift);
          if random() < probshift
          {
             currentstate = statemachine[input][currentstate].state;
             shifted = 1;
          }
       }
    }
    if statemachine[action][currentstate] != NULL
    {
       do_stateless_ai(statemachine[action][currentstate]);
       acted = 1;
    }      
}

Given this code, plus a few mild complications like parameter lists for the
stateless AI's,  You can make arbitrarily complex state machines simply as
arrays of states versus inputs, where each state additionally has an
observation power and a stateless AI to pick actions. Your NFSA's for complex
monsters can just be arrays, simple to store, handle, read from file, and so
on.  For each input you'll need additionally a way to detect that input.

It's common for roguelikes to bypass the need for a general "observe"
function. For example, the dragon might have a chance to wake up, per round,
if the character moves, depending on the character's stealth.  This sort of
thing can be reasonably effective (meaning that fixing it doesn't become a
high priority) but always seems to wind up with flocks of weird special cases
and inconsistencies.

It is more general to mediate things through the dungeon itself: When
something moves, store noise information in the dungeon map with the current
round number, and when something is listening, have it check the dungeon map
for "recent" (ie, since its last action) noise through an observe routine.
Basically, the idea is that any "observable" act should leave evidence in the
dungeon map, and "observe" should pick up that evidence and notify the
monsters of what they have seen, heard, or smelled.  Observe will become
complicated: there's no getting around that.  But it's best, I think, to have
one observe routine, or as few as you can possibly manage, because it keeps
things consistent when everything that's supposed to be related uses the same
routine.  

If you are an OO fan, you may choose to bypass storing information in the
dungeon map itself, instead favoring explicitly sending messages representing
noise or observable acts from one actor to another in the dungeon.  This works
if you are disciplined, but it still manages to leave a lot more room for
unhandled cases than the dungeon-map-as-sensorium model - and the
dungeon-map-as-sensorium model, which is better, can also be implemented in
OO.

If you don't do either of these approaches, having monsters respond to dungeon
events becomes a huge mass of exception-based code with quirky holes and
problems.  For example, in games where the chance for a sleeping monster to
wake up is based on player movement and player stealth, you can have fireballs
from traps go off next to the sleeping monster's head, or a herd of rhinoceri
wearing plate barding tramp through the room, and the monster will go on
snoozing peacefully.  And if, in the process of making all your sleeping
monsters wake-up calls a bit more reliable, you overlook other behaviors that
are (or should be) noise sensitive, you will develop monsters that hear more
when they're asleep than when they're awake.  Many examples of such
idiosyncratic behavior can be found in existing games.

Oh, speaking of existing games, remember what I mentioned about angband's
monsters being _mostly_ stateless?  In fact they have three states.  Angband's
monster AI, expressed in these terms, may look roughly like this:

State:      obs.   Action:        input-P       input-D              NULL

SLEEPING:    C    asleep-ai       ATTACKING:1   ATTACKING:1           
ATTACKING:   A    typical-ai(p1)                RETREATING:dt/maxhp
RETREATING:  A    typical-ai(p2)                BERSERKING:0.05       ATTACKING:1
BERSERKING   A    typical-ai(p1)                RETREATING:dt/(4*maxhp)

input-P means "player has been detected".
input-D means "damaged".
dt is damage taken;
maxhp is maximum hitpoints. 

This allows angband monsters to decide to go on the attack again even 
while badly damaged, which the typical-AI didn't do.  These monsters 
start out asleep, then go on the attack when they wake up. Each round 
they are attacking, they have a chance to start retreating that depends
on their damage taken and max hp.  Once they retreat, they have a chance
to start attacking again, only this time it's berserking, with a much 
lower chance of retreating.  They alternate between berserking and 
retreating for as long as they're damaged.  If one becomes undamaged, 
it will revert to ordinary attacking.  

Observation is very rudimentary in Angband; the "observe" routine (or 
whatever angband uses for its functionality) checks the amount of 
damage the monster has taken, checks for lineofsight to the character, 
and, I think, that's it.  I don't think these monsters are even aware 
of individual attacks, just of how badly damaged they are at that 
moment.  

typical-ai(P1) uses a "too close to player" function that is never true and 
                    a "too far from player" function that is never false.
typical-ai(P2) uses a "too close to player" function that is never false and 
                    a "too far from player" function that is never true.


Now, this looks simplistic, certainly; but remember that a lot of stuff is
folded into typical-ai.  And if typical-ai looked simplistic, remember that
ten times that amount of stuff is folded into the various behaviors and checks
it calls.



Anyway, this is my general architecture for monster AI: 

behaviors and functions are implemented directly in the source language.

Stateless AI's are implemented in terms of a repertoire of interchangeable
behaviors and functions.  The Stateless AI serves as a "pattern" that brings
these things together, but different kinds of behavior can fill in its 
specific elements. 

and State machine AI's are implemented in terms of stateless AI's with
parameter lists, and state transition tables.  In the same way that the 
interchangeable behaviors and functions fit into the stateless AIs, the 
stateless AI's fit, as interchangeable building blocks, into the pattern
formed by the state machine.

State transition tables are implemented in terms of the above routine for
doing transitions and calling stateless AI's, and "observe", which runs over
the dungeon map looking for evidence of specific observed events and changes
intrinsic information that the behaviors and functions of a stateful AI can 
access.

I want to make one point about the "observe" function.  You can spend as much
time as you like writing it.  Even a very simplistic "observe" such as that in
Angband is functional.  But there is always and forever stuff you can add to
it and more patterns it hasn't learned.  You have to pick your own point of
diminishing returns here.

In the next article, before moving on to minimaxing, neural networks and
genetic algorithms, I'll talk about "reactive" intelligence; how to store
information about how to act on the objects that a creature interacts with.

        Bear
Roguelike Intelligence - Intrinsic Information and State Machine AIs

Navigation menu

Search