Roguelike Intelligence - Intrinsic Information and State Machine AIs

From RogueBasin
Jump to navigation Jump to search

Roguelike AI, part 2A

State machine AI for more complex monsters.

Stateless AIs as described in part 1, although they make your life as an implementor fairly simple, also make your monsters fairly simple. Stateless AI's, generally, become easy for the players to predict, and thus detract from the challenge of the game.

Using state machine AIs, you can challenge the player with more interesting monsters. State means that the decision routines of each monster have recourse to stored information which is both intrinsic (internal to the decision code for a particular monster) and mutable (changeable by the decision code in a particular monster).

We have already seen some examples of state in monsters, but these do not represent state in monster AI's: The "typical AI" presented in article 1B used information about the monster's level of damage to make its charge-or-flee decision each round. Similarly, the decision routine can-attack-player in the "archer AI" was presumed to check whether the monster still had arrows available. In both cases, these are extrinsic information to the monster AI. These refer to the state of the model of the physical body of the monster and its dungeon surroundings and equipment; these are things that the AI itself can't change arbitrarily.

(Note, archers are easier when you just presume they have an infinite supply of arrows; then you don't have to keep track of this information. Of course, then players learn to "farm" archers for an infinite number of arrows...).

Finally, such things as "morale" and "too-close-to-player" and "too-far-from-player" are considered to be extrinsic as long as the monster AI doesn't change these values.

You can get some interesting behavior in stateless AI's by creatively using extrinsic information. But state is fairly easy to add, and with a state machine you can create almost any degree of complex behavior in monsters.

There are two basic places to introduce state to your monster decision code: I call them "learned information" and "tactical state". Learned information is just that: something the monster has learned in the course of its brief life and dungeon encounters. Tactical state is high-level "what was I doing" or "what is my current goal" information that is used to choose between several different stateless AI's depending on the current plan.

As an example of learned information, consider an AI for a goblin which sets a memory variable if that goblin ever sees the player kill another goblin or stronger monster with a single attack, or sees the player kill more than four goblins in less than eight rounds. Before the goblin sees that, it will treat the player as just another @-hooman, which goblins regard as good sources of food. But after it sees that, it will regard the player as a threat to be avoided, attacked at range, trapped, or ganged up on. The goblin AI need never check this variable explicitly; instead, it just calls a routine named "observe" every round which updates information that its other routines use, and then it's AI, which otherwise looks just like a stateless AI, is calling functions like "player-is-too-powerful" which refer to things that "observe" has recorded.

You can implement "player-is-too-powerful" for a stateless AI too; in that case it will probably read the character's level, weapon damage, or kill list in order to make its assessment. But this is information the goblin shouldn't have access to; giving its AI recourse to it seems to make the critter omniscient in some ways. Of course, this omniscience is no stranger than that of monsters who don't have line-of-sight or any particular detection abilities bunching up on the other side of a wall trying to get closer to the player character, but the omnscient goblin presents the player no way to influence it.

What you want is for the player to learn something about cowardly monsters who attack in packs; if you kill a few of them, quickly, where all the others can see, then the others will run away or change tactics. Now knowing this, the player has a choice to make about how he wants to manage his goblin encounter. He can switch to a less-damaging weapon, pause a lot in combat, fight a running battle while giving ground, and (slowly) slaughter the entire goblin tribe without them twigging to the fact that he's toying with them; or he can just slice a few of them in half with his artifact sword and let the rest run away.

If the goblins are omniscient in their player-is-too-powerful assessment, the player doesn't get this tactical decision to make, and the gameplay gets one element less interesting.

As an example of tactical planning state, consider a dragon who is encountered asleep. His tactical state is "SLEEPING," and that means that the only things he can do in a given round are "sleep" and "wake up". Now, the instant he wakes up, his tactical state becomes "enraged" if there's treasure missing, "hungry" or "curious" otherwise. Depending on which state he enters, he will use different decision code in subsequent rounds.

You'd write this part of the dragon AI as something like this:


       DRAGON AI (partial)

  State:  Obs: Action:          input L    input M    input H   NULL

  SLEEPING L   asleep-ai        WAKING:1              HUNGRY:1
  WAKING   C   none                        ENRAGED:1  HUNGRY:1  CURIOUS:1
  ENRAGED  C   typical-ai, p1                                   WAKING:1
  HUNGRY   C   typical-ai, p2              ENRAGED:1
  CURIOUS  A   curious-ai                  ENRAGED:1  HUNGRY:1  BORED:0.1
  

The inputs I've labeled L, M, H, and NULL mean, respectively, loud noise, missing treasure, hungry, and none of the above. Each state, except for the WAKING state, is associated with "Action:", a stateless AI that chooses the action for monsters in that state. The WAKING state is associated with "none", which means that, instead of running a stateless AI and performing the resulting action the system immediately checks further inputs and shifts to a different state. Finally, the column labeled Obs: is for Observation, which defines a creature's alertness in general. The SLEEPING AI has observation L, which means it observes loud noises; most of the other states have observation C, which means they observe anything casually visible. The Curious state has observation A, which means all; it observes carefully and intentionally, as though the dragon had performed a "look" command.

Note that the typical-AI is engaged for both "enraged" and "hungry" behavior, with different sets of parameters. This provides two different sets of similar behavior, but with different details. This is where the state machine sets the information that was formerly regarded as extrinsic to the AI for the stateless AI's.

The numbers that follow each state name in the table entries are probabilities of transition. The probabilities of transition are almost all 1, which makes it nearly deterministic. We can have more than one input true at a time; For example, a sleeping monster could wake up to both hunger and loud noise, and then would he be waking, or hungry? As matters stand, we check inputs from left to right and therefore he'll be WAKING.

But look what happens when our critter is CURIOUS and gets no input. Next round, he has a 90% chance of being CURIOUS and a 10% chance of being BORED. So this is, actually, an NFSA.

pseudocode for operating the DFSA looks something like this, although I have simplified it by pretending I don't have to bother with parameter lists for the stateless AI's:

acted = 0;
while (!acted)
{
    observe(statemachine[obs][currentstate]);
    shifted = 0;
    for (inputs=FIRSTINPUT; inputs < LASTINPUT && !shifted; inputs++)
    {
       if (input_is_true(input))
       {  /* note that what's stored in the statemachine is an expression, 
             not necessarily just a number. getshiftprob substitute in 
             values from the monster's extrinsic info and solves the expr.*/
          probshift = getshiftprob 
                 (statemachine[input][currentstate].probshift);
          if random() < probshift
          {
             currentstate = statemachine[input][currentstate].state;
             shifted = 1;
          }
       }
    }
    if statemachine[action][currentstate] != NULL
    {
       do_stateless_ai(statemachine[action][currentstate]);
       acted = 1;
    }      
}

Given this code, plus a few mild complications like parameter lists for the stateless AI's, You can make arbitrarily complex state machines simply as arrays of states versus inputs, where each state additionally has an observation power and a stateless AI to pick actions. Your NFSA's for complex monsters can just be arrays, simple to store, handle, read from file, and so on. For each input you'll need additionally a way to detect that input.

It's common for roguelikes to bypass the need for a general "observe" function. For example, the dragon might have a chance to wake up, per round, if the character moves, depending on the character's stealth. This sort of thing can be reasonably effective (meaning that fixing it doesn't become a high priority) but always seems to wind up with flocks of weird special cases and inconsistencies.

It is more general to mediate things through the dungeon itself: When something moves, store noise information in the dungeon map with the current round number, and when something is listening, have it check the dungeon map for "recent" (ie, since its last action) noise through an observe routine. Basically, the idea is that any "observable" act should leave evidence in the dungeon map, and "observe" should pick up that evidence and notify the monsters of what they have seen, heard, or smelled. Observe will become complicated: there's no getting around that. But it's best, I think, to have one observe routine, or as few as you can possibly manage, because it keeps things consistent when everything that's supposed to be related uses the same routine.

If you are an OO fan, you may choose to bypass storing information in the dungeon map itself, instead favoring explicitly sending messages representing noise or observable acts from one actor to another in the dungeon. This works if you are disciplined, but it still manages to leave a lot more room for unhandled cases than the dungeon-map-as-sensorium model - and the dungeon-map-as-sensorium model, which is better, can also be implemented in OO.

If you don't do either of these approaches, having monsters respond to dungeon events becomes a huge mass of exception-based code with quirky holes and problems. For example, in games where the chance for a sleeping monster to wake up is based on player movement and player stealth, you can have fireballs from traps go off next to the sleeping monster's head, or a herd of rhinoceri wearing plate barding tramp through the room, and the monster will go on snoozing peacefully. And if, in the process of making all your sleeping monsters wake-up calls a bit more reliable, you overlook other behaviors that are (or should be) noise sensitive, you will develop monsters that hear more when they're asleep than when they're awake. Many examples of such idiosyncratic behavior can be found in existing games.

Oh, speaking of existing games, remember what I mentioned about angband's monsters being _mostly_ stateless? In fact they have three states. Angband's monster AI, expressed in these terms, may look roughly like this:

State:      obs.   Action:        input-P       input-D              NULL

SLEEPING:    C    asleep-ai       ATTACKING:1   ATTACKING:1           
ATTACKING:   A    typical-ai(p1)                RETREATING:dt/maxhp
RETREATING:  A    typical-ai(p2)                BERSERKING:0.05       ATTACKING:1
BERSERKING   A    typical-ai(p1)                RETREATING:dt/(4*maxhp)

input-P means "player has been detected".
input-D means "damaged".
dt is damage taken;
maxhp is maximum hitpoints. 

This allows angband monsters to decide to go on the attack again even while badly damaged, which the typical-AI didn't do. These monsters start out asleep, then go on the attack when they wake up. Each round they are attacking, they have a chance to start retreating that depends on their damage taken and max hp. Once they retreat, they have a chance to start attacking again, only this time it's berserking, with a much lower chance of retreating. They alternate between berserking and retreating for as long as they're damaged. If one becomes undamaged, it will revert to ordinary attacking.

Observation is very rudimentary in Angband; the "observe" routine (or whatever angband uses for its functionality) checks the amount of damage the monster has taken, checks for lineofsight to the character, and, I think, that's it. I don't think these monsters are even aware of individual attacks, just of how badly damaged they are at that moment.

typical-ai(P1) uses a "too close to player" function that is never true and 
                    a "too far from player" function that is never false.
typical-ai(P2) uses a "too close to player" function that is never false and 
                    a "too far from player" function that is never true.


Now, this looks simplistic, certainly; but remember that a lot of stuff is folded into typical-ai. And if typical-ai looked simplistic, remember that ten times that amount of stuff is folded into the various behaviors and checks it calls.

Anyway, this is my general architecture for monster AI:

behaviors and functions are implemented directly in the source language.

Stateless AI's are implemented in terms of a repertoire of interchangeable behaviors and functions. The Stateless AI serves as a "pattern" that brings these things together, but different kinds of behavior can fill in its specific elements.

and State machine AI's are implemented in terms of stateless AI's with parameter lists, and state transition tables. In the same way that the interchangeable behaviors and functions fit into the stateless AIs, the stateless AI's fit, as interchangeable building blocks, into the pattern formed by the state machine.

State transition tables are implemented in terms of the above routine for doing transitions and calling stateless AI's, and "observe", which runs over the dungeon map looking for evidence of specific observed events and changes intrinsic information that the behaviors and functions of a stateful AI can access.

I want to make one point about the "observe" function. You can spend as much time as you like writing it. Even a very simplistic "observe" such as that in Angband is functional. But there is always and forever stuff you can add to it and more patterns it hasn't learned. You have to pick your own point of diminishing returns here.

In the next article, before moving on to minimaxing, neural networks and genetic algorithms, I'll talk about "reactive" intelligence; how to store information about how to act on the objects that a creature interacts with.

       Ray Dillinger