
Why your AI agent needs a state machine

Your agent approved something before it was reviewed. So you tightened the prompt — "always review first." A week later it deployed before the tests ran. You added another sentence. Then it happened a third time, in a way your sentences hadn't predicted.

At some point you have to stop blaming the prompt. The prompt isn't the problem. The problem is that your agent has no idea where it is.

An agent with no sense of place

A model call doesn't have a location. The agent's only sense of "where things stand" is a growing pile of conversation — and nothing in that pile authoritatively answers the two questions that actually matter: what step are we on? and what is allowed right now?

So the agent infers. It reads back over the transcript, forms an impression of how far along it is, and decides what to do next from that impression. Inference like that works most of the time, which is the trap — it works often enough that you ship it, and drifts often enough that it hurts. An agent guessing its own position will eventually guess wrong, and a wrong guess about "what's allowed now" is the bug you keep patching with prose.

The unglamorous fix

There's a pattern built for exactly this, and it is not exciting. A state machine: a finite set of named states, and transitions between them, where each transition is legal only from certain states. You drew one in your first computing course. The idea is roughly 70 years old. It is the opposite of cutting-edge.

That's not a strike against it. That's the reason to use it.

A state machine is something you can draw on a whiteboard, read top to bottom, diff in a pull request, and check for mistakes before it ever runs. It holds no surprises. And when the thing you're trying to govern is a probabilistic model that will surprise you, the one place you want zero surprises is the structure doing the governing. An exciting abstraction you can't fully reason about is worse than a boring one you can.
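As a rough illustration of "check for mistakes before it ever runs," here is a minimal sketch of a static check over a transition table. The table shape, the state names, and the validate function are all hypothetical and are not taken from any particular tool; the point is only that a plain data structure can be inspected without executing anything.

```python
def validate(workflow, initial):
    """Return a list of problems found in a {state: {action: target}} map."""
    problems = []
    if initial not in workflow:
        problems.append(f"initial state {initial!r} is not defined")
    for state, transitions in workflow.items():
        for action, target in transitions.items():
            # A transition may only point at a state the author declared.
            if target not in workflow:
                problems.append(
                    f"{state}.{action} targets undefined state {target!r}"
                )
    return problems


# Hypothetical workflow with a deliberate typo in one target.
draft = {
    "drafted": {"run_review": "reviewed"},
    "reviewed": {"approve": "publishd"},
    "published": {},
}

print(validate(draft, "drafted"))
# ["reviewed.approve targets undefined state 'publishd'"]
```

The typo is caught by reading the definition, not by waiting for an agent to walk into it.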

Every tool is a transition

This is the move mcp-flowgate makes. A workflow is a state machine, and every tool call is a transition between states.

The simplest possible workflow has one state: call any tool, end up back where you started. That shape is just a flat list of tools — and it's the degenerate case of the same idea. Internally it even compiles to a one-state machine, named proxy_default. So you're never choosing "state machine or not." You're choosing how many states. Zero extra states gives you a plain tool list. Add states when the process has an order worth enforcing. It's the same engine and the same seven tools facing the model either way — you've just told the engine more about the shape of the work.
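To make the degenerate case concrete, here is a sketch of how a flat tool list could be expressed as a one-state machine. This is not mcp-flowgate's actual compiler: only the state name proxy_default comes from the text above, while compile_flat_tools, the dictionary layout, and the tool names are assumptions made for illustration.

```python
def compile_flat_tools(tool_names):
    """Wrap a flat tool list as a one-state workflow where every tool loops back."""
    state = "proxy_default"
    return {
        "initialState": state,
        "states": {
            state: {
                # Each tool is a transition whose target is the state it started in.
                "transitions": {name: {"target": state} for name in tool_names}
            }
        },
    }


# Hypothetical tool names.
workflow = compile_flat_tools(["search_docs", "fetch_page"])
print(list(workflow["states"]))  # ['proxy_default']
```

Adding a second state to that dictionary is the whole difference between a plain tool list and an ordered process.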

Four things a state machine gives an agent — that a prompt can't

1. A definition of "now." The workflow has a current state, and it's a fact, not an impression. The agent doesn't reconstruct where it is from conversation history — it asks, and the answer is authoritative. "Now" becomes data.

2. A definition of "legal." A transition is legal only from the states that declare it. The approve action isn't something the agent is discouraged from taking early — it does not exist as a move until the workflow is in the state that offers it. The set of things the agent can do collapses, at every step, to the set of things that are correct to do.

3. A recorded path. Every transition advances the state and bumps a version counter. How the workflow reached where it is isn't something you reconstruct from logs after an incident — it is the state history, recorded as it happened, with an audit event for every step. The path is a first-class artifact, not forensic guesswork.

4. A split between who authors the machine and who runs it. This is the keystone. The person who designs the state machine is not the agent that walks it. The agent submits transitions; it cannot add one. It cannot redraw the map. Whoever wrote the workflow decided the shape of what's possible, and the agent only moves inside that shape. A prompt can never give you this — a prompt is a suggestion handed to the same entity that's free to reinterpret it.
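The four properties above can be sketched in a few lines. Everything here is hypothetical (the WorkflowRun class, its method names, and the history tuple layout are assumptions, not mcp-flowgate's API), and an in-process read-only view only gestures at the author/runner split, which a real system would more plausibly enforce at a server boundary.

```python
from types import MappingProxyType


class WorkflowRun:
    """One run of an authored workflow, given as {state: {action: target}}."""

    def __init__(self, definition, initial_state):
        # 4. Read-only view: the runner can read the map but not add transitions.
        self._definition = MappingProxyType(
            {s: MappingProxyType(dict(t)) for s, t in definition.items()}
        )
        self.state = initial_state  # 1. "now" is data, not an impression
        self.version = 0
        self.history = []  # 3. the recorded path

    def legal_actions(self):
        # 2. Only transitions declared by the current state exist as moves.
        return sorted(self._definition[self.state])

    def submit(self, action):
        if action not in self._definition[self.state]:
            raise ValueError(f"{action!r} is not offered in state {self.state!r}")
        previous, self.state = self.state, self._definition[self.state][action]
        self.version += 1
        self.history.append((self.version, previous, action, self.state))
        return self.state


run = WorkflowRun(
    {"drafted": {"run_brand_review": "brand_reviewed"}, "brand_reviewed": {}},
    "drafted",
)
print(run.legal_actions())      # ['run_brand_review']
run.submit("run_brand_review")
print(run.state, run.version)   # brand_reviewed 1
```

Calling run.submit("approve") at this point raises a ValueError, because brand_reviewed declares no such transition in this toy definition.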

What that looks like

Here's a governed content-publishing workflow — trimmed to its skeleton:

content-publish.yaml
workflows:
  content_publish:
    initialState: idea
    states:
      idea:
        transitions:
          create_outline: { target: outlined }
      outlined:
        transitions:
          write_draft: { target: drafted }
      drafted:
        transitions:
          run_brand_review: { target: brand_reviewed }
      brand_reviewed:
        transitions:
          request_approval: { target: awaiting_approval }
      awaiting_approval:
        transitions:
          approve: { target: published, actor: human }
      published:
        terminal: true

Look at what the agent can't do here. From drafted, the only transition is run_brand_review. There is no approve reachable from drafted, and the only transition into published is approve, which is declared only out of awaiting_approval and is tagged actor: human.

So an agent holding a finished draft cannot publish it. Not "shouldn't" — cannot. To get content live it has to walk every state, and one of those states hands control to a person. The structure makes the wrong thing unreachable, not merely discouraged.
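That "cannot" is something you can check mechanically. The sketch below transcribes the workflow above into a dictionary by hand and searches for every state an agent could reach on its own. It assumes a particular reading of the actor tag (an untagged transition is open to anyone, a tagged one only to that actor); the reachable function and the "agent" actor name are hypothetical.

```python
from collections import deque

# The workflow from content-publish.yaml, transcribed by hand.
WORKFLOW = {
    "idea": {"create_outline": {"target": "outlined"}},
    "outlined": {"write_draft": {"target": "drafted"}},
    "drafted": {"run_brand_review": {"target": "brand_reviewed"}},
    "brand_reviewed": {"request_approval": {"target": "awaiting_approval"}},
    "awaiting_approval": {"approve": {"target": "published", "actor": "human"}},
    "published": {},
}


def reachable(workflow, start, actor):
    """States reachable from start using only transitions this actor may take.

    Assumption: a transition with no "actor" tag is open to anyone, and a
    tagged transition is open only to the named actor.
    """
    seen, queue = {start}, deque([start])
    while queue:
        for spec in workflow[queue.popleft()].values():
            allowed = spec.get("actor", actor) == actor
            if allowed and spec["target"] not in seen:
                seen.add(spec["target"])
                queue.append(spec["target"])
    return seen


print("published" in reachable(WORKFLOW, "idea", actor="agent"))  # False
print("published" in reachable(WORKFLOW, "idea", actor="human"))  # True
print("approve" in WORKFLOW["drafted"])                           # False
```

Under that assumption, the agent can get as far as awaiting_approval and no further; the last step belongs to a person.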

Compare that to a flat tool list, where publish is just a tool — always callable, always one decision away — and "don't publish before review" is a sentence in a prompt the model is free to misread on a long, tired afternoon. Same model, same task. One setup makes the mistake impossible; the other makes it improbable. Improbable things happen.

Powerful for the right problem

Now the honest part, because a state machine is not a universal good. It is genuinely the wrong tool for plenty of work.

A pure transform — take some JSON, return some JSON — has no states; modeling it as a one-state machine is ceremony with no payoff. A stateless lookup doesn't need a version counter or a transition graph. If you wrap something that has no order, no rules about legality, and no need for a history, you've added structure that earns nothing back. mcp-flowgate's own docs say it plainly: if you have one tool and no governance needs, don't put a gateway in front of it.

The pattern earns its keep only when the problem has four properties: there's an order to things; some actions are legal only at certain points; you need to know how you arrived; and the rules are set by someone other than the actor.

And there's the whole argument, because look at that list again. An AI agent doing real work moves through a process. Only some actions are safe at each point. You need the trail when something goes wrong. And the agent must not be the one deciding what it's allowed to do. Agent governance doesn't just happen to fit a state machine; it has all four properties, exactly. It isn't merely a problem a state machine can model. It's the problem a state machine is for.

The pattern is 70 years old and the problem is brand new, and they fit because the shape was the same all along. We just hadn't met this version of the problem yet.

Reach for the boring tool

The exciting part of an AI agent is the model. The part that makes it something you'd let near production is the least exciting pattern in the textbook — finite states, explicit transitions, a current position you can name.

So the next time your agent does something out of order, don't reach for a longer prompt. You've tried that; prompts don't give an agent a sense of place. Reach for the boring tool. This is the problem it was waiting for.

The mental model — "every tool is a transition" — is laid out in the docs, and the workflows guide walks through building one state by state.