Deterministic chaining: when you don't need the LLM to decide
Your deploy pipeline is four steps: lint, test, build, deploy. The first three either pass or they don't — there's no judgment in them. But your agent treats all four the same way. It stops and thinks before every one, including the three that were never decisions.
That's wasted work, and it's the most wasteful kind: the model spending its expensive attention on steps that have a known, computable outcome.
The round-trip problem
Walk through what happens without chaining. The model starts the workflow. It gets back the lint step. It reads the response. It reasons about it. It picks the transition. It submits. It waits for lint to finish. It gets the result. It reasons again. It picks the next transition. It submits. It waits for tests. And on.
Every one of those is a full LLM round trip, and each round trip costs you four things:
- Input tokens to re-read the workflow state
- Output tokens to "reason" about a choice that isn't one
- Network latency, end to end
- API cost that scales with how long the conversation has grown
Put rough numbers on it. Take a 10-step pipeline where 8 of the steps are computable. Without chaining, that's 8 round trips that exist only to produce the answer "yes, proceed." Each one re-sends the workflow state the model has to read, and spends output tokens (the expensive kind, often priced at 3 to 5 times the rate of input) reasoning about a step with one possible outcome. Eight times. Per run. Every run. The pipeline that should cost two model interactions costs ten.
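A back-of-envelope version of that arithmetic, as a sketch. Only the shape of the pipeline (10 steps, 8 computable) comes from the text; the token counts and the 4x output-price ratio are made-up round figures, and the function names are hypothetical.

```python
# Illustrative numbers only: token counts and prices are assumptions.
STATE_TOKENS = 1_500     # input tokens to re-read workflow state, per round trip
REASONING_TOKENS = 200   # output tokens spent "deciding", per round trip
INPUT_PRICE = 1.0        # relative price of one input token
OUTPUT_PRICE = 4.0       # relative price of one output token (assumed 4x input)


def round_trips(total_steps: int, computable_steps: int, chained: bool) -> int:
    """Model round trips needed to move through the whole pipeline."""
    decisions = total_steps - computable_steps
    return decisions if chained else total_steps


def relative_cost(trips: int) -> float:
    """Cost of `trips` round trips, in units of one input token's price."""
    per_trip = STATE_TOKENS * INPUT_PRICE + REASONING_TOKENS * OUTPUT_PRICE
    return trips * per_trip


unchained = round_trips(10, 8, chained=False)
chained = round_trips(10, 8, chained=True)
wasted = relative_cost(unchained) - relative_cost(chained)
print(unchained, chained, wasted)  # 10 2 18400.0
```

The sketch holds the per-trip cost flat. In practice the conversation grows with every round trip, so the real gap is likely wider than this.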
Tag it and forget it
The fix is one field on a transition: actor: deterministic.
    lint:
      transitions:
        run_lint:
          target: test
          actor: deterministic
          executor: { kind: cli, command: lint-check }
    test:
      transitions:
        run_tests:
          target: build
          actor: deterministic
          executor: { kind: cli, command: test-runner }
    build:
      transitions:
        build_artifact:
          target: ready_to_deploy
          actor: deterministic
    ready_to_deploy:
      goal: Confirm deployment
      transitions:
        deploy:
          target: deployed
          actor: agent
When a state has only deterministic transitions, the runtime chains through them on its own. The model calls workflow.start. The runtime sees that the first state's only move is deterministic, so it executes lint. Succeeds. Moves to test. Executes. Succeeds. Moves to build. Executes. Succeeds. It arrives at ready_to_deploy, which has an agent transition, and the chain stops.
Three executor calls. Zero LLM round trips. One response back to the model, at the first point where there's an actual decision to make.
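The walk above can be sketched as a small loop. This is a toy model, not mcp-flowgate's implementation: the WORKFLOW dict, the chain function, and the stand-in executor are all hypothetical, and the sketch simply takes the first transition when a state has several deterministic ones.

```python
from typing import Callable

# Toy copy of the four states above, plus the terminal state they lead to.
WORKFLOW = {
    "lint": {"run_lint": {"target": "test", "actor": "deterministic"}},
    "test": {"run_tests": {"target": "build", "actor": "deterministic"}},
    "build": {"build_artifact": {"target": "ready_to_deploy", "actor": "deterministic"}},
    "ready_to_deploy": {"deploy": {"target": "deployed", "actor": "agent"}},
    "deployed": {},
}


def chain(state: str, execute: Callable[[str], str]) -> tuple[str, list[dict]]:
    """Advance while the current state's only transitions are deterministic."""
    trace = []
    while True:
        transitions = WORKFLOW[state]
        all_deterministic = bool(transitions) and all(
            t["actor"] == "deterministic" for t in transitions.values()
        )
        if not all_deterministic:
            return state, trace  # decision point or terminal state
        name, transition = next(iter(transitions.items()))
        trace.append({"transition": name, "result": execute(name)})
        state = transition["target"]


# A stand-in executor that always succeeds.
state, trace = chain("lint", execute=lambda name: "ok")
print(state)                             # ready_to_deploy
print([s["transition"] for s in trace])  # ['run_lint', 'run_tests', 'build_artifact']
```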
What the model gets back
When the chain hands control back, the response carries everything the model needs to make the deploy decision well:
- A chain array tracing each auto-executed step and its result
- A guidance object: the goal and instructions for the current state
- The links for the legal transitions (here, just deploy)
The model reads the lint output, the test count, the build artifact, and lands at the decision with full context. And it spent nothing getting there. The intermediate steps never even appeared in its list of options; deterministic transitions are hidden from the links array.
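To make the shape of that response concrete, here is a rough sketch. The chain, guidance, and links names come from the text; everything inside them is an assumed layout, build_response is a hypothetical helper, and the notify_team transition is invented purely to show a deterministic transition being filtered out.

```python
def build_response(state_name: str, state: dict, chain: list[dict]) -> dict:
    """Assemble a response; only non-deterministic transitions become links."""
    links = [
        name
        for name, transition in state.get("transitions", {}).items()
        if transition["actor"] != "deterministic"
    ]
    return {
        "state": state_name,
        "chain": chain,                            # what ran on the way here
        "guidance": {"goal": state.get("goal")},   # what to decide now
        "links": links,                            # the legal moves for the model
    }


ready_to_deploy = {
    "goal": "Confirm deployment",
    "transitions": {
        "deploy": {"target": "deployed", "actor": "agent"},
        # Hypothetical extra transition, not in the original config.
        "notify_team": {"target": "ready_to_deploy", "actor": "deterministic"},
    },
}
chain_trace = [
    {"transition": "run_lint", "result": "ok"},
    {"transition": "run_tests", "result": "ok"},
    {"transition": "build_artifact", "result": "ok"},
]

response = build_response("ready_to_deploy", ready_to_deploy, chain_trace)
print(response["links"])  # ['deploy']
```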
Work that happens the moment a state is entered
There's a companion to deterministic transitions worth knowing: the onEnter action. A state can declare work that runs automatically as soon as the workflow arrives there, and stash the result into context for later guards to read.
The pattern is "as soon as you reach this state, run the analysis." A risk-review state, on entry, runs the risk analysis and writes its score into context; the transitions out of that state then have a real number to guard on — remediate if the score is too high, request approval if it's acceptable. The model didn't have to call the analysis and didn't have to decide to. Arriving at the state was enough to make the facts exist.
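A toy model of that risk-review pattern, under stated assumptions: the on_enter key, the guard lambdas, the scoring formula, and the 0.5 threshold are all invented for illustration and say nothing about how onEnter is actually declared in a workflow file.

```python
def analyze_risk(context: dict) -> float:
    """Stand-in for the risk analysis: a made-up score from the file count."""
    return 0.2 * len(context["changed_files"])


RISK_REVIEW = {
    # Work that runs on arrival; its result is merged into context.
    "on_enter": lambda ctx: {"risk_score": analyze_risk(ctx)},
    "transitions": {
        "remediate": {"guard": lambda ctx: ctx["risk_score"] > 0.5},
        "request_approval": {"guard": lambda ctx: ctx["risk_score"] <= 0.5},
    },
}


def enter(state: dict, context: dict) -> list[str]:
    """Run the entry action, then return the transitions whose guards pass."""
    context.update(state["on_enter"](context))
    return [
        name
        for name, transition in state["transitions"].items()
        if transition["guard"](context)
    ]


small_change = {"changed_files": ["api.py", "schema.sql"]}
print(enter(RISK_REVIEW, small_change))  # ['request_approval']
```

Nothing in the caller asks for the analysis. By the time the guards are evaluated, the score is already in context.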
When a step breaks mid-chain
If the test step fails, the chain stops at the failure — and the model doesn't have to start over. The response includes:
- The partial chain trace — lint succeeded, tests failed
- The error from the failed executor
- A recovery link to retry just the step that broke
The model sees what worked, what didn't, and has a link to try again from the failure point. A broken step in the middle of an automated chain is still a response with a way forward — the same recovery property every other call in mcp-flowgate has.
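A minimal sketch of that failure response. The step names match the pipeline above; the response layout, the retry link format, and the run_chain and failing_executor helpers are assumptions made for illustration.

```python
STEPS = ["run_lint", "run_tests", "build_artifact"]


def run_chain(steps: list[str], execute) -> dict:
    """Run steps in order; on the first failure, return the partial trace,
    the error, and a link to retry only the step that broke."""
    trace = []
    for step in steps:
        try:
            trace.append({"step": step, "status": "ok", "output": execute(step)})
        except RuntimeError as err:
            trace.append({"step": step, "status": "failed"})
            return {
                "chain": trace,
                "error": str(err),
                "links": [{"rel": "retry", "transition": step}],
            }
    return {"chain": trace, "error": None, "links": []}


def failing_executor(step: str) -> str:
    """Stand-in executor whose test step always fails."""
    if step == "run_tests":
        raise RuntimeError("2 tests failed")
    return "ok"


response = run_chain(STEPS, failing_executor)
print([(s["step"], s["status"]) for s in response["chain"]])
# [('run_lint', 'ok'), ('run_tests', 'failed')]
print(response["links"])  # [{'rel': 'retry', 'transition': 'run_tests'}]
```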
Where a chain stops
A chain isn't a loop that runs until it's tired. It stops at exactly four things, and it's worth knowing all four:
- A decision point — any state with a non-deterministic transition
- A terminal state
- The depth limit: maxChainDepth, default 50
- An executor failure
The depth limit is the safety net. If a config accidentally wires a cycle of deterministic transitions — it shouldn't, but mistakes happen — the chain stops at the limit instead of running forever. You can set it per workflow.
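The safety net can be sketched as a hop counter. This is a simplified model: only the default of 50 comes from the text, while the CYCLE config, the chain_with_limit function, and the reason strings are hypothetical.

```python
MAX_CHAIN_DEPTH = 50  # the default named above

# A misconfigured pair of states that point at each other deterministically.
CYCLE = {"a": "b", "b": "a"}
# A healthy config: three hops, then a state with no deterministic way out.
LINEAR = {"lint": "test", "test": "build", "build": "ready_to_deploy"}


def chain_with_limit(state: str, next_state: dict, max_depth: int = MAX_CHAIN_DEPTH):
    """Follow deterministic hops until none remain or the depth limit is hit."""
    depth = 0
    while state in next_state:
        if depth >= max_depth:
            return state, depth, "depth_limit"
        state = next_state[state]
        depth += 1
    return state, depth, "no_deterministic_transition"


print(chain_with_limit("lint", LINEAR))  # ('ready_to_deploy', 3, 'no_deterministic_transition')
print(chain_with_limit("a", CYCLE))      # ('a', 50, 'depth_limit')
```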
Chains that branch on real outcomes
Here's where it gets more capable than "run these steps in order." Sometimes the next step depends on what the last one returned — and that's still not a judgment call. "Run the tests; if they pass, go green; if they fail, go red" has a computable answer. The destination just depends on the result.
A transition can declare branches. After the executor runs and its output is mapped into context, the runtime picks the destination. The first branch whose when guard passes wins:
    run_tests:
      target: red  # default fallback
      executor:
        kind: cli
        connection: shell
        args: ["-c", "cargo test"]
        treatNonZeroAsFailure: false  # exit code is data, not failure
      output:
        passed: "$.output.success"
      branches:
        - when: { kind: expr, expr: "$.context.passed == true" }
          target: green
        - when: { kind: expr, expr: "$.context.passed == false" }
          target: red
Two details make this work. treatNonZeroAsFailure: false tells the CLI executor to capture a non-zero exit code as data (output.success: false) instead of erroring the transition. And the branches pick the path from that data. A test run that fails is no longer an exception; it's an outcome the chain routes on.
That's the mechanism behind the tdd example, which enforces a real red → green → refactor cycle: the chain runs the tests and routes itself, no model involved, right up until there's something genuine to decide.
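Both details, the exit code captured as data and the first-match branch rule, can be sketched in a few lines. This is an illustration only: run_cli and pick_target are hypothetical helpers, the guards are plain Python lambdas rather than the expr syntax shown in the config, and a child Python process that exits with status 1 stands in for a failing cargo test.

```python
import subprocess
import sys


def run_cli(args: list[str]) -> dict:
    """Run a command and report a non-zero exit as data instead of raising."""
    proc = subprocess.run(args, capture_output=True, text=True)  # no check=True
    return {"success": proc.returncode == 0, "exit_code": proc.returncode}


def pick_target(transition: dict, context: dict) -> str:
    """Return the target of the first branch whose guard passes, else the default."""
    for branch in transition.get("branches", []):
        if branch["when"](context):
            return branch["target"]
    return transition["target"]


RUN_TESTS = {
    "target": "red",  # default fallback
    "branches": [
        {"when": lambda ctx: ctx["passed"] is True, "target": "green"},
        {"when": lambda ctx: ctx["passed"] is False, "target": "red"},
    ],
}

# Stand-in for a failing test run: a child process that exits with status 1.
output = run_cli([sys.executable, "-c", "import sys; sys.exit(1)"])
context = {"passed": output["success"]}  # mirrors passed: "$.output.success"
print(pick_target(RUN_TESTS, context))   # red
```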
Chains of chains: composing whole workflows
A deterministic step doesn't have to be a single command. With the workflow executor kind, a transition can run an entire sub-workflow as one step:
    acquire_lock:
      transitions:
        start:
          target: deploying
          executor:
            kind: workflow  # run an entire sub-workflow as one step
            definitionId: with_artifact_lock
            input: { artifact: "$.context.artifact_name" }
The runtime starts the named sub-workflow, polls it to a terminal state, and returns its result, all as a single step in the parent. That makes a real pattern composable: acquire a lock (a sub-workflow), deploy, release the lock (another sub-workflow), each stage its own governed flow with its own guards and audit trail. A sub-workflow can itself use a workflow executor, so to keep that from recursing forever there's a hard depth cap of 10.
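A toy model of nested workflows with a depth cap. Only with_artifact_lock and the number 10 come from the text. The other definition names, the DepthExceeded error, and the exact way depth is counted (whether the cap trips at the tenth or eleventh level) are assumptions.

```python
MAX_WORKFLOW_DEPTH = 10  # the hard cap stated above


class DepthExceeded(RuntimeError):
    """Raised when sub-workflows nest deeper than the cap allows."""


def run_workflow(definition_id: str, definitions: dict, depth: int = 0) -> list[str]:
    """Run a workflow's steps in order. A step that names another definition
    runs that whole workflow as a single step of the parent."""
    if depth >= MAX_WORKFLOW_DEPTH:
        raise DepthExceeded(f"{definition_id} nested deeper than {MAX_WORKFLOW_DEPTH}")
    completed = []
    for step in definitions[definition_id]:
        if step in definitions:  # the step is itself a workflow
            completed.extend(run_workflow(step, definitions, depth + 1))
        else:
            completed.append(step)
    return completed


DEFINITIONS = {
    "deploy_with_lock": ["with_artifact_lock", "deploy", "release_artifact_lock"],
    "with_artifact_lock": ["acquire"],
    "release_artifact_lock": ["release"],
    "loops_forever": ["loops_forever"],  # a workflow that calls itself
}

print(run_workflow("deploy_with_lock", DEFINITIONS))  # ['acquire', 'deploy', 'release']
```

Calling run_workflow("loops_forever", DEFINITIONS) raises DepthExceeded after ten nested calls instead of recursing without end.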
Deterministic chaining, then, isn't only "run these commands in a line." It's "run everything that's computable" — and "computable" can be a whole workflow.
Automated, not invisible
Auto-executed doesn't mean unrecorded. A chain leaves a complete trail: each deterministic step emits a chain.step event, a finished chain emits chain.completed, a failed one emits chain.failed, and every branch decision emits a transition.branched event naming the matched branch and the chosen target.
There's a second kind of trace, too. A successful executor can record evidence: the CLI executor logs a cli_output record on every successful command. So the steps a chain ran without waking the model still leave behind facts that a later guard, in the governed part of the workflow, can require.
The auto-run lint and test steps don't just pass; they leave proof that they passed. You get the speed of automation with no audit gap and no honesty gap.
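The two kinds of trace can be modeled with a pair of lists. The event names and the cli_output kind come from the text; the record fields and the helpers (run_step, run_chain_logged, has_evidence) are hypothetical.

```python
events: list[dict] = []
evidence: list[dict] = []


def run_step(step: str, succeeded: bool) -> bool:
    """Emit a chain.step event; a successful step also leaves cli_output evidence."""
    events.append({"type": "chain.step", "step": step, "ok": succeeded})
    if succeeded:
        evidence.append({"kind": "cli_output", "step": step})
    return succeeded


def run_chain_logged(steps: list[tuple[str, bool]]) -> None:
    """Run steps in order, closing the trail with chain.completed or chain.failed."""
    for step, succeeded in steps:
        if not run_step(step, succeeded):
            events.append({"type": "chain.failed", "step": step})
            return
    events.append({"type": "chain.completed"})


def has_evidence(step: str) -> bool:
    """A guard a later, governed transition could require before it opens."""
    return any(e["kind"] == "cli_output" and e["step"] == step for e in evidence)


run_chain_logged([("run_lint", True), ("run_tests", True)])
print([e["type"] for e in events])  # ['chain.step', 'chain.step', 'chain.completed']
print(has_evidence("run_tests"))    # True
```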
When to reach for it — and when not to
Any step whose outcome is computable rather than a judgment call is a candidate: linting, testing, building, data validation, format conversion, sending a notification. If a human wouldn't need to stop and think about it, the model doesn't either.
The honest flip side: chaining is for computable steps only. If a step needs judgment, even a little, it belongs to an agent transition, and it should stay there. Tagging something deterministic to shave a round trip off a step that actually needs a decision doesn't save you anything. It just gets you a confident wrong answer faster.
The model's attention is the expensive part of your system. Deterministic chaining is the discipline of not spending it on steps that were never decisions — and saving it for the ones that are.
The chaining guide has the full YAML reference, and the deploy-pipeline example is a complete, runnable workflow.