Where MCP security actually lives — and where it doesn't yet

MCP hands a model a list of tools and trusts it to behave. Most of the security conversation around MCP is about the protocol — token scopes, server authentication, who can connect. All worth having. But there's a layer the protocol has no opinion about, and it's where the controls you actually need either live or simply don't exist.

This post is about that layer. What a gateway can enforce — and, just as important, an honest account of what it can't enforce for you yet.

The controls that don't fit in a tool definition

A tool definition can describe a tool: here's the name, here are the arguments, here's the schema. It cannot say any of the things you actually want said about a dangerous action:

  • only after the tests have passed
  • never without a human in the loop
  • at most three times, then stop
  • and write down every attempt, including the refused ones

Those aren't properties of a tool. They're policy. And policy needs somewhere to live that every call has to pass through. That's the gateway: a single checkpoint between the model and the tools, where the rules are enforced instead of hoped for.

What the gateway enforces before the executor runs

Three checks happen between a call arriving and a tool actually running. Each one can stop the call cold.

Schema validation. A transition's inputSchema is JSON Schema, and it runs before the executor. Malformed input, a missing required field, a value outside an allowed enum — rejected with INPUT_SCHEMA_VIOLATION, and the tool never sees it. That's a real boundary: bad input stops at the gate, not inside your code.
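To make that boundary concrete, here is a minimal Python sketch of a check that runs before the executor. It is not mcp-flowgate's implementation and it is not a full JSON Schema validator: it handles only `required` and `enum`, and the schema, field names, and the `submit` function are all hypothetical. Only the error code comes from the text above.

```python
# Hypothetical sketch: covers only `required` and `enum`, not full JSON Schema.
INPUT_SCHEMA = {
    "type": "object",
    "required": ["environment", "version"],
    "properties": {
        "environment": {"enum": ["staging", "production"]},
        "version": {"type": "string"},
    },
}


def schema_violations(schema: dict, payload: dict) -> list:
    """Return human-readable problems; an empty list means the input passed."""
    problems = []
    for field in schema.get("required", []):
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        allowed = rules.get("enum")
        if field in payload and allowed is not None and payload[field] not in allowed:
            problems.append(f"{field} is not one of {allowed}")
    return problems


def submit(payload: dict, executor) -> dict:
    """Validate first; the executor is only called when validation passes."""
    problems = schema_violations(INPUT_SCHEMA, payload)
    if problems:
        return {"ok": False, "code": "INPUT_SCHEMA_VIOLATION", "details": problems}
    return {"ok": True, "result": executor(payload)}
```

The point of the shape is the early return: a rejected payload never reaches `executor`, so the tool's own code does not have to defend against it.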

Guards. Preconditions evaluated before the executor — permission, role, expr, evidence. If a guard fails the call is rejected with GUARD_REJECTED and the executor never runs.

Actor enforcement. A transition tagged actor: human requires a human principal. An agent or anonymous caller is rejected with ACTOR_MISMATCH before anything executes. The model cannot walk itself through a human-only step: there is no "approve my own request" path.

Every one of these is deny-at-a-checkpoint, not hope-the-model-behaves. The call has to get past the gate, and the gate runs whether or not the model expected it to.
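The guard and actor checks can be sketched the same way. This is an illustration under assumptions, not the project's code: the `Principal` and `Transition` classes and the `run_transition` function are hypothetical, guards are modeled as plain callables, and the order in which guards and the actor check run relative to each other is a guess. The error codes are the ones named above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Principal:
    kind: str  # "human", "agent", or "anonymous" (hypothetical labels)
    name: str = ""


@dataclass
class Transition:
    name: str
    executor: Callable[[dict], object]
    # Each guard is a callable (principal, payload) -> bool.
    guards: List[Callable[[Principal, dict], bool]] = field(default_factory=list)
    actor: Optional[str] = None  # "human" marks a human-only step


def run_transition(t: Transition, principal: Principal, payload: dict) -> dict:
    """Run every check first; call the executor only if all of them pass."""
    for guard in t.guards:
        if not guard(principal, payload):
            return {"ok": False, "code": "GUARD_REJECTED"}
    if t.actor == "human" and principal.kind != "human":
        return {"ok": False, "code": "ACTOR_MISMATCH"}
    return {"ok": True, "result": t.executor(payload)}
```

Whatever the caller intended, the only way to reach `t.executor` is to fall through every check above it.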

Gates anchored to what actually happened

The evidence guard deserves a closer look, because it's a different kind of control. Most checks test a value. An evidence guard tests that something occurred.

When an executor succeeds, it can record evidence — a named kind. A guard on a later transition can require that evidence to exist before the transition is legal. So "this deploy can't run until a security scan recorded a clean result" isn't a flag the model could set on its way past. It's a requirement that a scan step actually ran and actually produced that record on this workflow. And in its counting form, an evidence guard can demand two independent approval records — a real two-person rule. Controls anchored to events, not to mutable state, are the ones a clever caller can't talk its way around.
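A small Python sketch of the idea, with hypothetical names throughout (`Evidence`, `has_evidence`, and the evidence kinds are invented for illustration). It also assumes that "independent" means recorded by distinct principals, which the text above does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    kind: str         # e.g. "scan.clean" (hypothetical kind name)
    recorded_by: str  # who or what produced the record


def has_evidence(records, kind: str, min_count: int = 1) -> bool:
    """True when at least `min_count` distinct recorders produced this kind.

    Counting distinct recorders is an assumption about what makes two
    approval records "independent".
    """
    recorders = {r.recorded_by for r in records if r.kind == kind}
    return len(recorders) >= min_count


# Records appended by executors that actually ran on this workflow.
workflow_evidence = [
    Evidence("scan.clean", "security-scanner"),
    Evidence("approval", "dana"),
    Evidence("approval", "dana"),  # the same person twice is still one approver
]

deploy_allowed = has_evidence(workflow_evidence, "scan.clean")
two_person_rule_met = has_evidence(workflow_evidence, "approval", min_count=2)
```

Here the scan requirement is satisfied, but the two-person rule is not: both approval records came from the same recorder. The guard reads a record of what ran, so there is no flag on the call for the model to flip.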

Approvals the model can't resolve itself

Actor enforcement stops the model from taking a human-only step. The natural next question: who takes it, and how — and can the model reach that path?

It can't, by construction. The human executor records a human.approval.requested event to a named queue and stops. Resolution happens out of band, on a channel the model isn't on: the built-in mcp-flowgate approvals command lists and resolves pending approvals from a terminal, and because every approval request is an audit event, you can just as easily tail the audit stream into Slack or a Linear issue so approvals land where people already work.

The security property is the separation. The model emits a request and halts. A human, on a different channel, resolves it. There is no API call in the model's seven tools that approves its own pending action — the request path and the resolution path are not the same path.
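The separation can be sketched as two surfaces over one queue. Everything here is hypothetical (`ApprovalQueue`, `ModelTools`, the id format, and the shape of the resolution result); only the human.approval.requested event name comes from the description above.

```python
import itertools


class ApprovalQueue:
    """Holds pending approvals and emits an audit event for each request."""

    def __init__(self):
        self._pending = {}
        self._ids = itertools.count(1)
        self.audit = []

    def request(self, transition: str, requested_by: str) -> str:
        approval_id = f"apr_{next(self._ids)}"
        self._pending[approval_id] = transition
        self.audit.append({
            "eventType": "human.approval.requested",
            "approvalId": approval_id,
            "transition": transition,
            "actor": requested_by,
        })
        return approval_id

    def resolve(self, approval_id: str, approver: str, approved: bool) -> dict:
        # Called only from the operator channel, never from ModelTools.
        transition = self._pending.pop(approval_id)
        return {"approvalId": approval_id, "transition": transition,
                "approver": approver, "approved": approved}


class ModelTools:
    """The surface handed to the model: it can request, it cannot resolve."""

    def __init__(self, queue: ApprovalQueue):
        self._request = queue.request

    def request_approval(self, transition: str) -> str:
        return self._request(transition, requested_by="agent")
```

Inside a single Python process this split is only a convention. The real property comes from resolution living on a different channel, such as a terminal command or a human acting on a forwarded audit event, that the model has no connection to.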

Retries that can't double-charge

Here's a safety property that's easy to miss. The gateway retries failed calls and falls back to alternates — necessary, because networks fail. But a retried side-effecting call is dangerous on its own: a payment that timed out but actually went through, retried, charges the customer twice.

Declaring an idempotency key closes that gap. The runtime computes one key per submit and reuses it across every retry and every fallback executor — surfaced to a REST tool as an Idempotency-Key header, to a CLI tool as an environment variable. The expense-approval example uses exactly this for its payment step. "Retry safely" becomes a line of config, and a double-execution becomes structurally hard instead of something you hope the downstream API deduplicates for you.
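Here is a minimal sketch of why one key per submit matters. The function and class names are hypothetical, and generating the key with `uuid4` is an assumption: the text above says only that the runtime computes one key per submit, not how. The fake payment API simulates the dangerous case, a charge that succeeds and then times out.

```python
import uuid


def submit_with_retries(executors, payload: dict, max_attempts: int = 2):
    """Try each executor in order, reusing one idempotency key throughout."""
    key = str(uuid.uuid4())  # assumption: one random key per submit
    failures = 0
    for executor in executors:  # primary first, then fallbacks
        for _ in range(max_attempts):
            try:
                return executor(payload, idempotency_key=key)
            except ConnectionError:
                failures += 1
    raise RuntimeError(f"all executors failed after {failures} attempts")


class FakePaymentApi:
    """Stand-in for a downstream API that deduplicates on the key."""

    def __init__(self):
        self.charges = {}  # idempotency key -> amount charged
        self.calls = 0

    def charge(self, payload: dict, idempotency_key: str) -> dict:
        self.calls += 1
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = payload["amount"]
        if self.calls == 1:
            # The charge went through, but the caller never hears about it.
            raise ConnectionError("timed out after charging")
        return {"charged": self.charges[idempotency_key]}


api = FakePaymentApi()
result = submit_with_retries([api.charge], {"amount": 42})
```

The API is called twice but records a single charge, because the retry carries the same key as the first attempt. A fresh key per attempt would have produced two charges.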

What the gateway records

The most underrated security feature is the one that doesn't block anything: the audit trail.

Every meaningful step emits a structured audit event — around eighteen event types, covering workflow starts, transitions, guard evaluations, executor attempts, approval requests, and rejections. Crucially, that includes the calls that failed. A transition.rejected event is written even when the model hit a wall and quietly recovered:

audit event
{
  "id":            "evt_e0b9...",
  "timestamp":     "2026-05-10T18:42:01Z",
  "workflowId":    "wf_3f8b...",
  "correlationId": "cor_9c12...",
  "actor":         "tester",
  "eventType":     "transition.rejected",
  "payload": { "transition": "approve", "code": "ACTOR_MISMATCH" }
}

Each event carries a correlationId that ties together everything from a single call, so you can reconstruct exactly what happened and in what order. This is the difference between "we think the agent didn't do that" and "here is every call it made, every gate it hit, and every refusal." Incident response needs the second answer. An audit trail you didn't have to build yourself is worth more than it looks.
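Reconstructing a call from the trail is mostly a grouping problem. The sketch below assumes the sink writes one JSON object per line, which this post does not confirm, and that timestamps share the UTC format shown in the sample event so they sort correctly as strings. The function name is hypothetical; the field names and event types are the ones that appear in this post.

```python
import json
from collections import defaultdict


def group_by_correlation(lines) -> dict:
    """Group audit events by correlationId, each group ordered by timestamp."""
    timeline = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        timeline[event["correlationId"]].append(event)
    for events in timeline.values():
        # Same-format ISO 8601 UTC timestamps sort correctly as plain strings.
        events.sort(key=lambda e: e["timestamp"])
    return dict(timeline)


sample_log = [
    '{"timestamp": "2026-05-10T18:42:01Z", "correlationId": "cor_a", "eventType": "transition.rejected"}',
    '{"timestamp": "2026-05-10T18:41:59Z", "correlationId": "cor_a", "eventType": "human.approval.requested"}',
    '{"timestamp": "2026-05-10T18:43:10Z", "correlationId": "cor_b", "eventType": "transition.rejected"}',
]

calls = group_by_correlation(sample_log)
order_for_cor_a = [e["eventType"] for e in calls["cor_a"]]
```

For a real log you would pass an open file handle instead of the sample list, since iterating a file yields lines.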

One honest caveat: the default audit sink is stdout, and a none sink exists that drops everything. For any deployment where the trail matters, use the file sink and set up rotation. Audit is only ground truth if you keep it.

Now the honest part: what isn't enforced out of the box

Here's where most tools would stop, having listed their features. This is the section that matters most.

The bundled FlowgateServer — the binary you run from the quick start — treats every caller as an anonymous principal. Read that carefully, because it has a sharp consequence: the permission and role guards, the identity-based ones, cannot tell one caller from another. An anonymous principal holds no permissions and no roles, so on the bundled binary those guards can't express "Alice may deploy, Bob may not." There is no Alice and no Bob — there's one anonymous caller.

For a single developer on a laptop, that's the correct default, not a flaw — the OS user is the principal, and the guards you actually want there work without identity: inputSchema, evidence, expr, and actor: human all enforce perfectly well with no identity wiring at all.

But it does mean this: if you deploy the bundled server to a setting where different humans share it, and you lean on permission or role guards for "who can do what," those guards are not doing the job you think they are. Making them real requires wiring identity yourself — a custom ServerHandler that sources a verified Principal from the transport: a checked JWT, an mTLS subject, a mutually authenticated session. The project documents four concrete patterns for this. It's a known seam, not a hidden one — but it is a seam, and it's on you to wire it.

Wiring identity once, not everywhere

"Wire identity yourself" sounds like a chore you'd repeat at every layer. It isn't — and the recommended shape is worth knowing, because it's also the cleaner design.

Identity terminates at one layer. In a stacked deployment that's the enterprise gateway: it does the SSO, it verifies or mints the token, it establishes who the caller is. Every layer below it — team, project, local — receives the already-verified principal through a standard header and simply trusts it, because the layer above did the work. The lower layers stay identity-agnostic. You do the hard part once, at the boundary, and propagate the result inward.

And for deployments that cross a trust boundary (different organizations, callers you don't control), the guidance is explicit: put an identity proxy, something like Envoy or OAuth2-proxy, in front.

As for what's ready: same-trust-boundary, single-user use is production-ready today. High availability is supported through the Postgres store behind a load balancer. Throughput under real production load, though, hasn't been measured yet; only the gateway's own per-call overhead has, in microbenchmarks. This is a pre-1.0 project, and that list of "measured" versus "not yet" is part of the honest picture.

The trap: identity the model asserts

When you do wire identity, there's one shortcut to refuse outright: never let the caller tell you who it is. A field in the request where the model says "I'm acting as the admin" is not authentication — it's a suggestion, and a model can be talked into making any suggestion.

Identity has to come from the transport and be verified before it ever reaches the runtime — a signature you checked, a certificate subject, a session you established. The model is the thing being governed. It does not get to name its own principal. The project's embedding guide calls this out specifically, because it's the easy mistake.

So where does MCP security actually live?

It lives in a layer the protocol doesn't define — the gateway. mcp-flowgate gives you that layer with real teeth for the controls that don't need identity: schema validation, evidence and expression guards, actor enforcement, idempotent retries, and a complete audit trail. For the controls that do need identity, it gives you a documented seam — a verified Principal wired through a custom server — and it does not pretend the anonymous-by-default binary is a multi-tenant security boundary.

That last part is the point. A gate that claims to check identity but doesn't is worse than no gate, because you'll trust it. The most useful thing a security layer can do is be exact about its own edges: here is what is enforced today, here is precisely what you wire to enforce the rest, and here is the audit trail that proves what happened either way.

This is a pre-1.0 project, and that honesty is the feature. Read it, wire the seam your deployment needs, and keep the trail.

The security policy and reporting process are in SECURITY.md; identity wiring for multi-tenant deployments is covered in the MCP control architecture guide and the embedding guide.