Agent-native gateway

June 7, 2026

You open Codex or Claude Code and ask it to help with a real task:

“Check the customer ticket, read the related service logs, update the runbook, and open a pull request.”

The agent can write code. It can reason through a messy incident. It can call tools. Then it hits the enterprise wall.

The ticket system needs SSO. The logs sit behind a private network. The internal API has its own token format. The MCP server on someone’s laptop has no audit trail. The SaaS app wants OAuth consent. The security team does not want API keys pasted into prompts, shell profiles, or local config files.

That wall is not an agent problem. It is an access architecture problem.

An agent-native gateway is the missing enterprise layer between coding agents and the systems they need to use. It gives agents one governed front door for internal APIs, MCP servers, and third-party applications. The gateway teaches the agent how to authenticate, gives the agent its own identity, checks what the user is allowed to do, narrows what the agent is allowed to do, and only then proxies the request to the real system.

[Figure: Animated diagram showing a coding agent calling private APIs, MCP tools, and SaaS apps through one governed gateway.]
The gateway turns a pile of separate auth puzzles into one governed path. It centralizes policy at the place every agent call must pass through.

The core rule is simple:

An agent should be able to act for a person, but never with more access than that person has, and never with more access than the agent was granted.

In plain language:

effective access = user access AND agent grant AND route policy

If Mira can read support tickets, and her coding agent has been granted ticket.read, and the route allows read calls, the request is allowed. If any one of those checks fails, the request is denied. That one rule is the difference between “the agent has my token” and “the agent is a controlled actor in our company.”
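As a minimal sketch of that rule, the three checks can be written as a plain conjunction. The `AccessRequest` type, the dictionary shapes, and the function name are illustrative assumptions, not part of any gateway product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str    # the human the agent is acting for
    agent: str   # the agent's own identity
    action: str  # the operation being requested, e.g. "ticket.read"


def effective_access(req, user_rights, agent_grants, route_policy):
    """Allow only when the user, the agent, and the route all permit the action."""
    user_ok = req.action in user_rights.get(req.user, set())
    agent_ok = req.action in agent_grants.get(req.agent, set())
    route_ok = req.action in route_policy
    return user_ok and agent_ok and route_ok
```

With Mira holding `ticket.read`, her agent granted `ticket.read`, and the route allowing it, the function returns `True`. Removing any one of the three makes it return `False`.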

Why Companies Need This

Enterprises already know how to secure people. They have SSO, groups, roles, device posture, offboarding, access reviews, audit logs, and incident runbooks.

Agents arrive sideways. They carry user context, behave like software clients, and sometimes run like automation. A coding agent can run from a developer’s machine, an IDE, a hosted agent runtime, a CI job, or a chat product. It may need to call ten systems during one task. Some calls are harmless reads. Some change production state.

Without a gateway, every team invents its own bridge:

| Bridge | What goes wrong |
| --- | --- |
| Static API keys | They spread through local files, prompts, CI logs, and screenshots. They are hard to scope, rotate, and attribute. |
| User tokens copied into tools | The agent becomes the user. You lose the ability to say what the agent did separately from what the person did. |
| Service accounts | The human context disappears. The backend cannot tell which employee the agent was helping. |
| Local MCP servers | They are useful for experiments, but production access, audit, and revocation become inconsistent. |
| Per-API OAuth projects | Every backend team has to become an identity team. Most will get some part wrong. |

The production requirement is not “let the agent in.” The production requirement is:

  1. The agent can discover how to connect without a ticket.
  2. The human logs in with normal company SSO.
  3. The agent gets its own identity.
  4. Access is scoped to a tool, endpoint, resource, and action.
  5. The backend stays private.
  6. Every call is logged as “agent X acting for user Y.”
  7. Revocation works when the user leaves, the agent is disabled, or a grant is removed.

That is what an agent-native gateway is for.

Vocabulary First

The words are not hard, but they come from different worlds. It helps to put them in one place.

| Term | Plain meaning |
| --- | --- |
| Agent | Software that can plan and call tools. Codex, Claude Code, an IDE agent, or a custom worker can all be agents. |
| MCP | Model Context Protocol. A standard way for agents to discover and call tools exposed by a server. |
| API | A service endpoint. It might be REST, gRPC, GraphQL, or a private internal HTTP service. |
| Tool | One operation an agent can invoke, such as ticket.read, repo.search, or invoice.create. In MCP, tools are first-class. |
| Corporate IdP | The identity provider employees already use for login, such as Microsoft Entra ID, Okta, or Google Workspace. |
| Authorization server | The system that issues tokens. In the stack below, this is usually Keycloak. |
| Token | A short-lived credential the client sends with a request. A token should say who it represents, who issued it, what it is for, and when it expires. |
| Scope | A named permission such as tickets:read. Scopes are coarse gates, not the whole authorization model. |
| Audience | The service a token is meant for. A token for the gateway should not be accepted by a backend API. |
| Client ID Metadata Document | An OAuth client can use an HTTPS URL as its client_id. That URL points to JSON metadata such as client name, redirect URIs, and key information. Current MCP authorization treats this as an important no-prior-relationship path. |
| DCR | Dynamic Client Registration. A way for an agent to register itself as an OAuth client without a manual admin ticket. In current MCP authorization, it is useful but should be treated as a fallback or compatibility path, not the only path. |
| PRM | Protected Resource Metadata. A .well-known JSON document that tells clients which authorization server protects a resource. |
| Resource parameter | An OAuth request parameter that names the exact protected resource the client wants a token for. For MCP, this binds the token request to the MCP server instead of producing a vague token. |
| auth.md | A human-readable and agent-readable Markdown file that explains how agents register and authenticate for a service. |
| Gateway | The front door that validates tokens, checks policy, rate limits, logs, and proxies to private systems. |
| FGA | Fine-grained authorization. A policy engine answers “can this subject perform this action on this object?” |
| Token exchange | Swapping one token for a narrower token. The gateway uses this so broad agent tokens do not reach backends. |
| Audit event | A durable record of who called what, through which agent, for which user, with what decision. |

The important distinction is between a human, an agent, and a backend.

The human is the employee. The agent is the actor helping the employee. The backend is the system being used. A production gateway keeps those identities separate instead of flattening them into one bearer token.

What Makes The Gateway Agent-Native

A normal API gateway expects clients to be configured by humans. Someone creates an OAuth app, copies a client ID, adds redirect URLs, picks scopes, writes docs, and opens a ticket when something changes.

That process breaks down for agents. Agents need to learn at runtime. They may start with only a URL. They need to ask, “how do I authenticate for this resource?” and get a machine-readable answer.

The agent-native path looks like this:

  1. The agent calls a protected URL without a token.
  2. The gateway returns 401 Unauthorized with a WWW-Authenticate header that points to protected resource metadata.
  3. The agent reads /.well-known/oauth-protected-resource.
  4. The agent follows that metadata to the authorization server metadata.
  5. The client uses pre-registered client information, a Client ID Metadata Document, or Dynamic Client Registration if the server supports it.
  6. For MCP, the client includes the resource parameter so the token is bound to the intended MCP server.
  7. The user signs in once with company SSO.
  8. The agent receives a short-lived token that names both the user and the agent.
  9. The agent calls the gateway with that token.

[Figure: Animated sequence showing 401 challenge, metadata discovery, client identity, SSO login, and a resource-bound token.]
Discovery makes the URL self-describing. A capable client can move from “I have no token” to “I know the login flow” without a human copying setup instructions into the agent.
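Step 2 of the path above hinges on the client reading the 401 challenge. The sketch below pulls the `resource_metadata` parameter (defined by RFC 9728) out of a `WWW-Authenticate: Bearer` header. The parser is deliberately small and assumes a single Bearer challenge with simple quoted or token values; the function name and the example URL are illustrative.

```python
import re

# Matches key="quoted value" or key=token pairs inside one challenge.
_PARAM = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,]+))')


def parse_bearer_challenge(header: str) -> dict[str, str]:
    """Return the auth-params of a single Bearer challenge as a dict."""
    scheme, _, rest = header.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    params = {}
    for match in _PARAM.finditer(rest):
        key, quoted, bare = match.groups()
        params[key] = quoted if quoted is not None else bare
    return params
```

A client would then fetch the URL found under `resource_metadata` and continue discovery from there. A production client should use a full HTTP header parser, since a response may carry several challenges.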

The standards underneath this are ordinary OAuth and OIDC pieces composed for agent use: Protected Resource Metadata (RFC 9728), Authorization Server Metadata (RFC 8414), OAuth Client ID Metadata Documents, Dynamic Client Registration (RFC 7591), OAuth with PKCE, Resource Indicators (RFC 8707), and Token Exchange (RFC 8693). The current MCP authorization specification expects protected resource discovery, client registration options, and explicit resource binding for protected MCP servers.

auth.md is the prose companion. The protected resource metadata is the machine source of truth. auth.md is useful because humans and agents can both read it. It can explain supported flows, scopes, client metadata options, registration endpoints, revocation behavior, and contact points. The auth.md project is a good reference for the shape of that file.

The Access Rule

Least privilege for agents needs two independent identity checks, plus a policy check on the route or tool itself.

The user check asks: “Is this employee allowed to do this?”

The agent check asks: “Has this agent been granted this operation?”

The route or tool policy asks: “What does this endpoint require?”

[Figure: Diagram showing user rights, agent grant, route policy, and an allow decision when all three checks pass.]
A user can have broad access while an agent has narrow access. The gateway should require both to pass. A missing user entitlement, missing agent grant, or missing route permission should each produce a denial.

This matters because employees often have more access than an agent should use. A support engineer may be allowed to edit tickets, manage escalations, export reports, and view customer history. A coding agent helping with a bug might only need to read the ticket and search logs.

The gateway should not ask, “can Mira do everything this token allows?” It should ask a more exact question:

Can agent:docs-assistant, acting for user:mira, perform ticket.read on ticket:T-391?

That shape lets the company revoke the agent without disabling Mira. It also lets the company keep Mira’s access while denying the agent one sensitive tool.

The Production Stack

The reference architecture I would use today is open source at the control plane and portable across clouds. The cloud provider still supplies substrate: compute, networking, load balancers, managed databases if you choose them, and private connectivity. The agent access system itself should not depend on one vendor’s private agent product.

[Figure: Production stack diagram showing identity, policy, secrets, audit, gateway, private APIs, MCP servers, and SaaS apps.]
The gateway is only the hot path. A production system also needs identity, fine-grained policy, secrets, private connectivity, observability, and a lifecycle for grants and revocation.

Here is the stack I would start with, with the usual caveat: choose the gateway after a proof of concept against your real MCP, REST, and SaaS traffic. The right default is the one that can enforce your policies on the operations your agents actually call.

| Plane | Starting open-source choice | Why |
| --- | --- | --- |
| Agent-aware gateway | agentgateway | It is built for service, LLM, MCP, and agent-to-agent traffic in one data plane. It is the first tool I would evaluate when those protocols must share one governed edge. |
| Authorization server | Keycloak | Mature OIDC/OAuth server with identity brokering, client registration, admin UI, sessions, token issuance, and standard token exchange support. |
| Fine-grained authorization | OpenFGA | Friendly relationship-based authorization model for subject, relation, object checks. |
| High-scale authorization alternative | SpiceDB | Zanzibar-inspired permission database with strong production maturity. Use it when scale, consistency, and ecosystem fit point that way. |
| Request policy | CEL, OPA, or gateway-native policy | Good for request attributes: method, path, tool name, tenant, IP range, risk score, and approval state. |
| Secrets | OpenBao | Stores signing keys, client secrets, connector credentials, and SaaS refresh tokens. |
| Observability | OpenTelemetry, Prometheus, Grafana, Loki or Tempo | Traces, metrics, logs, and audit trails should use open formats. |
| Infrastructure as code | OpenTofu | Keeps routes, policies, clusters, databases, and network settings reviewable. |
| Runtime | Kubernetes, ECS, or another container platform | The gateway must run close to private services and support horizontal scale. |

Other stacks can work. If all you need is REST, Envoy Gateway plus JWT validation and external authorization can be enough. If you want an MCP registry, federation, and admin UI, ContextForge is worth evaluating. If your company already has a strong OPA platform, use it for the checks it is good at. Evaluate maturity the boring way: protocol support, policy expressiveness, failure modes, audit shape, performance, upgrade path, and whether your team can operate it. The architecture matters more than the brand name: keep responsibilities separated.

The gateway validates and routes. The authorization server issues and revokes tokens. The FGA engine answers exact permission questions. The secret store keeps tokens and keys out of code. The network keeps backends private. The audit system makes every decision explainable later.

Identity Plane: Who Is The User, And Who Is The Agent?

The identity plane has two jobs.

First, it brokers the human login to the corporate IdP. Employees should keep using the login system they already know. The gateway stack should not become a second employee directory. Keycloak can act as the broker: it redirects the user to the corporate IdP, receives the assertion, maps groups or claims to roles, and issues its own tokens for the gateway.

Second, it gives each agent its own identity. An agent should not borrow the user’s password. It should not share a single “ai-agent” service account with every other agent. It should be an OAuth client with its own client_id, its own allowed grant types, its own scopes, and its own lifecycle.

A good token for an agent call should answer:

| Claim | Meaning |
| --- | --- |
| sub | The human subject the agent is acting for. |
| client_id or azp | The OAuth client. This identifies the agent or gateway client. |
| act | Actor information when the token format supports it. This is useful for “agent X acting for user Y.” |
| iss | The issuer. The gateway should only trust the company authorization server. |
| aud | The audience. The token should be meant for the gateway or one specific backend, not everything. |
| scope | Coarse permissions requested and granted. |
| exp | Expiration. Short-lived tokens limit damage. |

The identity plane should also support different operating modes:

| Mode | Use it when | Shape |
| --- | --- | --- |
| Interactive user delegation | A person is using Codex, Claude Code, or an IDE agent. | The user signs in once with SSO and consent. The agent refreshes until the session or grant is revoked. |
| Machine-to-machine | A scheduled agent or CI worker acts as itself. | The agent uses client_credentials, with no human subject. The grant should be narrow and heavily audited. |
| Attested agent identity | A trusted agent provider can assert the user’s identity or device context. | Useful later for headless fleets, but only after the company chooses who it trusts. |
| User-claimed binding | No trusted attestation exists. | The user confirms a code to bind the agent to their account. Good fallback, weaker UX. |

For most enterprises, the default should be interactive delegation for human coding agents and machine-to-machine only for narrow automation.

Discovery Plane: How The Agent Learns

Discovery is what lets an agent start from a URL.

The protected resource metadata might say:

{
  "resource": "https://tools.company.example",
  "authorization_servers": [
    "https://auth.company.example/realms/agents"
  ],
  "scopes_supported": [
    "tickets:read",
    "logs:search",
    "runbooks:write"
  ],
  "bearer_methods_supported": ["header"]
}

The authorization server metadata then tells the agent where to authorize, where to fetch tokens, which PKCE methods are supported, which client identity mechanisms are available, and where the JWKS lives. JWKS is the public key set the gateway uses to verify signed tokens.
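To make the hop from resource metadata to authorization server metadata concrete, here is a hedged sketch. It picks the issuer from a parsed metadata document and lists the well-known URLs a client might try for an issuer that has a path component. The candidate order reflects my reading of RFC 8414 and OpenID Connect Discovery; the function names are illustrative, and your authorization server may answer on only one of these URLs.

```python
from urllib.parse import urlsplit


def pick_issuer(prm: dict, expected_resource: str) -> str:
    """Return the first advertised authorization server for the expected resource."""
    if prm.get("resource") != expected_resource:
        raise ValueError("metadata describes a different resource")
    servers = prm.get("authorization_servers") or []
    if not servers:
        raise ValueError("no authorization server advertised")
    return servers[0]


def authorization_server_metadata_urls(issuer: str) -> list[str]:
    """Candidate metadata URLs for an issuer, most specific convention first."""
    parts = urlsplit(issuer)
    origin = f"{parts.scheme}://{parts.netloc}"
    path = parts.path.rstrip("/")
    return [
        # RFC 8414: the well-known segment goes between host and path.
        f"{origin}/.well-known/oauth-authorization-server{path}",
        # OpenID Connect Discovery, path-insertion variant.
        f"{origin}/.well-known/openid-configuration{path}",
        # OpenID Connect Discovery, path-append variant.
        f"{origin}{path}/.well-known/openid-configuration",
    ]
```

Checking that the `resource` value matches the URL the client actually called is a cheap guard against being redirected to metadata for some other service.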

For MCP, current authorization guidance puts client registration in this order: use pre-registered client information if the client already has it, use a Client ID Metadata Document when the authorization server supports it, use Dynamic Client Registration as a fallback when supported, and prompt the user only when none of those are available. That ordering matters. It keeps DCR from becoming the only way to onboard a client and gives enterprises a cleaner path for known clients such as approved IDE agents.

MCP clients also need resource binding. The authorization request and token request should include a resource value for the MCP server, such as https://tools.company.example/mcp. That value prevents the client from getting a token that is too vague and prevents an MCP server from accepting a token minted for some other resource.
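A sketch of what resource binding looks like on the wire, assuming an authorization code flow with PKCE (S256). The endpoint, client ID, and redirect URI passed in are placeholders chosen by the caller; only the parameter names come from OAuth, PKCE (RFC 7636), and Resource Indicators (RFC 8707).

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def new_code_verifier() -> str:
    # 32 random bytes encode to a 43-character URL-safe verifier.
    return secrets.token_urlsafe(32)


def pkce_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scope, resource, verifier, state) -> str:
    """Build an authorization request bound to one protected resource."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "code_challenge": pkce_challenge(verifier),
        "code_challenge_method": "S256",
        # RFC 8707: name the exact resource the token is for.
        "resource": resource,
    })
    return f"{authorize_endpoint}?{query}"
```

The same `resource` value should be sent again on the token request, together with the original `code_verifier`, so the issued token carries the MCP server as its audience.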

auth.md should explain the same flow in words:

# Agent authentication for tools.company.example

This resource supports OAuth with PKCE for user-delegated agents.

Discovery:
- Protected resource metadata: /.well-known/oauth-protected-resource
- Authorization server: https://auth.company.example/realms/agents

Supported scopes:
- tickets:read
- logs:search
- runbooks:write

Client identity:
- Known clients may be pre-registered.
- Other clients may use a Client ID Metadata Document.
- Dynamic Client Registration is available as fallback.

Revocation:
- Users can disconnect agents in the Access Console.
- Admins can disable an agent client or remove a fine-grained grant.

That file should not contain secrets. It should contain instructions, endpoints, supported flows, scope names, and error behavior. The agent uses it to avoid guessing.

Gateway Plane: What Happens On Every Call

Every request should pass through the same hot path:

  1. Terminate TLS at the edge.
  2. Validate the token signature against the authorization server’s JWKS.
  3. Check issuer, audience, expiration, not-before time, and required scope.
  4. Parse the route or MCP tool name.
  5. Ask the policy engine whether the user and agent may perform that action.
  6. Apply rate limits, quotas, data-loss rules, redaction, or approval requirements.
  7. Exchange the token for a backend-specific token when needed.
  8. Proxy to a private backend over mTLS or private network routes.
  9. Emit a trace, metric, and audit event.
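Step 3 of that hot path can be sketched as a pure function over already-verified claims. This assumes the signature check in step 2 has passed and the JWT payload has been decoded into a dictionary; the function name, the 30-second leeway, and the reason strings are illustrative choices.

```python
import time


def validate_claims(claims, *, issuer, audience, required_scope,
                    now=None, leeway=30):
    """Return None when the claims pass, otherwise a short denial reason."""
    now = time.time() if now is None else now

    if claims.get("iss") != issuer:
        return "untrusted issuer"

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        return "wrong audience"

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now - leeway >= exp:
        return "token expired"

    nbf = claims.get("nbf")
    if nbf is not None and now + leeway < nbf:
        return "token not yet valid"

    # OAuth scopes travel as one space-separated string.
    if required_scope not in str(claims.get("scope", "")).split():
        return f"missing scope {required_scope}"

    return None
```

Returning a reason instead of a bare boolean keeps the denial explainable in the audit event emitted in step 9.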

An agent-native gateway has to understand agent traffic well enough to apply policy at the tool level. For REST, the route might be GET /tickets/:id. For MCP, the operation might be tools/call with a JSON body naming ticket.read. A path-only decision cannot safely distinguish that from ticket.delete.
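A small sketch shows why the body matters. It derives the operation name the policy engine should see, reading the JSON-RPC body for MCP `tools/call` requests and falling back to method plus path for REST. The `/mcp` path convention and the `tool:` and `mcp:` prefixes are assumptions made for illustration.

```python
import json


def operation_for_policy(http_method: str, path: str, body: bytes) -> str:
    """Name the operation to authorize, looking inside MCP calls."""
    is_mcp = http_method == "POST" and path.rstrip("/").endswith("/mcp")
    if not is_mcp:
        # REST: the route itself identifies the operation.
        return f"{http_method} {path}"

    message = json.loads(body)
    if not isinstance(message, dict):
        raise ValueError("expected a single JSON-RPC request object")

    if message.get("method") == "tools/call":
        params = message.get("params")
        name = params.get("name") if isinstance(params, dict) else None
        if not isinstance(name, str) or not name:
            raise ValueError("tools/call without a tool name")
        return f"tool:{name}"

    # Other MCP methods (initialize, tools/list, ...) get their own label.
    return f"mcp:{message.get('method')}"
```

Two requests to the same `/mcp` URL now produce `tool:ticket.read` and `tool:ticket.delete`, which the policy engine can treat differently.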

The gateway also needs response-side controls. Some tools return secrets, customer data, source code, logs, or financial records. Production policy should be able to redact fields, block high-risk responses, minimize what is sent back to the agent, and mark data classifications in the audit event. Otherwise the gateway only controls entry, while sensitive data leaves through the response.
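As a rough illustration of one response-side control, the function below masks values under sensitive keys in a JSON-like response before it returns to the agent. The key list is a made-up example; a real deployment would drive this from data classification policy rather than a hard-coded set.

```python
# Hypothetical list of field names that must never reach the agent.
SENSITIVE_KEYS = frozenset({"password", "api_key", "ssn"})


def redact(value, sensitive=SENSITIVE_KEYS):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in sensitive
            else redact(item, sensitive)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    # Scalars pass through unchanged.
    return value
```

The function returns a new structure and leaves the backend's original response untouched, so the unredacted version can still be classified for the audit event.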

Token Exchange: Why The Backend Should Not See The Broad Token

The token an agent sends to the gateway is useful at the gateway. It may include scopes for several tools and a gateway audience. A backend should not receive that token. If it does, the backend can accidentally become a confused deputy: it holds a token that might be valid somewhere else.

Token exchange fixes that.

The gateway validates the agent token, checks policy, then asks the authorization server for a new token with a narrower audience and narrower scope. The backend receives only the token minted for that backend.

If you use Keycloak for this, test the exact token-exchange behavior you need. Keycloak’s standard token exchange is the right direction for internal token downscoping, but current Keycloak docs call out limits: public clients cannot make token-exchange requests, the resource parameter is not supported yet, and RFC 8693 delegation semantics are not fully supported. In this architecture, the gateway should be the confidential requester client and the production rollout should test the audience, scope, refresh, and revocation behavior explicitly.
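For orientation, this is roughly what the gateway's exchange request carries, using the parameter names from RFC 8693. The sketch only builds the form body and assumes the gateway authenticates separately as a confidential client. It uses `audience` rather than `resource`, in line with the Keycloak limitation noted above. The helper name is hypothetical.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_form(subject_token: str, audience: str, scope: str) -> dict:
    """Form fields for swapping the agent's token for a backend-scoped one."""
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        # The token the agent presented to the gateway.
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        # Narrow the new token to one backend and one operation family.
        "audience": audience,
        "scope": scope,
    }


def encode_form(form: dict) -> str:
    """application/x-www-form-urlencoded body for the token endpoint."""
    return urlencode(form)
```

The response to test for is a token whose `aud` is the backend only and whose `scope` is no wider than what was requested.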

[Figure: Animated diagram showing a gateway token stopping at the gateway and a narrower backend token moving forward.]
The broad token is useful for the gateway. The backend gets a token whose audience is the backend and whose scope is the exact action that passed policy.

This also makes backend validation simpler. The backend checks:

| Check | Why |
| --- | --- |
| Issuer | Only trust the company authorization server. |
| Audience | Only accept tokens minted for this backend. |
| Expiration | Reject old tokens. |
| Scope | Confirm the gateway sent a token for the operation being performed. |
| Actor headers or claims | Preserve audit context if the backend logs locally too. |

The backend should still be private. Token validation is defense in depth, not a replacement for network isolation.

Policy Plane: Scopes Are Not Enough

Scopes are useful names. Enterprise authorization also needs resources, relationships, conditions, and audit.

tickets:read tells you the operation family. It does not tell you whether Mira can read ticket T-391, whether the agent is allowed to read tickets in the payments queue, whether exports require approval, or whether this action is blocked outside a production incident.

Fine-grained authorization engines model relationships:

user:mira              member   group:support-readers
group:support-readers  viewer   ticket:T-391
agent:docs-assistant   allowed  tool:ticket.read

At request time, the gateway asks:

Can user:mira read ticket:T-391?
Can agent:docs-assistant invoke tool:ticket.read?
Does this route require ticket.read?
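To show how those three questions compose, here is a toy in-memory version. It is not the OpenFGA or SpiceDB API: a real engine resolves relations through an authorization model, while this sketch expands only one level of group membership. All function names are illustrative.

```python
# Relationship tuples: (subject, relation, object).
TUPLES = {
    ("user:mira", "member", "group:support-readers"),
    ("group:support-readers", "viewer", "ticket:T-391"),
    ("agent:docs-assistant", "allowed", "tool:ticket.read"),
}


def check(subject, relation, obj, tuples=TUPLES):
    """Direct tuple match, or a match through one group the subject belongs to."""
    if (subject, relation, obj) in tuples:
        return True
    groups = {o for s, r, o in tuples if s == subject and r == "member"}
    return any((group, relation, obj) in tuples for group in groups)


def authorize(user, agent, tool, relation, resource, route_tools, tuples=TUPLES):
    """All three questions must be answered yes."""
    return (
        check(user, relation, resource, tuples)             # user on resource
        and check(agent, "allowed", f"tool:{tool}", tuples)  # agent on tool
        and tool in route_tools                              # route policy
    )
```

Deleting the single `agent:docs-assistant` tuple revokes the agent while leaving Mira's own access in place.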

OpenFGA is a good default because its model is easy to explain and its API is straightforward. SpiceDB is a strong alternative when you want a very mature, high-scale Zanzibar-style system. OPA or CEL-style policy is useful for request attributes that are not naturally relationship tuples: method, path, time, tenant, IP range, model risk score, or approval status.

The clean production pattern is to use both kinds of checks:

| Check type | Good for |
| --- | --- |
| Relationship check | User-to-resource, agent-to-tool, group-to-role, team-to-project. |
| Attribute or expression check | Request method, route, tenant, data classification, rate-limit tier, approval state. |

Do not bury this policy in every backend. Put it where every agent call passes.

MCP Changes The Granularity

MCP raises the authorization granularity.

An MCP server can expose many tools behind one endpoint. One tool might search docs. Another might open a ticket. Another might delete a repository, rotate a secret, or export customer data. If the gateway only knows “this is the MCP server,” it cannot make a safe decision.

The gateway must understand the MCP call enough to extract the tool name, subject, target resource, and requested action. Then it can apply per-tool policy.

[Figure: Animated diagram showing MCP tool calls receiving allow, approval, or deny decisions at the gateway.]
Treat MCP tools like production operations. `search_docs`, `read_ticket`, `export_data`, and `delete_repo` should not inherit the same permission just because they live behind the same MCP endpoint.

A production MCP gateway should support:

| Feature | Why it matters |
| --- | --- |
| Server registry | Platform teams need to know which MCP servers exist, who owns them, and which tools they expose. |
| Tool catalog | Agents and humans need stable tool names, descriptions, versions, and risk levels. |
| Per-tool authorization | Safe reads, risky writes, and admin actions need different grants. |
| Tool filtering | An agent should only discover tools it is allowed to call. |
| Tool-call audit | Logs should record the exact tool, input classification, decision, user, and agent. |
| Approval gates | Destructive or expensive tools should require a human confirmation or change ticket. |
| Rate limits and budgets | Tool calls can create cost, load, or data exposure. Limits should be per user, agent, team, and tool. |

Local MCP servers are wonderful for development. Production MCP servers need the same controls as any other privileged interface.
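Tool filtering from the list above is simple to sketch. Assuming the gateway already knows which tool names the current user and agent pair is granted, it can trim an MCP `tools/list` result before the agent sees it. The function name is illustrative.

```python
def filter_tools(tools_list_result: dict, granted: set[str]) -> dict:
    """Return a copy of a tools/list result containing only granted tools."""
    visible = [
        tool for tool in tools_list_result.get("tools", [])
        if tool.get("name") in granted
    ]
    # Keep any other fields of the result (such as a pagination cursor) as-is.
    return {**tools_list_result, "tools": visible}
```

Filtering discovery does not replace the per-call check. An agent that guesses a hidden tool name must still be denied at `tools/call` time.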

Third-Party Applications

The same pattern applies to SaaS tools: GitHub, Jira, ServiceNow, Snowflake, Salesforce, Datadog, internal support systems, or anything else an agent may need.

The wrong pattern is to give the agent a long-lived SaaS token. The better pattern is to put a connector behind the gateway.

[Figure: Animated diagram showing an agent calling a SaaS app through a gateway, policy check, and OpenBao token vault.]
For SaaS apps, the gateway owns consent, refresh, policy, and audit. The agent asks for an action. The connector uses the right scoped credential only after policy passes.

There are two common SaaS connector patterns.

| Pattern | Use it when | How it works |
| --- | --- | --- |
| Per-user OAuth connector | The SaaS app supports delegated OAuth and the action should happen as the employee. | The user consents once. The refresh token is stored in OpenBao. The gateway checks policy before the connector uses it. |
| Controlled service integration | The SaaS app is managed through a company service account or app installation. | The connector holds the service credential in OpenBao, but the gateway still records the human and agent context and applies policy before use. |

The first pattern is usually better for user-owned data. The second is often needed for admin APIs, app installations, or systems with weak delegated OAuth. In both cases, the agent does not see the secret.
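The ordering is the important part of both patterns: policy first, secret second. The sketch below shows that shape with injected callables standing in for the policy engine, the secret store, and the SaaS client. None of these names are real OpenBao or connector APIs.

```python
class PolicyDenied(Exception):
    """Raised when the gateway policy refuses the requested action."""


def call_saas(action, user, agent, *, is_allowed, fetch_credential, send):
    """Run one SaaS action for an agent without exposing the credential to it."""
    # 1. Policy is evaluated before any secret is read.
    if not is_allowed(user, agent, action):
        raise PolicyDenied(f"{agent} may not perform {action} for {user}")
    # 2. Only now does the connector load the credential from the vault.
    credential = fetch_credential(user)
    # 3. The connector calls the SaaS API; the agent sees only the result.
    return send(action, credential)
```

Because the credential is fetched inside the connector and never returned, a denied request leaves no trace in the secret store's access log, and an allowed one never hands the token to the agent.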

The connector should also normalize third-party quirks. Every SaaS app has its own scopes, rate limits, consent screens, pagination, and error codes. The agent-native gateway should make the company policy consistent even when the downstream APIs are not.

Secrets Plane

The gateway stack will handle sensitive material:

| Secret | Where it belongs |
| --- | --- |
| Authorization server signing keys | OpenBao or the authorization server’s secure key store with rotation. |
| Agent confidential client secrets | OpenBao, never committed and never logged. |
| Gateway client secret | OpenBao, mounted or fetched at runtime. |
| SaaS refresh tokens | OpenBao, namespaced per user, app, and tenant. |
| mTLS private keys | OpenBao or the service mesh certificate system. |
| Initial DCR tokens | Short-lived and stored like other platform secrets. |

OpenBao is a strong default because it is open source, supports secret engines, leases, policies, audit devices, and integrations. Cloud secret managers can be part of the substrate if your company is comfortable with that, but the architecture should not require agents or backends to carry raw secrets.

The rule is blunt: if a secret would let an agent call a system without the gateway, it does not belong in the agent.

Connectivity Plane

The backend should be unreachable except through the gateway.

For cloud-native systems, that usually means private subnets, internal load balancers, security groups or firewall rules, and mTLS between services. For on-prem systems, it means VPN or private interconnect routes. For Kubernetes, it may mean a gateway deployment inside the cluster or near the cluster, with a service mesh for east-west traffic.

The public surface should be small:

  1. The external HTTPS entry point for agent calls.
  2. The discovery endpoints.
  3. The authorization server login endpoints that must be public for SSO flows.

OpenFGA, OpenBao, databases, backends, and internal MCP servers should not be public. They are control plane and private service components, not internet products.

This network shape is part of the security model. A backend that is publicly reachable can accidentally become a second, weaker front door.

Observability And Audit

Agent calls need stronger observability than normal service calls because the same user request may cause many tool calls across many systems. When something goes wrong, “the agent did it” is too vague.

Every call should produce an audit event that a security or platform team can read without reconstructing a trace by hand.

[Figure: Diagram showing a readable audit answer and a structured audit event with agent, user, tool, resource, decision, reason, and trace.]
Good audit is both human-readable and structured. The human answer explains what happened. The structured event lets you search, alert, and join it to traces.

At minimum, log:

| Field | Example |
| --- | --- |
| Agent | docs-assistant |
| User | The employee’s email address |
| Client | codex-cli, claude-code, or a custom agent runtime |
| Tool or route | ticket.read or GET /tickets/:id |
| Resource | ticket:T-391 |
| Decision | allow, deny, approval_required, rate_limited |
| Reason | User group, agent grant, route policy, approval id, or missing grant. |
| Token audience | tickets-api |
| Trace id | The distributed trace that connects gateway, policy, and backend. |
| Data classification | Public, internal, confidential, regulated, if known. |

OpenTelemetry should carry traces. Prometheus should carry metrics. Logs should go to a searchable store. Audit events should be retained according to company policy and protected from casual modification.
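One way to hold those fields is a small immutable record that can render both forms: a structured JSON line and a one-sentence summary. The field names mirror the table above. The class itself, and any example values you pass to it, are illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    agent: str
    user: str
    client: str
    operation: str          # tool name or route
    resource: str
    decision: str           # allow, deny, approval_required, rate_limited
    reason: str
    token_audience: str
    trace_id: str
    data_classification: Optional[str] = None

    def to_json(self) -> str:
        """Structured form for search, alerting, and joining to traces."""
        return json.dumps(asdict(self), sort_keys=True)

    def summary(self) -> str:
        """Human-readable form: agent X acting for user Y."""
        return (
            f"agent {self.agent} acting for user {self.user}: "
            f"{self.decision} {self.operation} on {self.resource} ({self.reason})"
        )
```

Making the record frozen is a small nudge toward append-only handling. Protection from modification still has to come from the audit store itself.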

Useful alerts include deny spikes, new client registration spikes, token exchange failures, policy engine latency, unusual tool use, failed SSO login bursts, and expensive tool budgets crossing thresholds.

Access Console

REST APIs and CLIs are useful for automation, but they are awkward as the only day-to-day access console. Security, IT, service owners, and team leads need a place to inspect grants, approve changes, explain denials, and revoke access without editing tuples by hand.

There are open-source or partly open options, but they cover different parts of the job:

| Option | What it gives you | Fit |
| --- | --- | --- |
| Keycloak Admin Console | A mature UI for clients, users, sessions, identity providers, roles, and token settings. | Use it for the identity plane. It does not manage per-tool FGA grants by itself. |
| Topaz Console | A graphical console for viewing policies, using an evaluator, and managing directory objects and relationships in Topaz. | A strong open-source fit when you want OPA plus Zanzibar-style directory data with a built-in UI. |
| ContextForge Admin UI | A gateway UI for MCP servers, tools, resources, prompts, federated gateways, roots, and metrics. | Good fit for the MCP registry and tool catalog layer. Pair it with identity and FGA controls. |
| OpenFGA Playground | A visual local modeling and tuple-testing UI. | Useful for development. OpenFGA explicitly warns against enabling it in production and recommends CLI, VS Code, and model files for production management. |
| Permify | An open-source Zanzibar-style authorization service with a playground in the repo. | Worth evaluating, especially after its FusionAuth acquisition. Verify the admin UI story against your production workflows. |
| SpiceDB Playground plus AuthZed UI | SpiceDB is open source, and its Playground is open source and self-hostable for schema and relationship modeling. AuthZed has broader management UI in its hosted and dedicated products. | Excellent engine. Treat the Playground as a modeling UI; treat the full management dashboard as a product choice. |

For a production agent-native gateway, I would separate two UI needs. The MCP catalog UI manages servers, tools, prompts, resources, and health. The access UI manages who can use those tools, why the grant exists, when it expires, and how to revoke it.

That console should let authorized admins:

| Action | Why |
| --- | --- |
| See registered agents | Know who owns each agent, where it runs, and when it last acted. |
| Grant a tool | Approve that agent X may call tool Y for group Z. |
| Revoke a grant | Remove access without changing backend code. |
| Disconnect a user-agent pair | End one user’s delegation to one agent. |
| Review stale grants | Remove agents or grants not used in 30, 60, or 90 days. |
| Explain a denial | Show which check failed: user, agent, scope, route, policy, approval, or rate limit. |
| Export audit | Support security review and incident response. |

If Topaz fits your policy model, its console gets you closer to this on day one. If you choose OpenFGA or SpiceDB, plan on building a thin enterprise Access Console over the authorization engine, Keycloak, and the audit store. Keep it small, put it behind SSO, restrict it to platform, security, and delegated service owners, and log every change it makes.

What This Enables

Once the gateway exists, a lot of enterprise agent work becomes safer and less awkward.

Capability | What changes
URL-first onboarding | An agent can receive a gateway URL and discover how to authenticate.
One-time login | A human signs in once with normal company SSO instead of copying credentials into the agent.
Agent identity | Every action names the agent separately from the user.
Per-tool access | A coding agent can read tickets without being able to delete repos or export financial data.
Private API access | Backends stay on private networks while agents reach them through the governed edge.
SaaS connectors | Third-party app access uses scoped OAuth or controlled service integrations, not pasted tokens.
Audit | Security can answer who, what, when, why, and through which agent.
Revocation | Disable the user, revoke the agent, remove a grant, or kill the session.
Policy-as-code | Routes, scopes, and grants can move through review instead of hallway permission changes.
Safer writes | Destructive actions can require approval, rate limits, or change windows.
Tool catalog | Teams can publish MCP tools with ownership, versioning, and risk labels.
Consistent backend onboarding | New APIs do not each implement OAuth, RBAC, audit, and token handling from scratch.

The biggest organizational change is that onboarding a new backend becomes a platform operation, not an auth project. The service team deploys its backend privately. The platform team adds a route. The service owner defines scopes and tool names. Security approves grants. The gateway enforces the same pattern every time.

Production Hardening Checklist

A demo can prove the flow. Production has to survive mistakes, scale, audits, and offboarding.

Identity And Tokens

  • Use short-lived access tokens. Fifteen minutes or less is a common starting point.
  • Use refresh-token rotation for delegated agents.
  • Revoke refresh tokens when the user disconnects the agent, leaves the company, or loses the group that granted access.
  • Use DPoP or mTLS-bound tokens where your clients and authorization server support them. Sender-constrained tokens reduce replay risk.
  • Restrict client onboarding. Do not allow unlimited anonymous registration in production. Prefer pre-registration for known clients, Client ID Metadata Documents for no-prior-relationship clients, and tightly controlled DCR as a fallback. Use attestation, domain allow-lists, rate limits, and review.
  • Keep token audiences narrow. A gateway token should not be a backend token.
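
Two of the token rules above, short lifetimes and narrow audiences, can be checked on every request. Here is a minimal sketch that assumes the token signature has already been verified elsewhere and works on the decoded claims. `check_claims` and its reason strings are hypothetical names; `aud`, `exp`, and `iat` are the standard JWT claims.

```python
import time
from typing import Optional

MAX_LIFETIME_SECONDS = 15 * 60  # the "fifteen minutes or less" starting point


def check_claims(claims: dict, expected_audience: str, now: Optional[float] = None) -> str:
    """Return "ok" or a short reason. Assumes the signature was verified upstream."""
    now = time.time() if now is None else now

    # "aud" may be a single string or a list; normalize to a list.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audiences != [expected_audience]:
        return "wrong_audience"  # narrow: exactly one audience, and it is this one

    exp, iat = claims.get("exp"), claims.get("iat")
    if exp is None or iat is None:
        return "missing_exp_or_iat"
    if now >= exp:
        return "expired"
    if exp - iat > MAX_LIFETIME_SECONDS:
        return "lifetime_too_long"
    return "ok"
```

Rejecting a token that lists more than one audience is stricter than most JWT libraries by default. That strictness is a design choice in this sketch, made so that a gateway token cannot double as a backend token.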

Gateway

  • Run at least two replicas across zones.
  • Keep the gateway stateless.
  • Cache JWKS safely and refresh on key rotation.
  • Apply request and response size limits.
  • Add rate limits per user, agent, team, tool, and tenant.
  • Add SSRF protections for connector-style tools.
  • Treat MCP tool names as authorization inputs.
  • Deny unknown tools by default.
  • Make approval-required a first-class decision, not an exception path.
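
The last three items in this list fit together in one small decision function. The sketch below is illustrative only: `TOOL_CATALOG`, `Verdict`, and `decide` are hypothetical names, and `repo.delete` is an invented tool name standing in for a destructive action.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL_REQUIRED = "approval_required"  # a real outcome, not an error case


# Hypothetical catalog: registered tool name -> risk metadata.
TOOL_CATALOG = {
    "ticket.read": {"needs_approval": False},
    "repo.delete": {"needs_approval": True},
}


def decide(tool_name: str, granted_tools: set, approved: bool = False) -> Verdict:
    """The tool name is the authorization input. Unknown tools are denied."""
    entry = TOOL_CATALOG.get(tool_name)
    if entry is None:
        return Verdict.DENY  # not in the catalog: deny by default
    if tool_name not in granted_tools:
        return Verdict.DENY  # in the catalog, but this agent has no grant
    if entry["needs_approval"] and not approved:
        return Verdict.APPROVAL_REQUIRED
    return Verdict.ALLOW
```

Returning `APPROVAL_REQUIRED` as its own verdict lets the caller start an approval flow and retry, instead of treating the response as a failure.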

Policy

  • Keep route definitions and policy in version control.
  • Review policy changes like code.
  • Use relationship checks for user, group, agent, tool, and resource grants.
  • Use expression policy for request attributes.
  • Test allow and deny cases for every new tool.
  • Make grants expire when possible.

Secrets

  • Store client secrets, signing keys, and SaaS tokens in OpenBao or an approved secret store.
  • Rotate secrets.
  • Never log tokens.
  • Never put secrets in auth.md, MCP config, generated examples, or agent prompts.
  • Use separate credentials for non-prod and prod.

Network

  • Put backends in private networks.
  • Allow backend ingress only from the gateway or internal load balancer.
  • Use mTLS for service-to-service calls.
  • Keep OpenFGA, OpenBao, databases, and internal MCP servers private.
  • Add WAF rules and DDoS protections at the public edge.

Data And Audit

  • Classify tools by data sensitivity and action risk.
  • Apply response-side DLP, redaction, and data minimization before returning sensitive tool output to the agent.
  • Redact or hash sensitive request fields in logs.
  • Keep enough input context for incident response, but do not store raw secrets or unnecessary customer data.
  • Join gateway traces to backend traces.
  • Alert on abnormal denies, new grants, high-risk tool use, and token exchange failures.
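
"Redact or hash" can be done in one small function that runs before anything reaches the log pipeline. The sketch below is an assumption-heavy example: the field names in `REDACT_KEYS` and `HASH_KEYS` are invented, and the keyed hash assumes a secret `hash_key` that comes from your secret store.

```python
import hashlib
import hmac

# Hypothetical field names. Tune both sets to your own request schema.
REDACT_KEYS = {"authorization", "token", "password", "client_secret"}
HASH_KEYS = {"customer_email"}  # still joinable across log lines, but not readable


def scrub(fields: dict, hash_key: bytes) -> dict:
    """Return a copy of fields that is safe to log."""
    out = {}
    for name, value in fields.items():
        lowered = name.lower()
        if lowered in REDACT_KEYS:
            out[name] = "[REDACTED]"
        elif lowered in HASH_KEYS:
            # Keyed hash: a plain SHA-256 of an email address is easy to guess back.
            digest = hmac.new(hash_key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
            out[name] = "hmac-sha256:" + digest[:16]
        else:
            out[name] = value
    return out
```

The same input always produces the same hash under one key, so incident responders can still follow one customer across log lines without seeing the raw value.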

Availability

  • Use durable databases for Keycloak and the authorization engine.
  • Run database backups and restore drills.
  • Keep infrastructure in OpenTofu or another reviewable IaC system.
  • Write runbooks for revoking an agent, revoking a user, rotating a client secret, disabling one tool, and bypassing a broken connector.

A Rollout That Does Not Scare Everyone

Do not start with an agent that can write to every production system. Start where the risk is low and the learning is high.

Diagram showing a phased rollout from observe, read-only, private APIs, MCP fleet, write actions, to SaaS apps.
A good rollout climbs by risk. Read-only access teaches the platform team how agents behave before writes, approvals, and third-party app actions enter the system.

Phase 0 is observation. Put one or two low-risk tools behind the gateway. Log everything. Do not allow writes. Learn what agents call, how often they retry, what errors they produce, and which prompts cause tool use.

Phase 1 is read-only work. Documentation search, ticket read, code search, build status, metrics lookup, and runbook retrieval are good candidates. They create clear value without letting the agent change state.

Phase 2 is one private API. Pick a service with a clear owner, simple resource model, and willing users. Make the backend private. Add token exchange. Write allow and deny tests.

Phase 3 is the MCP fleet. Register MCP servers. Catalog tools. Add per-tool policy. Hide unauthorized tools from discovery. Require ownership metadata and versioning.

Phase 4 is writes. Add approval gates, rate limits, budgets, idempotency keys, change windows, and rollback paths. Start with reversible writes before irreversible writes.
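
Idempotency keys matter in Phase 4 because agents retry. A minimal sketch of the idea follows. `IdempotentWriter` is a hypothetical name, and the in-memory dictionary stands in for a shared store, since a stateless gateway would keep these results somewhere outside the process.

```python
from typing import Any, Callable


class IdempotentWriter:
    """Apply each write at most once per idempotency key and replay the stored result."""

    def __init__(self, apply: Callable[[dict], Any]) -> None:
        self._apply = apply
        self._results: dict = {}  # assumption: replace with a shared store in production

    def write(self, idempotency_key: str, payload: dict) -> Any:
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # retry: do not touch the backend again
        result = self._apply(payload)
        self._results[idempotency_key] = result
        return result
```

This sketch does not handle two concurrent requests that carry the same key, or a retry that arrives with a different payload. A real implementation needs a decision on both.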

Phase 5 is third-party applications. Add OAuth connectors and SaaS token vault patterns. This is where lifecycle and audit matter most because the data often lives outside your network.

Lifecycle diagram showing discover, register, grant, use, audit, and review around an agent-native gateway.
Production is the loop: discover, register, grant, use, audit, review, and revoke. A gateway that cannot revoke cleanly is not production-ready.

What A New Backend Team Should Have To Do

A backend team should be able to publish agent access through a repeatable platform process:

  1. Deploy the backend privately.
  2. Define the operations: routes for APIs, tools for MCP.
  3. Choose scopes: orders:read, orders:write, logs:search.
  4. Define resource relationships: team, project, tenant, ticket, repository, or environment.
  5. Add a gateway route or MCP registration.
  6. Add OpenFGA or SpiceDB relationship rules.
  7. Add request policy for risky actions.
  8. Publish discovery metadata and auth.md updates.
  9. Write tests for allow, deny, expired token, wrong audience, missing scope, missing grant, and revoked session.
  10. Watch audit logs in the first rollout.
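
Several of these steps can be captured in one reviewable manifest and linted before the route goes live. The sketch below is hypothetical: the manifest shape, the field names, and `validate_manifest` are assumptions for illustration, not a format any gateway product defines.

```python
def validate_manifest(manifest: dict) -> list:
    """Return a list of problems. An empty list means the manifest passes this lint."""
    problems = []
    if not manifest.get("private", False):
        problems.append("backend must be deployed privately")
    operations = manifest.get("operations", [])
    if not operations:
        problems.append("no operations defined")
    declared_scopes = set(manifest.get("scopes", []))
    for op in operations:
        if op.get("scope") not in declared_scopes:
            problems.append(f"operation {op.get('name')!r} uses an undeclared scope")
    if not manifest.get("resource_types"):
        problems.append("no resource relationships defined")
    return problems


# Hypothetical manifest for an orders service, using the scope names from step 3.
ORDERS_MANIFEST = {
    "backend": "orders",
    "private": True,
    "scopes": ["orders:read", "orders:write"],
    "operations": [
        {"name": "get_order", "scope": "orders:read"},
        {"name": "cancel_order", "scope": "orders:write"},
    ],
    "resource_types": ["tenant", "team"],
}
```

A lint like this only covers the declarative steps. The allow and deny tests in step 9 still have to run against the real gateway and authorization engine.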

That is still work, but it is platform work with a known shape. It is not “every team invents auth.”

Common Mistakes

The most common mistake is forwarding the agent’s broad token to the backend. Use token exchange. Give each backend a token meant only for that backend.
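
For reference, a token exchange request under RFC 8693 is a form-encoded POST to the authorization server's token endpoint. The sketch below only builds the form body. The parameter names and URNs come from the RFC, `token_exchange_form` is a hypothetical helper, and which parameters your authorization server accepts is something to confirm in its documentation.

```python
from urllib.parse import urlencode


def token_exchange_form(subject_token: str, audience: str, scope: str) -> str:
    """Build the form body that trades the agent's gateway token for a backend token."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,  # the token the agent presented to the gateway
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,  # exactly one backend
        "scope": scope,        # only what this call needs
    })
```

The gateway sends this body with its own client authentication and forwards only the returned token to the backend. The agent's original token never leaves the gateway.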

The next mistake is treating MCP access as all-or-nothing. The tool is the permission boundary, not the server.

Another mistake is trusting scopes alone. Scopes are names. They do not know which ticket, repository, customer, tenant, environment, or table is being accessed.
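
A small sketch makes the gap concrete. The scope check answers "may this token read tickets at all", and the relationship check answers "may this user read this ticket". The tuple set below is a stand-in for an engine such as OpenFGA or SpiceDB, and `can_access` is a hypothetical name.

```python
def can_access(token_scopes: set, required_scope: str,
               tuples: set, user: str, relation: str, resource: str) -> bool:
    """Both checks must pass: the scope names the action, the tuple names the resource."""
    if required_scope not in token_scopes:
        return False
    return (user, relation, resource) in tuples


# Mira can view ticket 101 but has no relationship to ticket 202.
tuples = {("mira", "viewer", "ticket:101")}
scopes = {"ticket.read"}
assert can_access(scopes, "ticket.read", tuples, "mira", "viewer", "ticket:101")
assert not can_access(scopes, "ticket.read", tuples, "mira", "viewer", "ticket:202")
```

With scopes alone, both calls would have been allowed.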

Another mistake is making Dynamic Client Registration fully open. DCR is useful because agents can register themselves. In production, registration still needs guardrails.

Another mistake is building a gateway with no Access Console. If every grant needs a database edit or manual work by a platform engineer, the system will become a bottleneck or drift out of date.

The quiet mistake is weak offboarding. A disabled user must lose delegated agent access. A revoked agent must stop refreshing. A deleted grant must affect the next call. Do not rely only on token expiration.
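
One way to avoid relying on token expiration is to consult revocation state on every call, even when the token is still valid. A minimal sketch follows, with hypothetical names (`RevocationState`, `may_proceed`) and in-memory sets standing in for a shared store.

```python
from dataclasses import dataclass, field


@dataclass
class RevocationState:
    disabled_users: set = field(default_factory=set)
    revoked_agents: set = field(default_factory=set)
    ended_pairs: set = field(default_factory=set)  # (user, agent) delegations that were ended


def may_proceed(state: RevocationState, user: str, agent: str) -> bool:
    """Checked on every call, so a revocation affects the very next request."""
    if user in state.disabled_users:
        return False
    if agent in state.revoked_agents:
        return False
    if (user, agent) in state.ended_pairs:
        return False
    return True
```

The same check belongs on the refresh path, so that a revoked agent cannot mint a new access token after the current one expires.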

The Mental Model

An agent-native gateway is a control point that makes agent access boring.

Keep secrets out of agents. Keep runtime-specific details out of backends. Keep tokens out of copy-pasted setup. Keep OAuth out of every service team’s backlog. Give the security team audited tool calls instead of a new blind spot.

The gateway gives the company one sentence it can defend:

This agent, acting for this user, may perform this action on this resource, for this reason, through this private path, and we can revoke it.

That is the production bar.

Further Reading