agate-proxy
The proxy bounded context: an inline reverse proxy that inspects LLM-agent traffic and decides — per event — whether to allow, deny, transform, buffer, or terminate it.
agate-proxy is the data plane. The inspection core is
protocol-agnostic: the wire protocol (AG-UI first, an agent ↔ LLM adapter
later) enters through an adapter that translates wire events into domain
events. See the Threat Model for the full design.
Responsibility
- Terminate TLS and accept the AG-UI request (`RunAgentInput`).
- Request leg (preventive): validate, authorize, and size-bound the request before forwarding, rejecting early so the agent never runs on bad input.
- Response leg (streaming): parse the SSE event stream incrementally and, per event, apply a verdict: forward, redact, or terminate.
- Buffer tool-call argument fragments between `TOOL_CALL_START` and `TOOL_CALL_END` so a verdict sees the complete arguments.
- Feed each `(event, verdict)` to the audit sink, off the forwarding hot path.
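The incremental parsing on the response leg can be illustrated with a small frame splitter. This is a minimal sketch, not the project's codec: `SseFrameSplitter` and `push` are hypothetical names, and it assumes LF line endings and chunks that arrive on UTF-8 boundaries. It only shows how frames that straddle chunk boundaries can be held back until complete while order is preserved.

```rust
/// Hypothetical incremental SSE splitter: accepts arbitrary text chunks and
/// returns the `data:` payload of every frame completed so far, in order.
#[derive(Default)]
pub struct SseFrameSplitter {
    /// Text received but not yet terminated by a blank line.
    pending: String,
}

impl SseFrameSplitter {
    pub fn push(&mut self, chunk: &str) -> Vec<String> {
        self.pending.push_str(chunk);
        let mut frames = Vec::new();
        // A frame is complete once a blank line ("\n\n") has been seen.
        while let Some(end) = self.pending.find("\n\n") {
            let raw: String = self.pending.drain(..end + 2).collect();
            let data: Vec<&str> = raw
                .lines()
                .filter_map(|line| line.strip_prefix("data:"))
                // One optional space after the colon is not part of the payload.
                .map(|rest| rest.strip_prefix(' ').unwrap_or(rest))
                .collect();
            if !data.is_empty() {
                // Multiple data lines in one frame are joined with newlines.
                frames.push(data.join("\n"));
            }
        }
        frames
    }
}
```

Each returned payload would then be decoded into a domain event and passed to inspection; anything left in `pending` waits for the next chunk.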
The inspection state machine
```mermaid
stateDiagram-v2
    [*] --> RequestValidation
    RequestValidation --> Rejected: invalid / unauthorized / oversized
    RequestValidation --> Streaming: forwarded to agent
    Streaming --> Streaming: Allow / Transform (per event)
    Streaming --> Buffering: TOOL_CALL_START
    Buffering --> Buffering: TOOL_CALL_ARGS
    Buffering --> Streaming: TOOL_CALL_END → verdict on full call
    Streaming --> Terminated: Deny / Terminate (RUN_ERROR)
    Streaming --> [*]: RUN_FINISHED
    Rejected --> [*]
    Terminated --> [*]
```
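The streaming half of the diagram can be sketched as a pure transition function. The `State`, `Event`, and `step` names are hypothetical, the policy is reduced to a boolean `allow` callback, and two simplifications are assumed: a denied tool call goes straight from `Buffering` to `Terminated`, and any event ordering the diagram does not list fails closed.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Streaming,
    /// Accumulates argument fragments of the tool call in progress.
    Buffering { args: String },
    Terminated,
    Finished,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    ToolCallStart,
    ToolCallArgs(String),
    ToolCallEnd,
    RunFinished,
    /// Any other event, carrying its inspectable text.
    Other(String),
}

/// Advances the state machine by one event. `allow` stands in for the policy.
pub fn step(state: State, event: Event, allow: &dyn Fn(&str) -> bool) -> State {
    match (state, event) {
        (State::Streaming, Event::ToolCallStart) => State::Buffering { args: String::new() },
        (State::Buffering { mut args }, Event::ToolCallArgs(fragment)) => {
            args.push_str(&fragment);
            State::Buffering { args }
        }
        // The verdict is taken on the complete, reassembled arguments.
        (State::Buffering { args }, Event::ToolCallEnd) => {
            if allow(&args) { State::Streaming } else { State::Terminated }
        }
        (State::Streaming, Event::Other(text)) => {
            if allow(&text) { State::Streaming } else { State::Terminated }
        }
        (State::Streaming, Event::RunFinished) => State::Finished,
        // Terminal states absorb every further event.
        (State::Terminated, _) => State::Terminated,
        (State::Finished, _) => State::Finished,
        // Assumption: unexpected orderings fail closed.
        _ => State::Terminated,
    }
}
```

Keeping the transition function free of I/O matches the pure domain layer described under Layering and makes each arrow of the diagram testable in isolation.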
The event → verdict seam
Inspection produces, per event (or per buffered logical unit), a verdict:
| Verdict | Meaning |
|---|---|
| `Allow` | forward unchanged |
| `Deny(reason)` | block; on the response leg, surface as `RUN_ERROR` |
| `Transform(replacement)` | forward a modified event (e.g. redacted content) |
| `Buffer` | need more frames before deciding (e.g. mid tool-call) |
| `Terminate(reason)` | end the run/stream |
This single seam is where two contexts plug in via ports:
agate-policy computes the verdict (a PolicyPort), and
agate-audit records (event, verdict). The proxy depends only
on the ports; the concrete policy and audit adapters are injected at the
server composition root. The first milestone ships a trivial
allow-all policy adapter behind PolicyPort.
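A minimal sketch of this seam, under stated assumptions: the variant names follow the table above, but the field layout of `InspectedEvent`, the method names `decide` and `record`, the `AuditSink` trait, and the `inspect` helper are all hypothetical. Recording is done inline here for brevity, whereas the proxy feeds the audit sink off the forwarding hot path.

```rust
use std::sync::Mutex;

/// Hypothetical shape of a protocol-agnostic event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InspectedEvent {
    pub kind: String,
    pub content: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny(String),
    Transform(InspectedEvent),
    Buffer,
    Terminate(String),
}

/// Port through which a policy context would supply verdicts.
pub trait PolicyPort {
    fn decide(&self, event: &InspectedEvent) -> Verdict;
}

/// Port through which an audit context would receive `(event, verdict)`.
pub trait AuditSink {
    fn record(&self, event: &InspectedEvent, verdict: &Verdict);
}

/// Trivial allow-all policy adapter.
pub struct AllowAll;

impl PolicyPort for AllowAll {
    fn decide(&self, _event: &InspectedEvent) -> Verdict {
        Verdict::Allow
    }
}

/// In-memory audit adapter, useful in tests.
#[derive(Default)]
pub struct MemoryAudit(pub Mutex<Vec<(InspectedEvent, Verdict)>>);

impl AuditSink for MemoryAudit {
    fn record(&self, event: &InspectedEvent, verdict: &Verdict) {
        self.0.lock().unwrap().push((event.clone(), verdict.clone()));
    }
}

/// Applies a verdict: returns the event to forward, or `None` when nothing
/// is forwarded for this event (buffered, denied, or terminated).
pub fn inspect(
    policy: &dyn PolicyPort,
    audit: &dyn AuditSink,
    event: InspectedEvent,
) -> Option<InspectedEvent> {
    let verdict = policy.decide(&event);
    audit.record(&event, &verdict);
    match verdict {
        Verdict::Allow => Some(event),
        Verdict::Transform(replacement) => Some(replacement),
        Verdict::Buffer | Verdict::Deny(_) | Verdict::Terminate(_) => None,
    }
}
```

Because `inspect` only sees trait objects, swapping `AllowAll` for a real policy adapter would not touch the proxy code, which is the point of injecting adapters at the composition root.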
Domain language
- `Session` / `Run`: the inspection aggregate(s).
- `InspectedEvent`: protocol-agnostic event value objects.
- `Verdict`: the decision value object listed above.
Layering
| Layer | Contents |
|---|---|
| `domain` | Pure: inspection aggregates, `InspectedEvent`, `Verdict`. No I/O. |
| `application` | Use cases and ports: `PolicyPort` (verdict source), an audit sink, the upstream agent client, a `ProxyMetrics` recorder. |
| `infrastructure` | Adapters: the AG-UI SSE codec (incremental, order-preserving), `RunAgentInput` validation, the HTTP client to the agent. |
| `presentation` | HTTP/SSE handlers (axum/hyper), TLS termination, request/response wiring. |
| `setup` | Composition root: `ProxyConfig` (`AGENT_ENDPOINT`, `BIND_ADDR`), assembly. |
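As an illustration of the `setup` layer, the sketch below loads a `ProxyConfig` from a key lookup. It assumes that `AGENT_ENDPOINT` and `BIND_ADDR` are environment variables, that both are required, and that `BIND_ADDR` is a socket address; the field names, `ConfigError`, and `from_lookup` are hypothetical.

```rust
use std::net::SocketAddr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProxyConfig {
    /// Where runs are forwarded (assumed to be a URL string).
    pub agent_endpoint: String,
    /// Address the proxy listens on.
    pub bind_addr: SocketAddr,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    Missing(&'static str),
    InvalidBindAddr(String),
}

impl ProxyConfig {
    /// Builds the config from any key lookup, so tests need not touch the
    /// process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, ConfigError> {
        let agent_endpoint =
            lookup("AGENT_ENDPOINT").ok_or(ConfigError::Missing("AGENT_ENDPOINT"))?;
        let raw = lookup("BIND_ADDR").ok_or(ConfigError::Missing("BIND_ADDR"))?;
        let bind_addr = raw
            .parse::<SocketAddr>()
            .map_err(|_| ConfigError::InvalidBindAddr(raw.clone()))?;
        Ok(Self { agent_endpoint, bind_addr })
    }

    /// Reads the same keys from the process environment.
    pub fn from_env() -> Result<Self, ConfigError> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

The composition root would call something like `from_env` once at startup and then hand the resulting values to the adapters it assembles.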
Request-leg inspection
Before a run is forwarded, the proxy inspects the incoming RunAgentInput
(preventive — the agent never runs on rejected input):
- Validation: a body that is not valid `RunAgentInput` JSON is rejected with `400`; its size is already bounded (see Configuration).
- Tool authorization: each tool the client offers is judged by the same `PolicyPort` as the response leg (projected onto an `AgentEvent`); an offered tool the policy denies rejects the run with `403`.
- Secret markers: a user message carrying a configured redaction marker is treated as a policy hit and rejected with `403`.
- SSRF guard: user message text is scanned for URLs; a non-`http(s)` scheme or a loopback / private / link-local host (covering the cloud metadata address) rejects the run with `403`. Best-effort and literal-host only: it does not resolve DNS, so DNS rebinding is out of scope.
Each rejection is recorded to the audit sink, so request-leg denials sit in the transparency log alongside response-leg decisions.
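The SSRF guard described above can be approximated with the standard library alone. This is a sketch under assumptions, not the project's scanner: `check_user_message` and `is_forbidden_host` are hypothetical names, URLs are found by splitting on whitespace and `://`, and `localhost` plus the unspecified address are rejected in addition to the ranges the text names. As in the description, DNS names are never resolved.

```rust
use std::net::IpAddr;

/// Literal-host check only: a DNS name is never resolved here.
fn is_forbidden_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    match host.parse::<IpAddr>() {
        // 169.254.0.0/16 is link-local, which covers 169.254.169.254.
        Ok(IpAddr::V4(ip)) => {
            ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_unspecified()
        }
        Ok(IpAddr::V6(ip)) => {
            let first = ip.segments()[0];
            ip.is_loopback()
                || ip.is_unspecified()
                || (first & 0xffc0) == 0xfe80 // fe80::/10, link-local
                || (first & 0xfe00) == 0xfc00 // fc00::/7, unique local
        }
        Err(_) => false,
    }
}

/// Scans message text for URLs and returns `Err(reason)` on the first
/// disallowed scheme or host.
pub fn check_user_message(text: &str) -> Result<(), String> {
    for token in text.split_whitespace() {
        let (scheme, rest) = match token.split_once("://") {
            Some(parts) => parts,
            None => continue,
        };
        // Tolerate leading punctuation such as "(" before the scheme.
        let scheme = scheme.trim_start_matches(|c: char| !c.is_ascii_alphanumeric());
        if !scheme.eq_ignore_ascii_case("http") && !scheme.eq_ignore_ascii_case("https") {
            return Err(format!("scheme not allowed: {}", scheme));
        }
        // The authority ends at the first '/', '?' or '#'.
        let authority = rest
            .split(|c: char| c == '/' || c == '?' || c == '#')
            .next()
            .unwrap_or("");
        // Drop any "user:pass@" prefix.
        let host_port = authority.rsplit('@').next().unwrap_or(authority);
        let host = match host_port.strip_prefix('[') {
            // Bracketed IPv6 literal, e.g. "[::1]:8080".
            Some(stripped) => stripped.split(']').next().unwrap_or(""),
            None => host_port
                .split(':')
                .next()
                .unwrap_or("")
                .trim_end_matches(|c: char| !c.is_ascii_alphanumeric()),
        };
        if is_forbidden_host(host) {
            return Err(format!("host not allowed: {}", host));
        }
    }
    Ok(())
}
```

An `Err` from such a check would map to the `403` rejection and be written to the audit sink like any other request-leg denial.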
Observability
Data-plane metrics go through a `ProxyMetrics` port and are never emitted with
`counter!` directly from the presentation layer. The run handler and the
streaming inspector record `agate_runs_total`,
`agate_events_inspected_total{outcome}`, and `agate_upstream_errors_total`
through the injected port, so `inspect_stream` takes the port as a parameter
and can be unit-tested with a fake recorder. The real adapter emits through the
`metrics` facade, which is a no-op until the server installs a Prometheus
recorder.
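The testability argument can be sketched with a fake recorder. The trait methods, the `Outcome` variants, the label values, and this simplified `inspect_stream` signature are assumptions for illustration; only the metric names come from the text above. The toy inspector denies any event containing the word `secret`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Allowed,
    Denied,
}

impl Outcome {
    fn label(self) -> &'static str {
        match self {
            Outcome::Allowed => "allowed",
            Outcome::Denied => "denied",
        }
    }
}

/// Hypothetical shape of the metrics port.
pub trait ProxyMetrics {
    fn run_started(&self);
    fn event_inspected(&self, outcome: Outcome);
    fn upstream_error(&self);
}

/// Fake recorder that counts in memory instead of emitting metrics.
#[derive(Default)]
pub struct FakeMetrics {
    counts: Mutex<HashMap<String, u64>>,
}

impl FakeMetrics {
    fn bump(&self, key: String) {
        *self.counts.lock().unwrap().entry(key).or_insert(0) += 1;
    }

    pub fn get(&self, key: &str) -> u64 {
        self.counts.lock().unwrap().get(key).copied().unwrap_or(0)
    }
}

impl ProxyMetrics for FakeMetrics {
    fn run_started(&self) {
        self.bump("agate_runs_total".to_string());
    }

    fn event_inspected(&self, outcome: Outcome) {
        self.bump(format!(
            "agate_events_inspected_total{{outcome=\"{}\"}}",
            outcome.label()
        ));
    }

    fn upstream_error(&self) {
        self.bump("agate_upstream_errors_total".to_string());
    }
}

/// Toy inspector: takes the port as a parameter and returns forwarded events.
pub fn inspect_stream<'a>(events: &[&'a str], metrics: &dyn ProxyMetrics) -> Vec<&'a str> {
    let mut forwarded = Vec::new();
    for &event in events {
        if event.contains("secret") {
            metrics.event_inspected(Outcome::Denied);
        } else {
            metrics.event_inspected(Outcome::Allowed);
            forwarded.push(event);
        }
    }
    forwarded
}
```

A production adapter would implement the same trait by calling into the metrics facade, leaving the inspector code unchanged.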
Fail-open vs fail-closed
A real policy may consult external services and could hang. FailModePolicy
decorates the PolicyPort, bounding each decision by [policy].decision_timeout_ms
and applying the configured mode when it is exceeded: fail-open forwards the
event, fail-closed stops it. The default is closed (safety over
availability). On the response leg a fail-closed timeout terminates the stream;
on the request leg it rejects the run before forwarding. The decorator wraps any policy by composition, so policy
implementations never learn about timeouts — see
Configuration.
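The decorator idea can be sketched with standard-library threads and a channel. This is an assumption-laden illustration: the real proxy is built on an async stack and would more plausibly use an async timeout, and the simplified `PolicyPort`, `Verdict`, `FailMode`, and constructor shown here are hypothetical. It demonstrates only that the timeout wraps the inner policy by composition.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny(String),
}

/// Simplified port: events are plain strings in this sketch.
pub trait PolicyPort: Send + Sync + 'static {
    fn decide(&self, event: &str) -> Verdict;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailMode {
    Open,
    Closed,
}

impl Default for FailMode {
    /// Safety over availability: closed unless configured otherwise.
    fn default() -> Self {
        FailMode::Closed
    }
}

/// Wraps any policy and bounds each decision by a timeout.
pub struct FailModePolicy<P> {
    inner: Arc<P>,
    timeout: Duration,
    mode: FailMode,
}

impl<P: PolicyPort> FailModePolicy<P> {
    pub fn new(inner: P, decision_timeout_ms: u64, mode: FailMode) -> Self {
        Self {
            inner: Arc::new(inner),
            timeout: Duration::from_millis(decision_timeout_ms),
            mode,
        }
    }
}

impl<P: PolicyPort> PolicyPort for FailModePolicy<P> {
    fn decide(&self, event: &str) -> Verdict {
        let (tx, rx) = mpsc::channel();
        let inner = Arc::clone(&self.inner);
        let owned = event.to_owned();
        // The inner policy runs on its own thread; if it overruns the
        // timeout it keeps running, but its late answer is discarded.
        thread::spawn(move || {
            let _ = tx.send(inner.decide(&owned));
        });
        match rx.recv_timeout(self.timeout) {
            Ok(verdict) => verdict,
            // Timeout, or the policy thread died without answering.
            Err(_) => match self.mode {
                FailMode::Open => Verdict::Allow,
                FailMode::Closed => Verdict::Deny("policy decision timed out".to_string()),
            },
        }
    }
}
```

Because `FailModePolicy` itself implements `PolicyPort`, callers on both legs see an ordinary policy, and the inner implementation contains no timeout logic.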