agate-proxy

The proxy bounded context: an inline reverse proxy that inspects LLM-agent traffic and decides — per event — whether to allow, deny, transform, buffer, or terminate it.

agate-proxy is the data plane. The inspection core is protocol-agnostic: the wire protocol (AG-UI first, an agent ↔ LLM adapter later) enters through an adapter that translates wire events into domain events. See the Threat Model for the full design.

Responsibility

Terminate TLS and accept the AG-UI request (RunAgentInput).
Request leg (preventive): validate, authorize, and size-bound the request before forwarding — reject early so the agent never runs on bad input.
Response leg (streaming): parse the SSE event stream incrementally and, per event, apply a verdict — forwarding, redacting, or terminating.
Buffer tool-call argument fragments between TOOL_CALL_START and TOOL_CALL_END so a verdict sees the complete arguments.
Feed each (event, verdict) to the audit sink, off the forwarding hot path.
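The incremental parsing on the response leg can be illustrated with a minimal sketch. `SseDecoder` and `feed` are hypothetical names, not the crate's codec. The sketch assumes `\n` line endings and text chunks that never split a UTF-8 character, and it keeps only `data:` fields. It shows events being emitted in order as soon as their terminating blank line arrives, however the chunks are split.

```rust
/// Hypothetical incremental SSE decoder (illustration only).
#[derive(Default)]
pub struct SseDecoder {
    /// Bytes received so far that do not yet form a complete event.
    pending: String,
}

impl SseDecoder {
    /// Feed one chunk and return the `data` payloads of every event that
    /// this chunk completed, in arrival order.
    pub fn feed(&mut self, chunk: &str) -> Vec<String> {
        self.pending.push_str(chunk);
        let mut out = Vec::new();
        // An SSE event is terminated by a blank line.
        while let Some(end) = self.pending.find("\n\n") {
            let raw: String = self.pending.drain(..end + 2).collect();
            let data: Vec<&str> = raw
                .lines()
                .filter_map(|line| line.strip_prefix("data:"))
                // A single leading space after the colon is not part of the value.
                .map(|value| value.strip_prefix(' ').unwrap_or(value))
                .collect();
            if !data.is_empty() {
                // Multiple data lines in one event are joined with newlines.
                out.push(data.join("\n"));
            }
        }
        out
    }
}
```

An event split across two chunks stays in `pending` until the second chunk arrives, so the inspector never sees a partial event.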

The inspection state machine

stateDiagram-v2
    [*] --> RequestValidation
    RequestValidation --> Rejected: invalid / unauthorized / oversized
    RequestValidation --> Streaming: forwarded to agent
    Streaming --> Streaming: Allow / Transform (per event)
    Streaming --> Buffering: TOOL_CALL_START
    Buffering --> Buffering: TOOL_CALL_ARGS
    Buffering --> Streaming: TOOL_CALL_END → verdict on full call
    Streaming --> Terminated: Deny / Terminate (RUN_ERROR)
    Streaming --> [*]: RUN_FINISHED
    Rejected --> [*]
    Terminated --> [*]
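The Streaming ↔ Buffering transitions in the diagram can be sketched as a small accumulator. All names here (`WireEvent`, `Step`, `ToolCallBuffer`) are hypothetical and the event shapes are simplified. Passing stray fragments through as single events is an assumption of this sketch, not documented behaviour.

```rust
/// Simplified wire events (hypothetical shapes, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum WireEvent {
    ToolCallStart { name: String },
    ToolCallArgs { delta: String },
    ToolCallEnd,
    Other(String),
}

/// What the inspector should do after seeing one event.
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    /// Inspect this event on its own.
    Single(WireEvent),
    /// More fragments are needed before a verdict is possible.
    Held,
    /// The full tool call is available for a verdict.
    CompleteCall { name: String, arguments: String },
}

#[derive(Default)]
pub struct ToolCallBuffer {
    /// (tool name, arguments accumulated so far) while a call is open.
    current: Option<(String, String)>,
}

impl ToolCallBuffer {
    pub fn push(&mut self, event: WireEvent) -> Step {
        match event {
            WireEvent::ToolCallStart { name } => {
                self.current = Some((name, String::new()));
                Step::Held
            }
            WireEvent::ToolCallArgs { delta } => match self.current.as_mut() {
                Some((_, args)) => {
                    args.push_str(&delta);
                    Step::Held
                }
                // Assumption: a fragment outside a call is inspected as-is.
                None => Step::Single(WireEvent::ToolCallArgs { delta }),
            },
            WireEvent::ToolCallEnd => match self.current.take() {
                Some((name, arguments)) => Step::CompleteCall { name, arguments },
                None => Step::Single(WireEvent::ToolCallEnd),
            },
            other => Step::Single(other),
        }
    }
}
```

Only `Step::CompleteCall` and `Step::Single` reach the policy. `Step::Held` corresponds to the Buffering state, where nothing is forwarded yet.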

The event → verdict seam

Inspection produces, per event (or per buffered logical unit), a verdict:

Verdict	Meaning
`Allow`	forward unchanged
`Deny(reason)`	block; on the response leg, surface as `RUN_ERROR`
`Transform(replacement)`	forward a modified event (e.g. redacted content)
`Buffer`	need more frames before deciding (e.g. mid tool-call)
`Terminate(reason)`	end the run/stream

This single seam is where two contexts plug in via ports: agate-policy computes the verdict (a PolicyPort), and agate-audit records (event, verdict). The proxy depends only on the ports; the concrete policy and audit adapters are injected at the server composition root. The first milestone ships a trivial allow-all policy adapter behind PolicyPort.
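As a rough sketch of this seam, the verdict table maps naturally onto an enum and the port onto a trait. The method name `decide`, the helper `apply`, the `AllowAll` type name, and the `InspectedEvent` variants are assumptions made for illustration. The actual signatures in agate-proxy may differ (for example, they may be async).

```rust
/// Placeholder event type; the real `InspectedEvent` is richer.
#[derive(Debug, Clone, PartialEq)]
pub enum InspectedEvent {
    TextDelta(String),
    ToolCall { name: String, arguments: String },
}

/// The five verdicts from the table above.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Allow,
    Deny(String),
    Transform(InspectedEvent),
    Buffer,
    Terminate(String),
}

/// The port the proxy depends on (hypothetical signature).
pub trait PolicyPort {
    fn decide(&self, event: &InspectedEvent) -> Verdict;
}

/// Trivial allow-all adapter, as in the first milestone.
pub struct AllowAll;

impl PolicyPort for AllowAll {
    fn decide(&self, _event: &InspectedEvent) -> Verdict {
        Verdict::Allow
    }
}

/// Ask the policy, then work out what (if anything) is forwarded.
/// The returned verdict is what an audit sink would record with the event.
pub fn apply(
    policy: &dyn PolicyPort,
    event: InspectedEvent,
) -> (Option<InspectedEvent>, Verdict) {
    let verdict = policy.decide(&event);
    let forwarded = match &verdict {
        Verdict::Allow => Some(event),
        Verdict::Transform(replacement) => Some(replacement.clone()),
        // Nothing from the original event is forwarded in these cases.
        Verdict::Deny(_) | Verdict::Buffer | Verdict::Terminate(_) => None,
    };
    (forwarded, verdict)
}
```

Because `apply` takes `&dyn PolicyPort`, swapping the allow-all adapter for a real policy would not touch the forwarding code.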

Domain language

Session / Run — the inspection aggregate(s).
InspectedEvent — protocol-agnostic event value objects.
Verdict — the decision value object listed above.

Layering

Layer	Contents
`domain`	Pure: inspection aggregates, `InspectedEvent`, `Verdict`. No I/O.
`application`	Use cases and ports: `PolicyPort` (verdict source), an audit sink, the upstream agent client, a `ProxyMetrics` recorder.
`infrastructure`	Adapters: the AG-UI SSE codec (incremental, order-preserving), `RunAgentInput` validation, the HTTP client to the agent.
`presentation`	HTTP/SSE handlers (axum/hyper), TLS termination, request/response wiring.
`setup`	Composition root: `ProxyConfig` (`AGENT_ENDPOINT`, `BIND_ADDR`), assembly.
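A minimal sketch of what the composition root's configuration loading could look like, assuming `AGENT_ENDPOINT` and `BIND_ADDR` are environment variables (the table does not say so explicitly). The field names, `from_lookup`, and `from_env` are hypothetical.

```rust
use std::net::SocketAddr;

#[derive(Debug, Clone, PartialEq)]
pub struct ProxyConfig {
    /// Where runs are forwarded to (kept as a string in this sketch).
    pub agent_endpoint: String,
    /// Address the proxy listens on.
    pub bind_addr: SocketAddr,
}

impl ProxyConfig {
    /// Build the config from any key lookup, so tests need not touch the
    /// process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let agent_endpoint = lookup("AGENT_ENDPOINT").ok_or("AGENT_ENDPOINT is not set")?;
        let bind_addr = lookup("BIND_ADDR")
            .ok_or("BIND_ADDR is not set")?
            .parse::<SocketAddr>()
            .map_err(|e| format!("BIND_ADDR is invalid: {e}"))?;
        Ok(Self { agent_endpoint, bind_addr })
    }

    /// Assumed production path: read from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Failing at startup on a missing or malformed value keeps configuration errors out of the request path.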

Request-leg inspection

Before a run is forwarded, the proxy inspects the incoming RunAgentInput (preventive — the agent never runs on rejected input):

Validation — a body that is not valid RunAgentInput JSON is rejected with 400; its size is already bounded (see Configuration).
Tool authorization — each tool the client offers is judged by the same PolicyPort as the response leg (projected onto an AgentEvent); an offered tool the policy denies rejects the run with 403.
Secret markers — a user message carrying a configured redaction marker is treated as a policy hit and rejected with 403.
SSRF guard — user message text is scanned for URLs; a non-http(s) scheme or a loopback / private / link-local host (covering the cloud metadata address) rejects the run with 403. Best-effort and literal-host only — it does not resolve DNS, so DNS-rebinding is out of scope.

Each rejection is recorded to the audit sink, so request-leg denials sit in the transparency log alongside response-leg decisions.
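The literal-host SSRF check described above can be approximated with the standard library alone. This is a sketch under stated assumptions: `check_url` and the helper names are hypothetical, URL extraction from message text is left out, and the hand-rolled authority parsing does not cover every URL form (for example, integer-encoded IPv4 hosts). Like the guard it illustrates, it never resolves DNS.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_internal_v4(ip: Ipv4Addr) -> bool {
    // Link-local (169.254.0.0/16) covers the cloud metadata address.
    ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_unspecified()
}

fn is_internal_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        || ip.to_ipv4_mapped().map_or(false, is_internal_v4)
}

/// Returns Err with a reason when the URL should reject the run.
pub fn check_url(url: &str) -> Result<(), &'static str> {
    let (scheme, rest) = url.split_once("://").ok_or("not an absolute URL")?;
    if !scheme.eq_ignore_ascii_case("http") && !scheme.eq_ignore_ascii_case("https") {
        return Err("non-http(s) scheme");
    }
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any userinfo ("user:pass@").
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    let host = match host_port.strip_prefix('[') {
        // Bracketed IPv6 literal: take everything up to ']'.
        Some(stripped) => stripped.split(']').next().unwrap_or(""),
        // Otherwise strip an optional ":port".
        None => host_port.split(':').next().unwrap_or(""),
    };
    if host.is_empty() {
        return Err("empty host");
    }
    if host.eq_ignore_ascii_case("localhost") {
        return Err("loopback host");
    }
    match host.parse::<IpAddr>() {
        Ok(IpAddr::V4(ip)) if is_internal_v4(ip) => Err("internal IPv4 address"),
        Ok(IpAddr::V6(ip)) if is_internal_v6(ip) => Err("internal IPv6 address"),
        // Public IP literals and ordinary hostnames pass: no DNS resolution.
        _ => Ok(()),
    }
}
```

In the flow described above, an `Err` here would map to a 403 and an audit record. A hostname that resolves to an internal address passes this check, which matches the stated best-effort scope.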

Observability

Data-plane metrics go through a ProxyMetrics port, never through direct counter! macro calls in the presentation layer. The run handler and the streaming inspector record agate_runs_total, agate_events_inspected_total{outcome}, and agate_upstream_errors_total through the injected port, so inspect_stream takes the port as a parameter and is unit-testable with a fake recorder. The real adapter emits through the metrics facade, which is a no-op until the server installs a Prometheus recorder.
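To make the testability point concrete, here is a sketch of a metrics port with a fake recorder. The trait methods, `FakeMetrics`, and `inspect_stream_sketch` are hypothetical. The "contains SECRET" rule is a toy stand-in for a real policy, and the key format is just a convenient way to label counters in a test.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical shape of the metrics port.
pub trait ProxyMetrics {
    fn run_started(&self);
    fn event_inspected(&self, outcome: &'static str);
    fn upstream_error(&self);
}

/// In-memory recorder for unit tests.
#[derive(Default)]
pub struct FakeMetrics {
    counts: RefCell<HashMap<String, u64>>,
}

impl FakeMetrics {
    fn bump(&self, key: String) {
        *self.counts.borrow_mut().entry(key).or_insert(0) += 1;
    }

    /// Current value of a counter; 0 if it was never incremented.
    pub fn get(&self, key: &str) -> u64 {
        self.counts.borrow().get(key).copied().unwrap_or(0)
    }
}

impl ProxyMetrics for FakeMetrics {
    fn run_started(&self) {
        self.bump("agate_runs_total".to_string());
    }
    fn event_inspected(&self, outcome: &'static str) {
        self.bump(format!("agate_events_inspected_total{{outcome={outcome}}}"));
    }
    fn upstream_error(&self) {
        self.bump("agate_upstream_errors_total".to_string());
    }
}

/// Stand-in for the streaming inspector: it receives the port as a
/// parameter instead of reaching for a global metrics macro.
pub fn inspect_stream_sketch<'a>(
    events: impl IntoIterator<Item = &'a str>,
    metrics: &dyn ProxyMetrics,
) -> Vec<&'a str> {
    let mut forwarded = Vec::new();
    for event in events {
        // Toy rule, for illustration only.
        if event.contains("SECRET") {
            metrics.event_inspected("deny");
        } else {
            metrics.event_inspected("allow");
            forwarded.push(event);
        }
    }
    forwarded
}
```

A test can then assert on counter values without installing any global recorder.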

Fail-open vs fail-closed

A real policy may consult external services and could hang. FailModePolicy decorates the PolicyPort, bounding each decision by [policy].decision_timeout_ms and applying the configured mode when it is exceeded: fail-open forwards the event, fail-closed stops it. The default is closed (safety over availability). On the response leg a fail-closed timeout terminates the stream; on the request leg it rejects the run before forwarding. The decorator wraps any policy by composition, so policy implementations never learn about timeouts — see Configuration.
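The decorator idea can be sketched with standard-library threads and channels. This is an illustration under assumptions, not the crate's implementation: the real FailModePolicy is probably async and would more plausibly use a runtime timer such as `tokio::time::timeout`. The trait signature, the `FailMode` enum, and the `SlowPolicy` test double are hypothetical, and `Verdict` is cut down to two variants.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Allow,
    Terminate(String),
}

/// Hypothetical synchronous port; events are plain strings in this sketch.
pub trait PolicyPort: Send + Sync {
    fn decide(&self, event: &str) -> Verdict;
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FailMode {
    Open,
    Closed,
}

/// Wraps any policy and bounds each decision by a timeout.
pub struct FailModePolicy<P> {
    inner: Arc<P>,
    timeout: Duration,
    mode: FailMode,
}

impl<P: PolicyPort + 'static> FailModePolicy<P> {
    pub fn new(inner: P, decision_timeout_ms: u64, mode: FailMode) -> Self {
        Self {
            inner: Arc::new(inner),
            timeout: Duration::from_millis(decision_timeout_ms),
            mode,
        }
    }
}

impl<P: PolicyPort + 'static> PolicyPort for FailModePolicy<P> {
    fn decide(&self, event: &str) -> Verdict {
        let (tx, rx) = mpsc::channel();
        let inner = Arc::clone(&self.inner);
        let owned = event.to_owned();
        thread::spawn(move || {
            // The send fails if the caller already gave up; that is fine.
            let _ = tx.send(inner.decide(&owned));
        });
        match rx.recv_timeout(self.timeout) {
            Ok(verdict) => verdict,
            // Timeout (or a panicked inner policy): apply the configured mode.
            Err(_) => match self.mode {
                FailMode::Open => Verdict::Allow,
                FailMode::Closed => Verdict::Terminate("policy decision timed out".into()),
            },
        }
    }
}

/// Test double: a policy that sleeps before allowing.
pub struct SlowPolicy(pub Duration);

impl PolicyPort for SlowPolicy {
    fn decide(&self, _event: &str) -> Verdict {
        thread::sleep(self.0);
        Verdict::Allow
    }
}
```

Because the wrapper itself implements the port, it can be injected wherever a plain policy is expected, and the wrapped policy contains no timeout logic. One limitation of this thread-based sketch is that a timed-out decision keeps running in the background until it returns.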