agate-proxy

The proxy bounded context: an inline reverse proxy that inspects LLM-agent traffic and decides — per event — whether to allow, deny, transform, buffer, or terminate it.

agate-proxy is the data plane. The inspection core is protocol-agnostic: the wire protocol (AG-UI first, an agent ↔ LLM adapter later) enters through an adapter that translates wire events into domain events. See the Threat Model for the full design.

Responsibility

  • Terminate TLS and accept the AG-UI request (RunAgentInput).
  • Request leg (preventive): validate, authorize, and size-bound the request before forwarding — reject early so the agent never runs on bad input.
  • Response leg (streaming): parse the SSE event stream incrementally and, per event, apply a verdict — forwarding, redacting, or terminating.
  • Buffer tool-call argument fragments between TOOL_CALL_START and TOOL_CALL_END so a verdict sees the complete arguments.
  • Feed each (event, verdict) to the audit sink, off the forwarding hot path.
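
The incremental parsing on the response leg can be illustrated with a small framer. This is a minimal sketch, not the project's codec: `SseFramer` and `push` are hypothetical names, only `data:` fields are kept, and only LF or CRLF line endings are handled. Bytes are decoded only once a full line has arrived, so a multi-byte character split across two chunks is not corrupted.

```rust
/// Hypothetical incremental SSE framer: bytes arrive in arbitrary chunks,
/// complete event payloads come out in the order they were received.
#[derive(Default)]
pub struct SseFramer {
    /// Bytes of the current, not yet terminated line.
    pending: Vec<u8>,
    /// `data:` lines collected for the event being assembled.
    data: Vec<String>,
}

impl SseFramer {
    /// Feed one network chunk; returns the payloads of every event that
    /// this chunk completed (possibly none).
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        let mut completed = Vec::new();
        for &byte in chunk {
            if byte != b'\n' {
                self.pending.push(byte);
                continue;
            }
            // A full line is available: take it and tolerate CRLF endings.
            let mut raw = std::mem::take(&mut self.pending);
            if raw.last() == Some(&b'\r') {
                raw.pop();
            }
            let line = String::from_utf8_lossy(&raw).into_owned();
            if line.is_empty() {
                // A blank line terminates the event.
                if !self.data.is_empty() {
                    completed.push(self.data.join("\n"));
                    self.data.clear();
                }
            } else if let Some(rest) = line.strip_prefix("data:") {
                // One optional space after the colon is not part of the value.
                let value = rest.strip_prefix(' ').unwrap_or(rest);
                self.data.push(value.to_string());
            }
            // Other fields (event:, id:, retry:, comments) are ignored here.
        }
        completed
    }
}
```

Each returned payload would then be decoded into a domain event and handed to the verdict seam before anything is forwarded.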

The inspection state machine

stateDiagram-v2
    [*] --> RequestValidation
    RequestValidation --> Rejected: invalid / unauthorized / oversized
    RequestValidation --> Streaming: forwarded to agent
    Streaming --> Streaming: Allow / Transform (per event)
    Streaming --> Buffering: TOOL_CALL_START
    Buffering --> Buffering: TOOL_CALL_ARGS
    Buffering --> Streaming: TOOL_CALL_END → verdict on full call
    Streaming --> Terminated: Deny / Terminate (RUN_ERROR)
    Streaming --> [*]: RUN_FINISHED
    Rejected --> [*]
    Terminated --> [*]
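
The transitions drawn above can be sketched as a pure function over the state. The names `State`, `Input`, and `step` are hypothetical; `Finished` stands for the final `[*]` reached through RUN_FINISHED, and leaving the state unchanged on any transition the diagram does not draw is an assumption of this sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum State {
    RequestValidation,
    Streaming,
    /// Tool-call argument fragments accumulated so far.
    Buffering { args: String },
    Rejected,
    Terminated,
    Finished,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Input {
    RequestAccepted,
    RequestRejected,
    ToolCallStart,
    ToolCallArgs(String),
    ToolCallEnd,
    /// Verdict was Allow or Transform.
    EventAllowed,
    /// Verdict was Deny or Terminate.
    EventBlocked,
    RunFinished,
}

/// Returns the next state and, when a tool call just completed, the full
/// argument string that the verdict should be computed on.
pub fn step(state: State, input: Input) -> (State, Option<String>) {
    use Input::*;
    use State::*;
    match (state, input) {
        (RequestValidation, RequestAccepted) => (Streaming, None),
        (RequestValidation, RequestRejected) => (Rejected, None),
        (Streaming, EventAllowed) => (Streaming, None),
        (Streaming, ToolCallStart) => (Buffering { args: String::new() }, None),
        (Buffering { mut args }, ToolCallArgs(fragment)) => {
            args.push_str(&fragment);
            (Buffering { args }, None)
        }
        (Buffering { args }, ToolCallEnd) => (Streaming, Some(args)),
        (Streaming, EventBlocked) => (Terminated, None),
        (Streaming, RunFinished) => (Finished, None),
        // Transitions the diagram does not draw leave the state unchanged.
        (other, _) => (other, None),
    }
}
```

Keeping the function free of I/O matches the "domain is pure" layering: the handler drives it with inputs derived from wire events and verdicts.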

The event → verdict seam

Inspection produces, per event (or per buffered logical unit), a verdict:

| Verdict | Meaning |
| --- | --- |
| Allow | forward unchanged |
| Deny(reason) | block; on the response leg, surface as RUN_ERROR |
| Transform(replacement) | forward a modified event (e.g. redacted content) |
| Buffer | need more frames before deciding (e.g. mid tool-call) |
| Terminate(reason) | end the run/stream |

This single seam is where two contexts plug in via ports: agate-policy computes the verdict (a PolicyPort), and agate-audit records (event, verdict). The proxy depends only on the ports; the concrete policy and audit adapters are injected at the server composition root. The first milestone ships a trivial allow-all policy adapter behind PolicyPort.
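
A minimal sketch of this seam follows. Only the names `Verdict`, `PolicyPort`, and `InspectedEvent` come from this page; the field layout, the method names `decide` and `record`, the `AuditPort` trait, and the `MemoryAudit` fixture are assumptions made for illustration. The real ports are probably asynchronous, and the real audit sink runs off the hot path, whereas this sketch records inline for brevity.

```rust
use std::sync::Mutex;

/// Protocol-agnostic event (the field layout is an assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct InspectedEvent {
    pub kind: String,
    pub payload: String,
}

/// The five verdicts from the table above.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Allow,
    Deny(String),
    Transform(InspectedEvent),
    Buffer,
    Terminate(String),
}

/// Port implemented by the policy context.
pub trait PolicyPort {
    fn decide(&self, event: &InspectedEvent) -> Verdict;
}

/// Port implemented by the audit context (the trait name is hypothetical).
pub trait AuditPort {
    fn record(&self, event: &InspectedEvent, verdict: &Verdict);
}

/// Trivial allow-all adapter, as described for the first milestone.
pub struct AllowAll;

impl PolicyPort for AllowAll {
    fn decide(&self, _event: &InspectedEvent) -> Verdict {
        Verdict::Allow
    }
}

/// In-memory audit sink, useful only for tests.
pub struct MemoryAudit(pub Mutex<Vec<(InspectedEvent, Verdict)>>);

impl AuditPort for MemoryAudit {
    fn record(&self, event: &InspectedEvent, verdict: &Verdict) {
        self.0.lock().unwrap().push((event.clone(), verdict.clone()));
    }
}

/// The seam itself: depends only on the two ports, never on concrete adapters.
pub fn inspect(policy: &dyn PolicyPort, audit: &dyn AuditPort, event: &InspectedEvent) -> Verdict {
    let verdict = policy.decide(event);
    audit.record(event, &verdict);
    verdict
}
```

Because `inspect` takes trait objects, the composition root can swap the allow-all adapter for a real policy without touching the proxy code.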

Domain language

  • Session / Run — the inspection aggregate(s).
  • InspectedEvent — protocol-agnostic event value objects.
  • Verdict — the decision value object listed above.

Layering

| Layer | Contents |
| --- | --- |
| domain | Pure: inspection aggregates, InspectedEvent, Verdict. No I/O. |
| application | Use cases and ports: PolicyPort (verdict source), an audit sink, the upstream agent client, a ProxyMetrics recorder. |
| infrastructure | Adapters: the AG-UI SSE codec (incremental, order-preserving), RunAgentInput validation, the HTTP client to the agent. |
| presentation | HTTP/SSE handlers (axum/hyper), TLS termination, request/response wiring. |
| setup | Composition root: ProxyConfig (AGENT_ENDPOINT, BIND_ADDR), assembly. |
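
As an illustration of the setup layer, the sketch below loads a `ProxyConfig` from the two variables named above. The field names, the `ConfigError` type, the `from_lookup` helper, and treating both variables as required are assumptions; the actual crate may apply defaults or parse the endpoint into a URL type.

```rust
use std::net::SocketAddr;

#[derive(Debug, PartialEq)]
pub struct ProxyConfig {
    /// Where validated runs are forwarded (AGENT_ENDPOINT).
    pub agent_endpoint: String,
    /// Address the proxy listens on (BIND_ADDR).
    pub bind_addr: SocketAddr,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    Invalid(&'static str),
}

impl ProxyConfig {
    /// Build the config from any key lookup, so tests need not touch the
    /// process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, ConfigError> {
        let agent_endpoint =
            lookup("AGENT_ENDPOINT").ok_or(ConfigError::Missing("AGENT_ENDPOINT"))?;
        let bind_addr = lookup("BIND_ADDR")
            .ok_or(ConfigError::Missing("BIND_ADDR"))?
            .parse::<SocketAddr>()
            .map_err(|_| ConfigError::Invalid("BIND_ADDR"))?;
        Ok(Self { agent_endpoint, bind_addr })
    }

    /// Production entry point: read from the process environment.
    pub fn from_env() -> Result<Self, ConfigError> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

The composition root would call `ProxyConfig::from_env()` once, then construct the adapters and inject them into the handlers.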

Request-leg inspection

Before a run is forwarded, the proxy inspects the incoming RunAgentInput (preventive — the agent never runs on rejected input):

  • Validation — a body that is not valid RunAgentInput JSON is rejected with 400; its size is already bounded (see Configuration).
  • Tool authorization — each tool the client offers is judged by the same PolicyPort as the response leg (projected onto an AgentEvent); an offered tool the policy denies rejects the run with 403.
  • Secret markers — a user message carrying a configured redaction marker is treated as a policy hit and rejected with 403.
  • SSRF guard — user message text is scanned for URLs; a non-http(s) scheme or a loopback / private / link-local host (covering the cloud metadata address) rejects the run with 403. Best-effort and literal-host only — it does not resolve DNS, so DNS-rebinding is out of scope.

Each rejection is recorded to the audit sink, so request-leg denials sit in the transparency log alongside response-leg decisions.
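
The SSRF guard described above can be approximated with the standard library alone. This is a best-effort sketch under stated assumptions: the function names are hypothetical, URLs are found by splitting on whitespace, the name `localhost` is treated as loopback, and IPv4-mapped IPv6 literals are not unwrapped. Like the guard it illustrates, it checks literal hosts only and never resolves DNS.

```rust
use std::net::IpAddr;

/// True when a literal host is loopback, private, or link-local.
fn host_is_forbidden(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true; // assumption: treat the conventional loopback name as loopback
    }
    match host.parse::<IpAddr>() {
        // 169.254.0.0/16 is link-local, which covers 169.254.169.254.
        Ok(IpAddr::V4(ip)) => ip.is_loopback() || ip.is_private() || ip.is_link_local(),
        Ok(IpAddr::V6(ip)) => {
            let first = ip.segments()[0];
            ip.is_loopback()
                || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
                || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local
        }
        // A DNS name: this sketch does not resolve it.
        Err(_) => false,
    }
}

/// Drop the port (and IPv6 brackets) from a `host[:port]` string.
fn strip_port(host_port: &str) -> &str {
    if let Some(rest) = host_port.strip_prefix('[') {
        return rest.split(']').next().unwrap_or(rest);
    }
    host_port
        .split(':')
        .next()
        .unwrap_or(host_port)
        // Trailing sentence punctuation is not part of the host.
        .trim_end_matches(|c: char| !c.is_ascii_alphanumeric())
}

/// Returns the first URL-like token that should reject the run, if any.
pub fn first_forbidden_url(text: &str) -> Option<&str> {
    text.split_whitespace().find(|token| {
        let Some((scheme, rest)) = token.split_once("://") else {
            return false; // not a URL
        };
        if !scheme.eq_ignore_ascii_case("http") && !scheme.eq_ignore_ascii_case("https") {
            return true; // non-http(s) scheme
        }
        let authority = rest
            .split(|c: char| matches!(c, '/' | '?' | '#'))
            .next()
            .unwrap_or("");
        // Skip any `user:password@` prefix.
        let host_port = authority.rsplit('@').next().unwrap_or(authority);
        host_is_forbidden(strip_port(host_port))
    })
}
```

A `Some(url)` result would map to the 403 rejection and an audit record; `None` lets the request continue to the next check.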

Observability

Data-plane metrics go through a ProxyMetrics port, never through a direct counter! call in the presentation layer. The run handler and the streaming inspector record agate_runs_total, agate_events_inspected_total{outcome}, and agate_upstream_errors_total through the injected port, so inspect_stream takes the port as a parameter and is unit-testable with a fake recorder. The real adapter emits through the metrics facade, which is a no-op until the server installs a Prometheus recorder.
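
The testability argument can be made concrete with a fake recorder. In this sketch the trait method names, the `Outcome` label values, and the simplified `inspect_stream` signature are assumptions; only the port name, the function name, and the three metric names come from this page.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Label values for agate_events_inspected_total{outcome} (assumed set).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Outcome {
    Allowed,
    Denied,
}

/// The port: application code records through it, never through a macro.
pub trait ProxyMetrics {
    fn run_started(&self); // agate_runs_total
    fn event_inspected(&self, outcome: Outcome); // agate_events_inspected_total{outcome}
    fn upstream_error(&self); // agate_upstream_errors_total
}

/// Fake recorder for unit tests: counts calls in memory.
#[derive(Default)]
pub struct FakeMetrics {
    pub runs: Mutex<u64>,
    pub events: Mutex<HashMap<Outcome, u64>>,
    pub upstream_errors: Mutex<u64>,
}

impl ProxyMetrics for FakeMetrics {
    fn run_started(&self) {
        *self.runs.lock().unwrap() += 1;
    }
    fn event_inspected(&self, outcome: Outcome) {
        *self.events.lock().unwrap().entry(outcome).or_insert(0) += 1;
    }
    fn upstream_error(&self) {
        *self.upstream_errors.lock().unwrap() += 1;
    }
}

/// Simplified stand-in for the streaming inspector: forwards events until
/// one contains the marker "SECRET", then stops. Returns how many events
/// were forwarded.
pub fn inspect_stream(events: &[&str], metrics: &dyn ProxyMetrics) -> usize {
    let mut forwarded = 0;
    for event in events {
        if event.contains("SECRET") {
            metrics.event_inspected(Outcome::Denied);
            break;
        }
        metrics.event_inspected(Outcome::Allowed);
        forwarded += 1;
    }
    forwarded
}
```

A test passes `&FakeMetrics::default()` and asserts on the counters afterwards, with no global recorder installed.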

Fail-open vs fail-closed

A real policy may consult external services and could hang. FailModePolicy decorates the PolicyPort, bounding each decision by [policy].decision_timeout_ms and applying the configured mode when that bound is exceeded: fail-open forwards the event, fail-closed stops it. The default is fail-closed (safety over availability). On the response leg a fail-closed timeout terminates the stream; on the request leg it rejects the run before forwarding. The decorator wraps any policy by composition, so policy implementations never need to know about timeouts. See Configuration for the settings.
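
The decorator can be sketched with standard-library threads. This is an illustration under assumptions, not the project's implementation, which more likely wraps an async decision in a runtime timeout: the `decide` signature, the reduced `Verdict` type, and the `SleepyPolicy` fixture are hypothetical. A policy that panics closes the channel and is handled like a timeout here.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Reduced verdict type, enough to show both fail modes.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Allow,
    Terminate(String),
}

pub trait PolicyPort: Send + Sync + 'static {
    fn decide(&self, event: &str) -> Verdict;
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FailMode {
    Open,
    Closed,
}

impl Default for FailMode {
    /// Safety over availability.
    fn default() -> Self {
        FailMode::Closed
    }
}

/// Wraps any policy and bounds each decision by a timeout.
pub struct FailModePolicy<P> {
    inner: Arc<P>,
    timeout: Duration,
    mode: FailMode,
}

impl<P: PolicyPort> FailModePolicy<P> {
    pub fn new(inner: P, decision_timeout_ms: u64, mode: FailMode) -> Self {
        Self {
            inner: Arc::new(inner),
            timeout: Duration::from_millis(decision_timeout_ms),
            mode,
        }
    }
}

impl<P: PolicyPort> PolicyPort for FailModePolicy<P> {
    fn decide(&self, event: &str) -> Verdict {
        let (tx, rx) = mpsc::channel();
        let inner = Arc::clone(&self.inner);
        let event = event.to_owned();
        // Run the wrapped policy on a worker thread so a hang cannot block us.
        thread::spawn(move || {
            let _ = tx.send(inner.decide(&event));
        });
        match rx.recv_timeout(self.timeout) {
            Ok(verdict) => verdict,
            // Timed out, or the worker died before answering.
            Err(_) => match self.mode {
                FailMode::Open => Verdict::Allow,
                FailMode::Closed => Verdict::Terminate("policy decision timed out".into()),
            },
        }
    }
}

/// Test fixture: a policy that sleeps before allowing.
pub struct SleepyPolicy(pub Duration);

impl PolicyPort for SleepyPolicy {
    fn decide(&self, _event: &str) -> Verdict {
        thread::sleep(self.0);
        Verdict::Allow
    }
}
```

Because `FailModePolicy<P>` itself implements `PolicyPort`, callers cannot tell a decorated policy from a bare one, which is what keeps timeout handling out of the policy implementations.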