Introduction

A terminal sidecar for more reliable coding agents

Agent Sidecar sits beside Codex, Claude Code, or another coding agent. It receives hook events, evaluates the run, returns runtime decisions, and gives teams a replayable account of what happened.

Runbook memory

Repo map cache

Trajectory watchdogs

Tool-cost guardrails

Prompt shaping and compaction

Ensemble steering

Install

Install once, run from your terminal

terminal

python -m pip install agent-sidecar
agent-sidecar daemon start
agent-sidecar hook install --provider codex
agent-sidecar hook install --provider claude-code

The terminal process owns hook ingestion and decision responses. The dashboard is a replay and control surface, not the runtime authority.

How it works

Observe, optimize, evaluate, respond, measure

Observe

Hook events from Codex, Claude Code, or another agent are captured as a structured run timeline

Optimize

Vague prompts are rewritten into compact task contracts before the agent spends more steps

Evaluate

The sidecar evaluates monitor signals, tool-cost guardrails, repo map entries, and approved lessons

Respond

The agent receives one compact action: allow, warn, block, require continuation, require validation, or inject concise context

Measure

Costs, tokens, decisions, lessons, and outcomes are saved for replay and baseline comparison

Architecture

Local sidecar, runtime decisions

runtime pipeline

Coding agent

Codex, Claude Code, or another agent emits hook events

Agent Sidecar

CLI adapter and local daemon evaluate the run

Runtime decision

allow, warn, block, require continuation, require validation, or inject context

Sidecar internals

Evaluate before the agent continues

local daemon

Ingest

Normalize hook events and update run state

Remember

Retrieve file memory and approved lessons

Steer

Apply prompt optimization, monitor signals, and low-cost model ensemble guidance

Record

Store traces, decisions, cost signals, and outcomes
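
The four daemon stages above can be pictured as a small pipeline. The sketch below is illustrative only: `HookEvent`, `RunState`, `LESSONS`, and the stage functions are hypothetical names, not Agent Sidecar's actual API, and the steering rule (warn after three consecutive failures of the same tool) is an invented stand-in for the real monitor signals.

```python
from dataclasses import dataclass, field


@dataclass
class HookEvent:
    """Hypothetical normalized hook event."""
    provider: str
    tool: str
    ok: bool
    path: str = ""


@dataclass
class RunState:
    """Hypothetical per-run state held by the daemon."""
    events: list = field(default_factory=list)
    trace: list = field(default_factory=list)


# Hypothetical store of approved lessons, keyed by file path.
LESSONS = {"db/migrate.py": "Run migrations against the test database first."}


def ingest(raw: dict, state: RunState) -> HookEvent:
    """Ingest: normalize a raw hook payload and update run state."""
    event = HookEvent(
        provider=raw.get("provider", "unknown"),
        tool=raw.get("tool", "unknown"),
        ok=bool(raw.get("ok", True)),
        path=raw.get("path", ""),
    )
    state.events.append(event)
    return event


def remember(event: HookEvent) -> list:
    """Remember: retrieve approved lessons for the touched file."""
    lesson = LESSONS.get(event.path)
    return [lesson] if lesson else []


def steer(state: RunState, lessons: list) -> dict:
    """Steer: choose exactly one decision (placeholder rules)."""
    last = state.events[-3:]
    same_tool = len({e.tool for e in last}) == 1
    if len(last) == 3 and same_tool and all(not e.ok for e in last):
        return {"action": "warn", "context": "Same tool failed three times in a row."}
    if lessons:
        return {"action": "inject context", "context": lessons[0]}
    return {"action": "allow", "context": ""}


def record(state: RunState, event: HookEvent, decision: dict) -> None:
    """Record: append the event and its decision to the run trace."""
    state.trace.append((event, decision))


def handle(raw: dict, state: RunState) -> dict:
    """Run one hook event through all four stages."""
    event = ingest(raw, state)
    decision = steer(state, remember(event))
    record(state, event, decision)
    return decision
```

Calling `handle` once per hook event keeps the trace and the decision in step, so a replay can show each event next to the response it triggered.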

Dashboard

Replay UI and control plane

Daemon API

Reads and writes through the sidecar

Local data

Traces, policies, lessons, analytics

Decisions

The agent gets one explicit instruction back

allow

Let a healthy step continue without extra context

warn

Surface a risk while allowing the run to continue

block

Stop dangerous commands or known waste loops

require continuation

Prevent premature final answers when more work is needed

require validation

Ask the agent to run targeted checks before finalizing

inject context

Add a short approved lesson, repo-map hint, or task contract
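
Because the agent gets a single instruction back, several checks that fire on the same step have to be collapsed into one decision. A minimal sketch of how that could work follows. The `Decision` enum mirrors the six decisions listed above, but the `PRECEDENCE` order and the `resolve` helper are assumptions made for illustration, not documented behavior.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"
    REQUIRE_CONTINUATION = "require continuation"
    REQUIRE_VALIDATION = "require validation"
    INJECT_CONTEXT = "inject context"


# Assumed precedence, strictest first. The real ordering is not stated here.
PRECEDENCE = [
    Decision.BLOCK,
    Decision.REQUIRE_VALIDATION,
    Decision.REQUIRE_CONTINUATION,
    Decision.INJECT_CONTEXT,
    Decision.WARN,
    Decision.ALLOW,
]


def resolve(candidates):
    """Collapse the decisions proposed by several checks into one."""
    proposed = set(candidates)
    for decision in PRECEDENCE:
        if decision in proposed:
            return decision
    return Decision.ALLOW  # no check proposed anything


# Example: a cost guardrail warns while a command monitor blocks.
final = resolve([Decision.WARN, Decision.BLOCK])
print(final.value)
```

Under this assumed ordering, a block always wins over softer responses, and an empty list of proposals falls through to allow.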

Cost math

Savings are calculated from comparable runs

Each step records input tokens, output tokens, cache reads, cache writes, model id, and provider pricing. A run's cost is the sum of its step costs. Savings are reported against a matched unguarded baseline or an explicit A/B control arm.

sidecar_savings = baseline_run_cost - guarded_run_cost
total_cost_saved = sum(sidecar_savings across matched runs)
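
The two formulas above can be worked through in a few lines of Python. This is a sketch under stated assumptions: the `Step` fields follow the recorded quantities listed earlier, the four token counts are treated as disjoint, and the `PRICING` table holds placeholder numbers (USD per million tokens) for a made-up model id, not real provider pricing.

```python
from dataclasses import dataclass


@dataclass
class Step:
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_write_tokens: int
    model_id: str


# Placeholder prices in USD per million tokens (illustrative only).
PRICING = {
    "example-model": {
        "input": 2.0,
        "output": 8.0,
        "cache_read": 0.5,
        "cache_write": 2.5,
    },
}


def step_cost(step: Step) -> float:
    """Cost of one step from its token counts and the model's prices."""
    price = PRICING[step.model_id]
    return (
        step.input_tokens * price["input"]
        + step.output_tokens * price["output"]
        + step.cache_read_tokens * price["cache_read"]
        + step.cache_write_tokens * price["cache_write"]
    ) / 1_000_000


def run_cost(steps) -> float:
    """A run's cost is the sum of its step costs."""
    return sum(step_cost(s) for s in steps)


def total_cost_saved(matched_runs) -> float:
    """Sum baseline minus guarded cost over (baseline, guarded) pairs."""
    return sum(run_cost(b) - run_cost(g) for b, g in matched_runs)


baseline = [Step(1_000_000, 0, 0, 0, "example-model")]
guarded = [Step(500_000, 0, 0, 0, "example-model")]
print(total_cost_saved([(baseline, guarded)]))
```

A negative result is possible and meaningful: it indicates the guarded run cost more than its matched baseline.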

Reporting note

If there is no matched baseline, the dashboard labels the number as an estimate. If controlled arms exist, the report can show per-arm deltas for cost, tokens, validation rate, and successful completion.
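
The reporting rules can be sketched as two small helpers. The function names, the run dictionary keys, and the "measured" label for the matched case are hypothetical choices for this example. Only the "estimate" label for the unmatched case and the four compared metrics come from the note above.

```python
from statistics import mean


def label_savings(amount: float, has_matched_baseline: bool) -> str:
    """Mark a savings figure as an estimate when no baseline was matched."""
    kind = "measured" if has_matched_baseline else "estimate"
    return f"${amount:.2f} saved ({kind})"


def summarize(runs) -> dict:
    """Average the four reported metrics over one arm (needs at least one run)."""
    return {
        "cost": mean(r["cost"] for r in runs),
        "tokens": mean(r["tokens"] for r in runs),
        "validation_rate": mean(1.0 if r["validated"] else 0.0 for r in runs),
        "completion_rate": mean(1.0 if r["completed"] else 0.0 for r in runs),
    }


def per_arm_deltas(control, treatment) -> dict:
    """Treatment minus control for each metric."""
    c, t = summarize(control), summarize(treatment)
    return {key: t[key] - c[key] for key in c}


control = [
    {"cost": 1.0, "tokens": 1000, "validated": False, "completed": True},
    {"cost": 3.0, "tokens": 3000, "validated": True, "completed": False},
]
treatment = [
    {"cost": 1.0, "tokens": 1000, "validated": True, "completed": True},
    {"cost": 2.0, "tokens": 2000, "validated": True, "completed": True},
]
print(label_savings(1.5, has_matched_baseline=False))
print(per_arm_deltas(control, treatment))
```

Reading the deltas as treatment minus control means negative cost and token deltas are improvements, while positive validation and completion deltas are improvements.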