AgentGuard

Stop runaway Python agents before they burn money.

AgentGuard47 is a zero-dependency runtime control SDK for Python agents. Add hard budget caps, loop detection, retry limits, timeouts, local traces, and incident reports without changing agent frameworks or sending data anywhere by default.

Use it when an agent can call tools, retry work, review code, or run long enough to create surprise spend.

⭐ Star this repo if AgentGuard stops one runaway run for you. It is how other builders find it.

Install

As a Python package

pip install agentguard47
agentguard

The bare agentguard command prints a 60-second local tour. If the script is not on PATH, use python -m agentguard instead. Both run the same CLI.

As a skill (Codex, Claude Code, Cursor, Cline, and more)

npx skills add bmdhodl/agent47

Or with GitHub CLI:

gh skill install bmdhodl/agent47 agentguard

Why AgentGuard

Most agent tooling tells you what happened after the run. AgentGuard stops the bad run while it is happening.

AgentGuard is an in-process agentic-loop guard, not an LLM cost router. It runs inside the agent's process, sees the call graph, and raises exceptions that kill the run before the next bad call lands. Routers and gateways like Manifest or Vercel AI Gateway sit at the network layer and shape egress traffic. The layers are complementary; see the competitive notes for when each fits.

The headline value is the cross-call, cross-provider budget envelope: one ceiling that holds across every tool call, every retry, and every provider in the run. Single-call output caps are table stakes: as of 2026-06-02, Anthropic ships per-tool max_tokens on the advisor tool (see release notes). A single-call cap stops one oversized response. It does not stop a loop that makes 200 small calls, a retry storm across providers, or a run that mixes OpenAI and Anthropic and blows the combined budget. AgentGuard handles that envelope in-process and raises the exception that ends the run.
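To make the envelope idea concrete, here is a minimal stdlib-only sketch. It is not AgentGuard's implementation: `RunBudget`, `RunBudgetExceeded`, and the cent values are hypothetical stand-ins chosen for illustration. Every call stays under a single-call cap, yet the run-level ceiling still ends the loop.

```python
PER_CALL_CAP_CENTS = 10  # hypothetical single-call cap
RUN_CAP_CENTS = 500      # hypothetical run-level cap ($5.00)


class RunBudgetExceeded(RuntimeError):
    """Stand-in exception; the real SDK raises BudgetExceeded."""


class RunBudget:
    """Hypothetical stand-in: one cumulative cap across calls and providers."""

    def __init__(self, max_cents):
        self.max_cents = max_cents
        self.spent_cents = 0
        self.by_provider = {}

    def consume(self, provider, cents):
        # Accumulate first, then compare against the run-level ceiling.
        self.spent_cents += cents
        self.by_provider[provider] = self.by_provider.get(provider, 0) + cents
        if self.spent_cents > self.max_cents:
            raise RunBudgetExceeded(
                f"spent {self.spent_cents} cents, cap is {self.max_cents}"
            )


def simulate(budget, n_calls=200, cents_per_call=3):
    """Make many small calls, alternating providers; return calls completed."""
    completed = 0
    try:
        for i in range(n_calls):
            provider = "openai" if i % 2 == 0 else "anthropic"
            # Each call passes a single-call cap on its own.
            assert cents_per_call <= PER_CALL_CAP_CENTS
            budget.consume(provider, cents_per_call)
            completed += 1
    except RunBudgetExceeded:
        pass  # the run ends here instead of finishing all 200 calls
    return completed


budget = RunBudget(RUN_CAP_CENTS)
completed = simulate(budget)
```

Integer cents keep the arithmetic exact in this sketch. The point is the shape of the check: the ceiling is tracked per run, not per call or per provider.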

| Problem | What AgentGuard does |
| --- | --- |
| Agent loops on the same tool | Raises `LoopDetected` |
| Flaky tool retries forever | Raises `RetryLimitExceeded` |
| Run spends too much across many calls | Raises `BudgetExceeded` |
| Run mixes OpenAI and Anthropic and blows the combined cap | Raises `BudgetExceeded` |
| Run hangs | Raises `TimeoutExceeded` |
| Team needs proof | Writes local JSONL traces and incident reports |
| Dashboard comes later | `HttpSink` mirrors events only when you opt in |

| Scope of cap | Anthropic per-tool `max_tokens` | AgentGuard (`BudgetGuard`, `LoopGuard`, `RetryGuard`, `RateLimitGuard`, `TimeoutGuard`) |
| --- | --- | --- |
| One tool call, output tokens only | Yes | Yes |
| Many calls in one run | No | Yes |
| Mixed providers (OpenAI + Anthropic) in one run | No | Yes |
| Loop detection across repeated calls | No | Yes |
| Retry storm cap | No | Yes |
| Calls per minute | No | Yes |
| Wall-clock timeout | No | Yes |
| In-process exception that ends the run | No | Yes |

Design constraints:

zero runtime dependencies
MIT licensed
local-first by default
no API key required for local proof
no network calls unless you configure HttpSink
guards raise exceptions inside the running process

Scope

AgentGuard's scope is the in-process runtime envelope: budget, token, rate, retry, loop, and timeout caps that fire inside the agent's Python process and raise exceptions that end the run.

OS-level containment is out of scope: process sandboxes, VMs, filesystem boundaries, and egress controls live one layer down from AgentGuard. For that layer, see Anthropic's "How we contain Claude across products" (2026-05-30) as the canonical reference. The layers are complementary: containment bounds what the process can touch; AgentGuard bounds what the agent loop inside that process can spend.

Real Incidents AgentGuard Prevents

PocketOS — agent deleted prod DB and backups in 9 seconds (May 2026)

A Cursor agent ran a destructive sequence against PocketOS production and wiped the live database. Backups went with it.

Reported root cause from the team's postmortem:

one API key had write + delete on both prod and backups
backups lived in the same Railway environment as prod
no confirmation step before destructive actions
the agent was given enough rope to chain the calls in one turn

Source: r/devops thread

The "AI did it" framing buries the actual lesson: the blast radius was infra, not the model. AgentGuard does not replace least-privilege creds or isolated backups. It does kill the run before a loop, retry storm, or runaway turn finishes the job.

BudgetGuard, LoopGuard, and RateLimitGuard wired around the agent loop cap how much it can do in one session:

from agentguard import BudgetGuard, LoopGuard, RateLimitGuard, Tracer

budget = BudgetGuard(max_calls=20, max_cost_usd=1.00)
loop = LoopGuard(max_repeats=2)
rate = RateLimitGuard(max_calls_per_minute=10)
tracer = Tracer(service="cursor-agent", guards=[loop, rate])

with tracer.trace("agent.run"):
    budget.consume(calls=1)
    # tool call here — guards raise on overrun

A 9-second sequence of destructive calls trips LoopGuard or RateLimitGuard long before it finishes. The exception kills the run in-process. Pair this with scoped credentials and out-of-environment backups for the rest of the blast radius.
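For intuition on why a short burst trips a calls-per-minute cap, here is a stdlib-only sliding-window sketch. It is an illustration under stated assumptions, not the SDK's RateLimitGuard: `SlidingWindowLimiter`, `RateLimitTripped`, the injectable clock, and the 0.3-second call spacing are all hypothetical, and the real guard may use a different algorithm.

```python
import time
from collections import deque


class RateLimitTripped(RuntimeError):
    """Stand-in exception for this sketch only."""


class SlidingWindowLimiter:
    """Hypothetical stand-in: allow at most N calls in any 60-second window."""

    def __init__(self, max_calls_per_minute, clock=time.monotonic):
        self.max_calls = max_calls_per_minute
        self.clock = clock  # injectable so the sketch can run without sleeping
        self.stamps = deque()

    def check(self):
        now = self.clock()
        # Forget calls that fell out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            raise RateLimitTripped(f"{self.max_calls} calls already in window")
        self.stamps.append(now)


def simulate_burst(n_calls=30, spacing_s=0.3, limit=10):
    """Fire n_calls over about 9 seconds; return (calls allowed, trip time)."""
    now = [0.0]
    limiter = SlidingWindowLimiter(limit, clock=lambda: now[0])
    allowed = 0
    for i in range(n_calls):
        now[0] = i * spacing_s
        try:
            limiter.check()
        except RateLimitTripped:
            return allowed, now[0]
        allowed += 1
    return allowed, None


allowed, tripped_at = simulate_burst()
```

With these made-up numbers, the eleventh call is refused about three seconds in, so only a third of the burst gets through.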

Microsoft — engineers told to ease off Claude Code over inference cost (May 2026)

Microsoft engineering management reportedly asked teams to reduce Claude Code usage after monthly inference bills exceeded budget. If Microsoft cannot absorb coding-agent inference cost without a memo, runaway agent spend is no longer a solo-founder problem.

Source: TheNextWeb

A memo asks engineers to self-throttle. A BudgetGuard makes the cap a config value enforced inside the process:

from agentguard import BudgetGuard

BudgetGuard(max_cost_usd=5.00, max_calls=50, warn_at_pct=0.8)

The guard raises BudgetExceeded before the run blows the cap. Same conversation, one config line instead of a memo.

Local Proof in 60 Seconds

pip install agentguard47
agentguard doctor
agentguard demo
agentguard quickstart --framework raw --write
python agentguard_raw_quickstart.py
agentguard report .agentguard/traces.jsonl

This stays fully local. No API key, dashboard, or network call is required after installation.

What you should see:

doctor verifies the installed package and writes a local trace.
demo visibly trips budget, loop, and retry guards offline.
quickstart --write creates agentguard_raw_quickstart.py.
The generated file exits cleanly after catching simulated budget and loop stops.
report shows local trace counts, cost, savings, and guard events.


Show Your Repo Is Guarded

Once AgentGuard stops a runaway run for you, add the badge to your README so other builders find it:

agentguard badge

[![Guarded by AgentGuard](https://img.shields.io/badge/guarded%20by-AgentGuard-3b82f6)](https://github.com/bmdhodl/agent47)

agentguard badge --format rst and --format html print the same badge for other doc formats.

Copy-Paste Repo Setup

Use this when you want a coding agent or teammate to add AgentGuard safely:

pip install agentguard47
agentguard doctor
agentguard quickstart --framework raw --write
python agentguard_raw_quickstart.py
agentguard report .agentguard/traces.jsonl

Optional shared local defaults, saved as .agentguard.json in the repo root:

{
  "profile": "coding-agent",
  "service": "my-agent",
  "trace_file": ".agentguard/traces.jsonl",
  "budget_usd": 5.0
}

Keep the first PR local-only. Add hosted ingest later only when retained incidents, alerts, or team visibility matter.

Quickstart: Guard One Agent Run

from agentguard import BudgetGuard, JsonlFileSink, LoopGuard, Tracer

budget = BudgetGuard(max_cost_usd=5.00, max_calls=50, warn_at_pct=0.8)
loop = LoopGuard(max_repeats=3)
tracer = Tracer(
    sink=JsonlFileSink(".agentguard/traces.jsonl"),
    service="support-agent",
    guards=[loop],
)

with tracer.trace("agent.run") as span:
    budget.consume(calls=1, cost_usd=0.02)
    loop.check("search", {"query": "refund policy"})
    span.event("tool.call", data={"tool": "search", "query": "refund policy"})
    # Call your agent or tool here.

Inspect the local proof:

agentguard report .agentguard/traces.jsonl
agentguard incident .agentguard/traces.jsonl
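Because the trace is plain JSONL, you can also inspect it with a few lines of stdlib Python. This is a sketch, not part of the CLI: `tally_jsonl` is a hypothetical helper, and the `name` field and sample records below are made up for the demo, so check a real trace file for the actual schema before relying on any key.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def tally_jsonl(path, key):
    """Count records in a JSONL file by the value stored under `key`."""
    counts = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)  # assumes one JSON object per line
            counts[str(record.get(key, "<missing>"))] += 1
    return counts


# Demo on synthetic records; these are not real AgentGuard events.
sample = [{"name": "tool.call"}, {"name": "tool.call"}, {"name": "demo.stop"}]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.jsonl"
    sample_path.write_text(
        "\n".join(json.dumps(r) for r in sample) + "\n", encoding="utf-8"
    )
    demo_counts = tally_jsonl(sample_path, "name")
```

Point `tally_jsonl` at `.agentguard/traces.jsonl` with whichever key your trace records actually carry.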

Auto-Patch Provider SDKs

If you already call OpenAI or Anthropic directly, patch once and keep using the provider normally:

from agentguard import BudgetGuard, Tracer, patch_openai

budget = BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8)
tracer = Tracer(service="support-agent")
patch_openai(tracer, budget_guard=budget)

# OpenAI chat completions are now traced and budget-enforced.

When accumulated cost crosses the hard limit, BudgetExceeded is raised and the agent stops.

Guards

| Guard | Stops | Example |
| --- | --- | --- |
| `BudgetGuard` | dollar, token, or call overruns | `BudgetGuard(max_cost_usd=5.00)` |
| `LoopGuard` | exact repeated tool calls | `LoopGuard(max_repeats=3)` |
| `FuzzyLoopGuard` | similar calls and A-B-A-B loops | `FuzzyLoopGuard(max_tool_repeats=5)` |
| `RetryGuard` | retry storms on the same tool | `RetryGuard(max_retries=3)` |
| `TimeoutGuard` | long-running jobs | `TimeoutGuard(max_seconds=300)` |
| `RateLimitGuard` | calls per minute | `RateLimitGuard(max_calls_per_minute=60)` |
| `BudgetAwareEscalation` | hard turns that need a stronger model | `BudgetAwareEscalation(...)` |

Guards are static runtime checks. They do not ask another model whether a run is safe. They raise exceptions.
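As an illustration of what a static check looks like, here is a stdlib-only sketch of exact-repeat detection. It is not the SDK's LoopGuard: `ExactRepeatCheck`, `LoopTripped`, the window size, and the choice to raise on the `max_repeats`-th identical call are assumptions for this example, and the real guard's threshold semantics may differ.

```python
import json
from collections import deque


class LoopTripped(RuntimeError):
    """Stand-in exception; the real SDK raises LoopDetected."""


class ExactRepeatCheck:
    """Hypothetical stand-in: flag identical tool calls in a recent window."""

    def __init__(self, max_repeats, window=6):
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=window)  # older calls fall off the left

    def check(self, tool, args):
        # Canonical JSON so argument key order does not change the signature.
        signature = (tool, json.dumps(args, sort_keys=True))
        self.recent.append(signature)
        if self.recent.count(signature) >= self.max_repeats:
            raise LoopTripped(f"{tool} repeated {self.max_repeats} times")


guard = ExactRepeatCheck(max_repeats=3)
guard.check("search", {"query": "refund policy"})
guard.check("search", {"query": "refund policy"})
try:
    guard.check("search", {"query": "refund policy"})
    tripped = False
except LoopTripped:
    tripped = True

# Distinct arguments are distinct calls, so this sequence does not trip.
varied = ExactRepeatCheck(max_repeats=3)
for q in ("refund policy", "shipping times", "warranty"):
    varied.check("search", {"query": q})
varied_ok = True
```

The decision is a counter and a comparison: no model is consulted, and the same inputs always produce the same outcome.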

Examples

All examples are local-first. No API key is required unless the example says so.

| Example | What it proves |
| --- | --- |
| `examples/try_it_now.py` | budget, loop, and retry stops |
| `examples/sticky_agent_proof.py` | one CrewAI-style retry storm proof with local incident and hosted NDJSON outputs |
| `examples/coding_agent_review_loop.py` | review/refinement loop stopped by budget and retry guards |
| `examples/per_token_budget_spike.py` | one oversized token-heavy turn can blow a run budget |

…
