AgentGuard
Stop runaway Python agents before they burn money.
AgentGuard47 is a zero-dependency runtime control SDK for Python agents. Add hard budget caps, loop detection, retry limits, timeouts, local traces, and incident reports without changing agent frameworks or sending data anywhere by default.
Use it when an agent can call tools, retry work, review code, or run long enough to create surprise spend.
⭐ Star this repo if AgentGuard stops one runaway run for you. It is how other builders find it.
Install
As a Python package
pip install agentguard47
agentguard
The bare agentguard command prints a 60-second local tour. If the script is
not on PATH, use python -m agentguard instead. Both run the same CLI.
As a skill (Codex, Claude Code, Cursor, Cline, and more)
npx skills add bmdhodl/agent47
Or with GitHub CLI:
gh skill install bmdhodl/agent47 agentguard
Why AgentGuard
Most agent tooling tells you what happened after the run. AgentGuard stops the bad run while it is happening.
AgentGuard is an in-process agentic-loop guard, not an LLM cost router. It runs inside the agent's process, sees the call graph, and raises exceptions that kill the run before the next bad call lands. Routers and gateways like Manifest or Vercel AI Gateway sit at the network layer and shape egress traffic. The layers are complementary; see the competitive notes for when each fits.
The headline value is the cross-call, cross-provider budget envelope: one
ceiling that holds across every tool call, every retry, and every provider in
the run. Single-call output caps are table stakes: as of 2026-06-02,
Anthropic ships per-tool max_tokens on the advisor tool (see the
release notes).
A single-call cap stops one oversized response. It does not stop a loop that
makes 200 small calls, a retry storm across providers, or a run that mixes
OpenAI and Anthropic and blows the combined budget. AgentGuard handles that
envelope in-process and raises the exception that ends the run.
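
To make the difference between a single-call cap and a run-wide envelope concrete, here is a minimal, self-contained sketch. It does not use AgentGuard's API: `RunBudget`, `BudgetExceededSketch`, and `run_small_calls` are hypothetical names invented for illustration, and the per-call costs are made up.

```python
from dataclasses import dataclass, field


class BudgetExceededSketch(RuntimeError):
    """Hypothetical stand-in for a budget exception; not AgentGuard's class."""


@dataclass
class RunBudget:
    """One ceiling shared by every call in a run, regardless of provider."""

    max_cost_usd: float
    max_calls: int
    spent_usd: float = 0.0
    calls: int = 0
    by_provider: dict = field(default_factory=dict)

    def consume(self, provider: str, cost_usd: float) -> None:
        # Record the call first, then compare the combined totals to the caps.
        self.calls += 1
        self.spent_usd += cost_usd
        self.by_provider[provider] = self.by_provider.get(provider, 0.0) + cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise BudgetExceededSketch(
                f"spent ${self.spent_usd:.2f} of ${self.max_cost_usd:.2f}"
            )
        if self.calls > self.max_calls:
            raise BudgetExceededSketch(
                f"made {self.calls} calls, cap is {self.max_calls}"
            )


def run_small_calls(budget: RunBudget, n: int, cost_each: float) -> None:
    """Simulate n small calls that alternate between two providers."""
    for i in range(n):
        provider = "openai" if i % 2 == 0 else "anthropic"
        budget.consume(provider, cost_each)
```

Each simulated call is tiny, so no per-call cap would fire, yet a loop of 200 such calls against a $1.00 ceiling raises well before it finishes because the total is tracked across calls and providers.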
| Problem | What AgentGuard does |
|---|---|
| Agent loops on the same tool | Raises LoopDetected |
| Flaky tool retries forever | Raises RetryLimitExceeded |
| Run spends too much across many calls | Raises BudgetExceeded |
| Run mixes OpenAI and Anthropic and blows the combined cap | Raises BudgetExceeded |
| Run hangs | Raises TimeoutExceeded |
| Team needs proof | Writes local JSONL traces and incident reports |
| Dashboard comes later | HttpSink mirrors events only when you opt in |

| Scope of cap | Anthropic per-tool max_tokens | AgentGuard BudgetGuard + RateLimitGuard + TimeoutGuard |
|---|---|---|
| One tool call, output tokens only | Yes | Yes |
| Many calls in one run | No | Yes |
| Mixed providers (OpenAI + Anthropic) in one run | No | Yes |
| Loop detection across repeated calls | No | Yes |
| Retry storm cap | No | Yes |
| Calls per minute | No | Yes |
| Wall-clock timeout | No | Yes |
| In-process exception that ends the run | No | Yes |
Design constraints:
- zero runtime dependencies
- MIT licensed
- local-first by default
- no API key required for local proof
- no network calls unless you configure HttpSink
- guards raise exceptions inside the running process
Scope
AgentGuard's scope is the in-process runtime envelope: budget, token, rate, retry, loop, and timeout caps that fire inside the agent's Python process and raise exceptions that end the run.
OS-level containment is out of scope: process sandboxes, VMs, filesystem boundaries, and egress controls live one layer down from AgentGuard. For that layer, see Anthropic's "How we contain Claude across products" (2026-05-30) as the canonical reference. The layers are complementary: containment bounds what the process can touch; AgentGuard bounds what the agent loop inside that process can spend.
Real Incidents AgentGuard Prevents
PocketOS — agent deleted prod DB and backups in 9 seconds (May 2026)
A Cursor agent ran a destructive sequence against PocketOS production and wiped the live database. Backups went with it.
Reported root cause from the team's postmortem:
- one API key had write + delete on both prod and backups
- backups lived in the same Railway environment as prod
- no confirmation step before destructive actions
- the agent was given enough rope to chain the calls in one turn
Source: r/devops thread
The "AI did it" framing buries the actual lesson: the blast radius was infra, not the model. AgentGuard does not replace least-privilege creds or isolated backups. It does kill the run before a loop, retry storm, or runaway turn finishes the job.
A BudgetGuard plus LoopGuard wired around the agent loop caps how much
it can do in one session:
from agentguard import BudgetGuard, LoopGuard, RateLimitGuard, Tracer

budget = BudgetGuard(max_calls=20, max_cost_usd=1.00)
loop = LoopGuard(max_repeats=2)
rate = RateLimitGuard(max_calls_per_minute=10)
tracer = Tracer(service="cursor-agent", guards=[loop, rate])

with tracer.trace("agent.run"):
    budget.consume(calls=1)
    # tool call here; guards raise on overrun

A 9-second sequence of destructive calls trips LoopGuard or
RateLimitGuard long before it finishes. The exception kills the run
in-process. Pair this with scoped credentials and out-of-environment
backups for the rest of the blast radius.
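
As a rough illustration of why a calls-per-minute cap catches a fast burst, the sketch below implements a sliding 60-second window. It is not AgentGuard's implementation: `SlidingWindowLimiter` and `RateLimitExceededSketch` are hypothetical names, and the injectable `clock` parameter is an assumption added so the behavior can be exercised without waiting.

```python
import time
from collections import deque


class RateLimitExceededSketch(RuntimeError):
    """Hypothetical exception used only by this sketch."""


class SlidingWindowLimiter:
    """Allow at most max_calls_per_minute calls in any 60-second window."""

    def __init__(self, max_calls_per_minute: int, clock=time.monotonic):
        self.max_calls = max_calls_per_minute
        self.clock = clock
        self.stamps = deque()  # timestamps of calls still inside the window

    def check(self) -> None:
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            raise RateLimitExceededSketch(
                f"{len(self.stamps)} calls in the last 60s, cap is {self.max_calls}"
            )
        self.stamps.append(now)
```

With a cap of 10, the eleventh call inside the same minute raises, so a burst that chains many calls in a few seconds is cut off partway through rather than after it completes.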
Microsoft — engineers told to ease off Claude Code over inference cost (May 2026)
Microsoft engineering management reportedly asked teams to reduce Claude Code usage after monthly inference bills exceeded budget. If Microsoft cannot absorb coding-agent inference cost without a memo, runaway agent spend is no longer a solo-founder problem.
Source: TheNextWeb
A memo asks engineers to self-throttle. A BudgetGuard makes the cap a
config value enforced inside the process:
from agentguard import BudgetGuard

BudgetGuard(max_cost_usd=5.00, max_calls=50, warn_at_pct=0.8)

The guard raises BudgetExceeded before the run blows the cap. Same
conversation, one config line instead of a memo.
Local Proof in 60 Seconds
pip install agentguard47
agentguard doctor
agentguard demo
agentguard quickstart --framework raw --write
python agentguard_raw_quickstart.py
agentguard report .agentguard/traces.jsonl

This stays fully local. No API key, dashboard, or network call is required after installation.
What you should see:
- doctor verifies the installed package and writes a local trace.
- demo visibly trips budget, loop, and retry guards offline.
- quickstart --write creates agentguard_raw_quickstart.py.
- The generated file exits cleanly after catching simulated budget and loop stops.
- report shows local trace counts, cost, savings, and guard events.
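
Because the traces are plain JSONL, they can also be inspected ad hoc with the standard library. The sketch below tallies records by one field. The trace schema is not documented in this README, so the field name `name` and the sample records in the usage note are assumptions; `tally_jsonl` is a hypothetical helper, not part of the AgentGuard CLI.

```python
import json
from collections import Counter


def tally_jsonl(lines, key="name"):
    """Count JSONL records by the value of `key` (the field name is an assumption)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        if isinstance(record, dict):
            counts[str(record.get(key, "<missing>"))] += 1
        else:
            counts["<missing>"] += 1
    return counts
```

For example, passing an open file handle for .agentguard/traces.jsonl, or a list of made-up lines such as `['{"name": "tool.call"}', '{"name": "tool.call"}']`, returns a Counter keyed by the chosen field. Use agentguard report for the supported summary.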
Show Your Repo Is Guarded
Once AgentGuard stops a runaway run for you, add the badge to your README so other builders find it:
agentguard badge
[](https://github.com/bmdhodl/agent47)
agentguard badge --format rst and --format html print the same badge for
other doc formats.
Copy-Paste Repo Setup
Use this when you want a coding agent or teammate to add AgentGuard safely:
pip install agentguard47
agentguard doctor
agentguard quickstart --framework raw --write
python agentguard_raw_quickstart.py
agentguard report .agentguard/traces.jsonl

Optional shared local defaults, saved as .agentguard.json in the repo root:
{
  "profile": "coding-agent",
  "service": "my-agent",
  "trace_file": ".agentguard/traces.jsonl",
  "budget_usd": 5.0
}

Keep the first PR local-only. Add hosted ingest later only when retained incidents, alerts, or team visibility matter.
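
Before committing .agentguard.json, it can help to sanity-check it. The following is a small sketch using only the standard library. It is not how the SDK reads the file: `load_repo_defaults` is a hypothetical helper, and the rule that budget_usd must be a positive number is an assumption made for this example.

```python
import json
from pathlib import Path


def load_repo_defaults(path=".agentguard.json"):
    """Return the parsed config, or an empty dict when the file is absent."""
    p = Path(path)
    if not p.exists():
        return {}
    data = json.loads(p.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    budget = data.get("budget_usd")
    # Assumed rule: when present, budget_usd is a positive number (bools rejected).
    if budget is not None:
        is_number = isinstance(budget, (int, float)) and not isinstance(budget, bool)
        if not is_number or budget <= 0:
            raise ValueError("budget_usd must be a positive number")
    return data
```

Running it in CI or a pre-commit hook would catch a malformed file or a zero or negative budget before a teammate's agent picks up the shared defaults.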
Quickstart: Guard One Agent Run
from agentguard import BudgetGuard, JsonlFileSink, LoopGuard, Tracer

budget = BudgetGuard(max_cost_usd=5.00, max_calls=50, warn_at_pct=0.8)
loop = LoopGuard(max_repeats=3)
tracer = Tracer(
    sink=JsonlFileSink(".agentguard/traces.jsonl"),
    service="support-agent",
    guards=[loop],
)

with tracer.trace("agent.run") as span:
    budget.consume(calls=1, cost_usd=0.02)
    loop.check("search", {"query": "refund policy"})
    span.event("tool.call", data={"tool": "search", "query": "refund policy"})
    # Call your agent or tool here.

Inspect the local proof:
agentguard report .agentguard/traces.jsonl
agentguard incident .agentguard/traces.jsonl
Auto-Patch Provider SDKs
If you already call OpenAI or Anthropic directly, patch once and keep using the provider normally:
from agentguard import BudgetGuard, Tracer, patch_openai
budget = BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8)
tracer = Tracer(service="support-agent")
patch_openai(tracer, budget_guard=budget)
# OpenAI chat completions are now traced and budget-enforced.

When accumulated cost crosses the hard limit, BudgetExceeded is raised and
the agent stops.
Guards
| Guard | Stops | Example |
|---|---|---|
| BudgetGuard | dollar, token, or call overruns | BudgetGuard(max_cost_usd=5.00) |
| LoopGuard | exact repeated tool calls | LoopGuard(max_repeats=3) |
| FuzzyLoopGuard | similar calls and A-B-A-B loops | FuzzyLoopGuard(max_tool_repeats=5) |
| RetryGuard | retry storms on the same tool | RetryGuard(max_retries=3) |
| TimeoutGuard | long-running jobs | TimeoutGuard(max_seconds=300) |
| RateLimitGuard | calls per minute | RateLimitGuard(max_calls_per_minute=60) |
| BudgetAwareEscalation | hard turns that need a stronger model | BudgetAwareEscalation(...) |
Guards are static runtime checks. They do not ask another model whether a run is safe. They raise exceptions.
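
To show what a static check means in practice, here is a minimal sketch of exact-repeat loop detection. It is an illustration, not AgentGuard's LoopGuard: `ExactRepeatDetector` and `LoopDetectedSketch` are hypothetical names, and two behaviors are assumptions of this sketch: only consecutive identical calls count, and the exception fires once the streak exceeds max_repeats.

```python
import json


class LoopDetectedSketch(RuntimeError):
    """Hypothetical exception used only by this sketch."""


class ExactRepeatDetector:
    """Raise when the same tool is called with the same args too many times in a row."""

    def __init__(self, max_repeats: int):
        self.max_repeats = max_repeats
        self.last_key = None
        self.streak = 0

    def check(self, tool: str, args: dict) -> None:
        # Serialize args with sorted keys so dict ordering does not matter.
        key = (tool, json.dumps(args, sort_keys=True, default=str))
        if key == self.last_key:
            self.streak += 1
        else:
            self.last_key = key
            self.streak = 1
        if self.streak > self.max_repeats:
            raise LoopDetectedSketch(
                f"{tool} repeated {self.streak} times with identical arguments"
            )
```

The check is a plain comparison of the current call against the previous one. No model is consulted, and the outcome is deterministic for a given call sequence.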
Examples
All examples are local-first. No API key is required unless the example says so.
| Example | What it proves |
|---|---|
| examples/try_it_now.py | budget, loop, and retry stops |
| examples/sticky_agent_proof.py | one CrewAI-style retry storm proof with local incident and hosted NDJSON outputs |
| examples/coding_agent_review_loop.py | review/refinement loop stopped by budget and retry guards |
| examples/per_token_budget_spike.py | one oversized token-heavy turn can blow a run budget |
…