Claude Code setups for observability and incident response
Pull logs, metrics, and traces directly into Claude Code. MCP servers for Sentry, Datadog, Grafana, New Relic, Axiom, and PagerDuty — plus agents that triage production incidents and propose fixes with the data already in context.
112 items across skills, agents, plugins, MCP servers, prompts, hooks, and guides.
Example prompts
Concrete things you can type into Claude Code for observability & monitoring. Each example uses a real skill, agent, slash command, hook, or MCP server listed below.
- Triage the top 5 unresolved Sentry issues
Connect the Sentry MCP server and walk through the top 5 unresolved production issues, grouping by likely root cause.
- Investigate a latency spike on Grafana
Use the Grafana MCP server with the observability-engineer agent to find what changed at 14:03 UTC on the checkout dashboard.
- Add structured logging to a Node service
Use the observability-engineer agent to wire pino with request IDs and OTEL trace correlation across the API.
- Wire Claude into your on-call rotation
Connect the PagerDuty MCP, fetch the active incident, and have Claude propose the next runbook step.
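As a rough illustration of what the structured-logging prompt above asks for, here is a minimal sketch of request-scoped JSON logging. It deliberately uses only Node's standard library rather than pino or the OpenTelemetry SDK, so every name in it (`RequestContext`, `formatLog`, `logInfo`, `withRequestContext`) is hypothetical, and the `traceId` field is a placeholder for whatever an OTEL span would supply in a real service.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Per-request context carried across async calls (hypothetical shape).
interface RequestContext {
  requestId: string;
  traceId: string | undefined; // placeholder for an OTEL trace ID
}

const requestContext = new AsyncLocalStorage<RequestContext>();

// Build one JSON log line, merging in the current request context if there is one.
function formatLog(
  level: string,
  msg: string,
  fields: Record<string, unknown> = {},
): string {
  const ctx = requestContext.getStore();
  return JSON.stringify({
    level,
    time: new Date().toISOString(),
    ...ctx,
    ...fields,
    msg,
  });
}

// Write an info-level line to stdout.
function logInfo(msg: string, fields?: Record<string, unknown>): void {
  console.log(formatLog("info", msg, fields));
}

// Run a handler with a caller-supplied request ID, or a fresh one, in scope.
function withRequestContext<T>(
  handler: () => T,
  incomingId?: string,
  traceId?: string,
): T {
  return requestContext.run(
    { requestId: incomingId ?? randomUUID(), traceId },
    handler,
  );
}
```

Calling `withRequestContext(() => logInfo("charge ok", { orderId: 7 }), "req-1")` would emit a single JSON line carrying `requestId: "req-1"` without the handler passing the ID around explicitly. A production setup would more likely get the same effect from a pino child logger plus an OTEL instrumentation package.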
Skills (16)
Senior Devops
Comprehensive DevOps skill for CI/CD, infrastructure automation, containerization, and cloud platforms (AWS, GCP, Azure). Includes pipeline setup, infrastructure as code, deployment automation, and monitoring. Use when setting up pipelines, deploying applications, managing infra…
aws, gcp, azure, deployment, +3
Senior Ml Engineer
ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infras…
kubernetes, docker, performance, deployment, +5
Quality Manager Qmr
Senior Quality Manager Responsible Person (QMR) for HealthTech and MedTech companies. Provides quality system governance, management review leadership, regulatory compliance oversight, and quality performance monitoring per ISO 13485 Clause 5.5.2.
go, performance, monitoring
Metrics Dashboard
Define and design a product metrics dashboard with key metrics, data sources, visualization types, and alert thresholds. Use when creating a metrics dashboard, defining KPIs, setting up product analytics, or building a data monitoring plan.
monitoring
Api Error Handling
Implements standardized API error responses with proper status codes, logging, and user-friendly messages. Use when building production APIs, implementing error recovery patterns, or integrating error monitoring services.
monitoring, api
Chrome Devtools
Browser automation with Puppeteer CLI scripts. Use for screenshots, performance analysis, network monitoring, web scraping, form automation, or when encountering JavaScript debugging or browser automation errors.
javascript, performance, monitoring, browser, +3
Cloudflare Workers Observability
Cloudflare Workers observability with logging, Analytics Engine, Tail Workers, metrics, and alerting. Use for monitoring, debugging, tracing, or when encountering log parsing, metric aggregation, or alert configuration errors.
cloudflare, monitoring, ai
Health Check Endpoints
Health check endpoints for liveness, readiness, and dependency monitoring. Use for Kubernetes, load balancers, auto-scaling, or when encountering probe failures, startup delays, dependency checks, or timeout configuration errors.
kubernetes, monitoring, ai
Model Deployment
Deploy ML models with FastAPI, Docker, and Kubernetes. Use for serving predictions, containerization, monitoring, drift detection, or when encountering latency issues, health check failures, or version conflicts.
kubernetes, docker, deployment, monitoring, +2
Recommendation System
Deploy production recommendation systems with feature stores, caching, and A/B testing. Use for personalization APIs, low-latency serving, or when encountering cache invalidation, experiment tracking, or quality monitoring issues.
testing, monitoring, api
Web Vitals Monitor
Monitor web vitals. Auto-activating skill for Frontend Development. Triggers on: web vitals monitor. Part of the Frontend Development skill category. Use when monitoring systems or services. Trigger with phrases like "web vitals monitor", "w…
go, monitoring
Prediction Monitor
Monitor model predictions. Auto-activating skill for ML Deployment. Triggers on: prediction monitor. Part of the ML Deployment skill category. Use when monitoring systems or services. Trigger with phrases like "prediction monitor", "prediction moni…
go, deployment, monitoring
Pipeline Monitoring Setup
Configure pipeline monitoring. Auto-activating skill for Data Pipelines. Triggers on: pipeline monitoring setup. Part of the Data Pipelines skill category. Use when monitoring systems or services. Trigger with phrases like "pipeline mon…
go, monitoring
Cloud Monitoring Alert
Configure Cloud Monitoring alerts. Auto-activating skill for GCP Skills. Triggers on: cloud monitoring alert. Part of the GCP Skills skill category. Use when monitoring systems or services. Trigger with phrases like "cloud monitoring alert", "cloud…
go, gcp, monitoring
Sla Monitor Setup
Set up SLA monitoring. Auto-activating skill for Enterprise Workflows. Triggers on: sla monitor setup. Part of the Enterprise Workflows skill category. Use when monitoring systems or services. Trigger with phrases like "sla monitor setup", "sla…
go, monitoring
Canary
Post-deploy canary monitoring — checks site health, detects regressions, monitors for errors after deployment. Use after deploying to verify production is healthy.
deployment, monitoring
Agents (4)
DevOps Incident Responder
Production incident management specialist for rapid triage, root cause analysis, and preventive measures
Infrastructure, devops, incident-response, +3
MLOps Engineer
Machine learning operations specialist for model training pipelines, deployment, and monitoring
Data & AI, mlops, machine-learning, +4
SRE Engineer
Site Reliability Engineering specialist for observability, incident response, SLOs, and production system reliability
Infrastructure, sre, reliability, +4
Observability Engineer
Instrumentation specialist who adds meaningful metrics, logs, and traces — and designs dashboards and alerts that surface the right signal at the right time
Infrastructure, observability, monitoring, +6
Plugins (31)
Datadog
Datadog observability platform integration. Query metrics, view logs, analyze traces, and monitor application performance directly from Claude Code.
monitoring, observability, apm, logs, +1
Sentry
Sentry error monitoring integration. Track errors, view stack traces, analyze crash reports, and get real-time alerts for application issues.
monitoring, errors, debugging, observability, +1
Amplitude
Use Amplitude as an expert analyst — instrument Amplitude, discover product opportunities, analyze charts, create dashboards, manage experiments, and understand users and accounts.
monitoring
Dash0
OpenTelemetry observability for Claude Code sessions. Captures tool calls, LLM invocations, token usage, and errors as OTel traces. Send telemetry to Dash0 or any OpenTelemetry-compatible backend.
monitoring, llm
Datarobot Agent Skills
DataRobot skills for AI/ML workflows — model training, deployment, predictions, feature engineering, monitoring, explainability, data preparation, App Framework CI/CD, and external agent monitoring.
development, deployment, monitoring, ai, +1
Fullstory
Connect Claude to Fullstory to query behavioral analytics, session replays, and customer experience insights.
monitoring
Logfire
Add Logfire observability to Python applications with auto-instrumentation for FastAPI, httpx, asyncpg, SQLAlchemy, and more
monitoring, python, api
Pagerduty
Enhance code quality and security through PagerDuty risk scoring and incident correlation. Score pre-commit diffs against historical incident data and surface deployment risk before you ship.
monitoring, security, deployment, ai
Posthog
Access PostHog analytics, feature flags, experiments, error tracking, and insights directly from Claude Code.
monitoring
Qdrant Skills
Agent skills for Qdrant vector search covering scaling, performance optimization, search quality, monitoring, deployment, model migration, version upgrades, and SDK usage across Python, TypeScript, Rust, Go, .NET, and Java.
database, python, typescript, rust, +5
Railway
Deploy and manage apps, databases, and infrastructure on Railway. Covers project setup, deploys, environment configuration, networking, troubleshooting, and monitoring.
deployment, monitoring, ai
Zilliz
Zilliz Cloud management plugin with 14 skills covering cluster lifecycle, collection schema, vector search, index tuning, bulk import, RBAC, backups, and monitoring.
database, monitoring
Alerting Rule Creator
Create intelligent alerting rules for performance monitoring
performance, monitoring
Api Monitoring Dashboard
Create monitoring dashboards for API health, metrics, and alerts
api-development, monitoring, api
Apm Dashboard Creator
Create Application Performance Monitoring dashboards
performance, monitoring
Fairdb Operations Kit
Complete operations kit for FairDB PostgreSQL as a Service - VPS setup, PostgreSQL management, customer provisioning, monitoring, and backup automation
devops, postgres, monitoring, automation, +1
Mempool Analyzer
Advanced mempool analysis for MEV opportunities, pending transaction monitoring, and gas price optimization
crypto, monitoring
Monitoring Stack Deployer
Deploy monitoring stacks (Prometheus, Grafana, Datadog)
devops, monitoring
Real User Monitoring
Implement Real User Monitoring for actual performance data
performance, monitoring
Synthetic Monitoring Setup
Set up synthetic monitoring for proactive performance tracking
performance, monitoring
Sentry Pack
Complete Sentry integration skill pack with 30 skills covering error monitoring, performance tracking, session replay, and observability. Flagship+ tier vendor pack.
saas-packs, performance, monitoring
Observability Monitoring
Metrics collection, logging infrastructure, distributed tracing, SLO implementation, and monitoring dashboards
operations, monitoring
Data Validation Suite
Schema validation, data quality monitoring, streaming validation pipelines, and input validation for backend APIs
data, monitoring, api
Api Testing Observability
API testing automation, request mocking, OpenAPI documentation generation, observability setup, and monitoring
api, testing, monitoring, automation
Seo Analysis Monitoring
Content freshness analysis, cannibalization detection, and authority building for SEO
marketing, monitoring
Conserve
Context optimization, bloat detection, CPU/GPU monitoring, and token conservation for efficient Claude Code sessions
monitoring
Api Error Handling
Implements standardized API error responses with proper status codes, logging, and user-friendly messages. Use when building production APIs, implementing error recovery patterns, or integrating error monitoring services.
api, monitoring
Health Check Endpoints
Health check endpoints for liveness, readiness, and dependency monitoring. Use for Kubernetes, load balancers, auto-scaling, or when encountering probe failures, startup delays, dependency checks, or timeout configuration errors.
architecture, kubernetes, monitoring, ai
Model Deployment
Deploy ML models with FastAPI, Docker, and Kubernetes. Use for serving predictions, containerization, monitoring, drift detection, or when encountering latency issues, health check failures, or version conflicts.
ai, kubernetes, docker, deployment, +2
Recommendation System
Deploy production recommendation systems with feature stores, caching, and A/B testing. Use for personalization APIs, low-latency serving, or when encountering cache invalidation, experiment tracking, or quality monitoring issues.
data, testing, monitoring, api
Web Performance Optimization
Optimizes web application performance through code splitting, lazy loading, caching strategies, and Core Web Vitals monitoring. Use when improving page load times, implementing service workers, or reducing bundle sizes.
web, performance, monitoring
MCP Servers (61)
Sentry Server
Retrieve and analyze issues, errors, and stacktraces from Sentry
sentry, errors, monitoring, debugging, +1
Grafana Server
Query Grafana dashboards, alerts, and datasources for observability and monitoring workflows
monitoring, grafana, observability, dashboards, +1
Elasticsearch Server
Search, index, and analyze data in Elasticsearch clusters with natural language queries
search, elasticsearch, analytics, logging, +1
Datadog Server
Query metrics, logs, and traces from Datadog for observability and monitoring insights
datadog, monitoring, observability, logs, +2
Raygun Server
Access crash reporting and real user monitoring data for debugging and performance analysis
raygun, monitoring, crash-reporting, apm, +1
PagerDuty Server
Manage incidents, on-call schedules, and alerts through PagerDuty's incident management platform
pagerduty, incidents, on-call, alerting, +2
Axiom
Query and analyze Axiom logs, traces, and event data using natural language with APL (Axiom Processing Language)
observability, axiom, logs, monitoring, +1
New Relic
Query telemetry data, investigate alerts, analyze application performance, and manage dashboards in New Relic
monitoring, newrelic, apm, observability, +1
Agentforge
Unified API gateway and marketplace for 300+ AI agents. One API key, REST + streaming, 90% creator revenue share, health monitoring. Self-hostable (MIT).
aggregators, monitoring, api, rest, +2
Firefox Devtools
Firefox browser automation via WebDriver BiDi for testing, scraping, and browser control. Supports snapshot/UID-based interactions, network monitoring, console capture, and screenshots.
browser-automation, testing, monitoring, browser, +3
Comet
Connect to Perplexity Comet browser for agentic web browsing, deep research, and real-time task monitoring.
browser-automation, monitoring, browser, agent
Portainer
A powerful MCP server that enables AI assistants to seamlessly interact with Portainer instances, providing natural language access to container management, deployment operations, and infrastructure monitoring capabilities.
cloud-platforms, deployment, monitoring, ai
Lumino
AI-powered SRE observability for Kubernetes and OpenShift with 40+ tools for Tekton pipeline debugging, log analysis, root cause analysis, and predictive monitoring.
cloud-platforms, kubernetes, monitoring, ai
Kubefwd
Kubernetes bulk port forwarding with service discovery, /etc/hosts management, traffic monitoring, and pod log streaming
cloud-platforms, kubernetes, monitoring
Netops
Comprehensive DevOps and networking MCP server providing standardized access to essential infrastructure tools. Features network monitoring, system diagnostics, automation workflows, and infrastructure management with AI-powered operational insights.
coding-agents, monitoring, automation, ai
Console Automation
Production-ready MCP server for AI-driven console automation and monitoring. 40 tools for session management, SSH, testing, monitoring, and background jobs. Like Playwright for terminal applications.
coding-agents, testing, monitoring, automation, +1
Cockroachdb
A Model Context Protocol server for managing, monitoring, and querying data in [CockroachDB](https://cockroachlabs.com).
databases, monitoring
Prometheus
Query and analyze Prometheus, an open-source monitoring system.
databases, monitoring
Jenkins
Enterprise-grade Jenkins CI/CD integration with multi-tier caching, pipeline monitoring, artifact management, and batch operations. Features 21 MCP tools for job management, build status tracking, and queue management with CSRF protection and 2FA support.
developer-tools, monitoring
Opik
Use natural language to explore LLM observability, traces, and monitoring data captured by Opik.
developer-tools, monitoring, llm
Deploy
Universal deployment tracker for AI assistants with live status badges and deployment monitoring
developer-tools, deployment, monitoring, ai
Loopsense
MCP server that closes the feedback loop for AI coding agents — CI monitoring, process watching, file changes, HTTP polling.
developer-tools, monitoring, ai, agent
Token Optimizer
Intelligent token optimization achieving 95%+ reduction through caching, compression, and 80+ smart tools for API optimization, code analysis, and real-time monitoring.
developer-tools, monitoring, api
Wildfire
MCP server for detecting, monitoring, and analyzing potential wildfires globally using multiple data sources including NASA FIRMS, OpenWeatherMap, and Google Earth Engine.
environment-nature, go, monitoring
Heurist Mesh
Access specialized web3 AI agents for blockchain analysis, smart contract security auditing, token metrics evaluation, and on-chain interactions through the Heurist Mesh network. Provides comprehensive tools for DeFi analysis, NFT valuation, and transaction monitoring across mul…
finance-fintech, security, monitoring, ai, +1
Xrpl
MCP server for the XRP Ledger that provides access to account information, transaction history, and network data. Allows querying ledger objects, submitting transactions, and monitoring the XRPL network.
finance-fintech, monitoring
Context First
Session memory, context health monitoring, reasoning quality, and truthfulness verification MCP server with 37 tools and tiered memory storage. `npx -y context-first-mcp`
knowledge-memory, monitoring, rag
Obsidian
SSE-enabled MCP server for remote Obsidian vault management with 29 tools for notes, directories, frontmatter, tags, search, and link operations. Docker-ready with health monitoring.
knowledge-memory, docker, monitoring
Alog
AI agent activity logger & monitor MCP server with 20 tools. Post logs, create articles, manage social interactions, and monitor AI agent activities on the Alog platform.
monitoring, ai, agent
Langfuse
Query Langfuse traces, debug exceptions, analyze sessions, and manage prompts. Full observability toolkit for LLM applications.
monitoring, llm
Dynatrace
Leverage AI-driven observability, security, and automation to analyze anomalies, logs, traces, events, and metrics.
monitoring, security, automation, ai, +1
Edgedelta
Interact with Edge Delta anomalies, query logs, patterns, and events, pinpoint root causes, and optimize your pipelines.
monitoring
Agent Breadcrumbs
Unified agent work logging and observability across ChatGPT, Claude, Cursor, Codex, and OpenClaw with config-first schemas and pluggable sinks.
monitoring, agent
Sentry
Sentry.io integration for error tracking and performance monitoring
monitoring, performance
Grafana
Search dashboards, investigate incidents and query datasources in your Grafana instance
monitoring
Lucidity
Enhance AI-generated code quality through intelligent, prompt-based analysis across 10 critical dimensions from complexity to security vulnerabilities
monitoring, security, ai
Status Observer
Model Context Protocol server for monitoring the operational status of major digital platforms in Claude Desktop.
monitoring
Ig
Debug your Container and Kubernetes workloads with an AI interface powered by eBPF.
monitoring, kubernetes, ai
Internet Speed Test
Internet speed testing with network performance metrics including download/upload speed, latency, jitter analysis, and CDN server detection with geographic mapping
monitoring, testing, performance
Last9
Seamlessly bring real-time production context—logs, metrics, and traces—into your local environment to auto-fix code faster
monitoring
Metoro
Query and interact with Kubernetes environments monitored by Metoro
monitoring, kubernetes
Server Raygun
Raygun API V3 integration for crash reporting and real user monitoring
monitoring, api
Zabbix
Zabbix integration for hosts, items, triggers, templates, problems, data and more.
monitoring
Netdata
Discovery, exploration, reporting and root cause analysis using all observability data, including metrics, logs, systems, containers, processes, and network connections
monitoring, ai
Logfire
Provides access to OpenTelemetry traces and metrics through Logfire
monitoring
Homebutler
All-in-one homelab management MCP server. Monitor system resources, manage Docker containers, Wake-on-LAN, scan networks, check open ports, and run alerts — across multiple servers via SSH. Single 10MB binary, zero dependencies.
monitoring, docker, ai
Monitor
A system monitoring tool that exposes system metrics via the Model Context Protocol (MCP). It lets LLMs retrieve real-time system information through an MCP-compatible interface. Supports CPU, memory, disk, network, host, and process metrics.
monitoring, llm
Freshprobe
Data freshness verification for AI agents. Probes endpoints for HTTP cache staleness, latency percentiles, content fingerprinting, TLS health, DNS timing, and redirect chains. Returns deterministic FRESH/STALE/UNKNOWN verdicts with NIST AI RMF mapping. CLI + MCP server + HTTP AP…
monitoring, api, ai, agent
Datadog
MCP server providing comprehensive Datadog observability access for AI assistants. Features grep-like log search, APM trace filtering with duration/status/error queries, smart sampling modes for token efficiency, and cross-correlation between logs, traces, and metrics.
monitoring, ai
Grafana Loki
An MCP server that allows querying Loki logs through the Grafana API.
monitoring, api
Sentry Mcp Rs
Fast and minimal Sentry MCP server written in Rust
monitoring, rust
Victoriametrics
Provides comprehensive integration with your [VictoriaMetrics instance APIs](https://docs.victoriametrics.com/victoriametrics/url-examples/) and [documentation](https://docs.victoriametrics.com/) for monitoring, observability, and debugging tasks related to your VictoriaMetrics …
monitoring, api
Pmcp
A Prometheus Model Context Protocol Server.
monitoring
Cybersec Watchdog
Comprehensive Linux server security audit with 89 CIS Benchmark controls, NIST 800-53, and PCI-DSS compliance checks. Real-time monitoring with anomaly detection across 23 analyzers: firewall, SSH, fail2ban, Docker, CVE, rootkit, SSL/TLS, filesystem, network, and more.
security, docker, monitoring, ai
Ciphertrust Manager
MCP server for Thales CipherTrust Manager integration, enabling secure key management, cryptographic operations, and compliance monitoring through AI assistants.
security, rust, monitoring, ai
Thales Cdsp Cakm
MCP server for Thales CDSP CAKM integration, enabling secure key management, cryptographic operations, and compliance monitoring through AI assistants for MS SQL and Oracle databases.
security, monitoring, ai
Meltbook
MCP server for meltbook, an AI-agent political discussion board. 50 AI agents autonomously post, vote, and debate Japanese politics. 11 tools for thread creation, posting, voting, and monitoring.
social-media, monitoring, ai, agent
Twitter
All-in-one Twitter management solution providing timeline access, user tweet retrieval, hashtag monitoring, conversation analysis, direct messaging, sentiment analysis of a post, and complete post lifecycle control - all through a streamlined API.
social-media, monitoring, api
Screenmonitormcp
Real-time screen analysis, context-aware recording, and UI monitoring MCP server. Supports AI vision, event hooks, and multimodal agent workflows.
other-tools-and-integrations, monitoring, ai, agent
Metorial
Connect AI agents to 600+ integrations with a single interface - OAuth, scaling, and monitoring included
other-tools-and-integrations, monitoring, ai, agent
Plugwise
TypeScript-based smart home automation server for Plugwise devices with automatic network discovery. Features comprehensive device control for thermostats, switches, smart plugs, energy monitoring, multi-hub management, and real-time climate/power consumption tracking via local …
other-tools-and-integrations, typescript, monitoring, automation