# Deployment Monitoring Fast

A production-ready command for deployment monitoring and observability. Includes structured workflows, validation checks, and reusable patterns for deployment.
Rapidly configure real-time monitoring and alerting for application deployments.
## When to Use This Command
Run this command when you need to:
- Set up health checks, metrics collection, and alerting for a freshly deployed service
- Create monitoring dashboards that track deployment success and application performance
- Configure automated rollback triggers based on error rate or latency thresholds
Consider alternatives when:
- You already have a mature observability stack and only need to add a single metric
- Your monitoring needs are limited to simple uptime pings without dashboards
## Quick Start
### Configuration

```yaml
name: deployment-monitoring-fast
type: command
category: deployment
```
### Example Invocation

```
claude command:run deployment-monitoring-fast --stack prometheus --app api-server
```
### Example Output

```
Scanning deployment: api-server
Detected endpoints: /health, /metrics, /ready
Infrastructure: Kubernetes (namespace: production)

Configured monitoring:
  [+] Health check probe: /health (interval: 10s)
  [+] Prometheus scrape target: /metrics (port 9090)
  [+] Grafana dashboard: api-server-overview.json
  [+] Alert rules: 4 rules created
      - HighErrorRate (>5% 5xx in 5min)
      - HighLatency (p99 > 2s for 5min)
      - PodRestarts (>3 in 15min)
      - MemoryPressure (>85% for 10min)

Status: Monitoring active. Dashboard URL: http://grafana:3000/d/api-server
```
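The alert rules listed in the output above translate naturally into a Prometheus rule file. This is an illustrative sketch of the first rule only; it assumes the application exposes a standard `http_requests_total` counter with a `status` label, which will vary by instrumentation library.

```yaml
# Sketch of the HighErrorRate rule from the example output.
# Metric and label names are assumptions -- adjust to your instrumentation.
groups:
  - name: api-server-deployment
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{app="api-server", status=~"5.."}[5m]))
            / sum(rate(http_requests_total{app="api-server"}[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "More than 5% of requests returned 5xx over 5 minutes"
```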
## Core Concepts

### Monitoring Stack Overview
| Aspect | Details |
|---|---|
| Metrics Collection | Prometheus, Datadog, CloudWatch, or StatsD scraping |
| Visualization | Grafana dashboards with pre-built deployment panels |
| Alerting Channels | Slack, PagerDuty, email, or webhook integrations |
| Health Probes | HTTP, TCP, and gRPC liveness and readiness checks |
| SLO Tracking | Error budget burn rate and availability percentage |
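The health-probe row above corresponds to standard Kubernetes probe configuration. A minimal sketch, assuming the `/health` and `/ready` endpoints from the example output and a container port of 8080 (the port is an assumption, not something the command detects):

```yaml
# Illustrative liveness/readiness probes on a Deployment's container spec.
# Path names match the example output; the port is a placeholder.
containers:
  - name: api-server
    ports:
      - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 10   # matches the 10s interval in the example output
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 10
```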
### Monitoring Pipeline Workflow

```
Application
     |
     v
/metrics endpoint
     |
     v
+------------------+
|    Prometheus    |---> Scrape & Store
+------------------+
    |          |
    v          v
+-------+  +--------+
|Grafana|  |Alertmgr|
+-------+  +--------+
    |          |
    v          v
Dashboard  Slack/PagerDuty
```
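The scrape step in the pipeline above might look like the following Prometheus configuration. This is a sketch: the job name is arbitrary, and it assumes pods opt in via the conventional `prometheus.io/scrape` annotation.

```yaml
# Minimal Prometheus scrape configuration for the pipeline above.
# Job name and annotation-based discovery are illustrative assumptions.
scrape_configs:
  - job_name: api-server
    scrape_interval: 15s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```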
## Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| stack | string | prometheus | Monitoring stack: prometheus, datadog, cloudwatch |
| app | string | required | Application or service name to monitor |
| namespace | string | default | Kubernetes namespace or environment identifier |
| alert_channel | string | slack | Where to send alerts: slack, pagerduty, email, webhook |
| scrape_interval | string | 15s | How frequently to collect metrics from the application |
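For the default `alert_channel` (slack), the generated Alertmanager routing could resemble this sketch. The webhook URL and channel name are placeholders, not values the command produces.

```yaml
# Illustrative Alertmanager route for the default slack alert_channel.
# api_url and channel are placeholders.
route:
  receiver: slack-deployments
receivers:
  - name: slack-deployments
    slack_configs:
      - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ
        channel: "#deploy-alerts"
        send_resolved: true
```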
## Best Practices

- **Define SLOs Before Alerts** - Establish service level objectives for latency and error rate first. Derive alert thresholds from SLO burn rates to avoid noisy, meaningless notifications.
- **Use Multi-Signal Detection** - Combine error rate, latency, and saturation metrics in alert rules. A single metric can produce false positives, while correlated signals provide high confidence.
- **Separate Deployment Dashboards** - Create a dedicated dashboard for deployment events overlaid with key metrics. This makes it immediately visible whether a new release caused a regression.
- **Set Meaningful Severity Levels** - Reserve critical, page-worthy alerts for customer-facing outages. Use warning-level alerts for degradation that can wait until business hours.
- **Automate Runbook Links** - Attach troubleshooting runbook URLs to every alert rule. When an alert fires at 3 AM, the on-call engineer needs actionable steps, not just a metric name.
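Two of the practices above, multi-signal detection and runbook links, can be combined in a single rule. A sketch, assuming Prometheus histogram and counter naming conventions (`http_requests_total`, `http_request_duration_seconds_bucket`); the runbook URL is a placeholder:

```yaml
# Sketch: alert only when error rate AND p99 latency degrade together,
# with a runbook link attached. Metric names are assumptions.
- alert: DeploymentRegression
  expr: |
    (
      sum(rate(http_requests_total{status=~"5.."}[5m]))
        / sum(rate(http_requests_total[5m])) > 0.05
    )
    and
    (
      histogram_quantile(0.99,
        sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 2
    )
  for: 5m
  labels:
    severity: critical
  annotations:
    runbook_url: https://wiki.example.com/runbooks/deployment-regression  # placeholder
    summary: "Error rate and p99 latency both degraded after deploy"
```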
## Common Issues

- **Alert Fatigue From Noisy Rules** - Overly sensitive thresholds fire constantly. Tune alert windows (use 5-minute averages instead of instant values) and add `for`-duration clauses to suppress transient spikes.
- **Metrics Endpoint Not Scraped** - Prometheus cannot reach the /metrics path. Verify the service has the correct port annotation, network policy allows scraper traffic, and the endpoint returns 200.
- **Dashboard Shows No Data** - The metric name in the query does not match what the application exposes. Use the Prometheus expression browser to verify available metric names before building panels.
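One common fix for the "Metrics Endpoint Not Scraped" issue is a NetworkPolicy that admits scraper traffic. A sketch, assuming the monitoring stack runs in a `monitoring` namespace and the metrics port is 9090 (both assumptions; match them to your cluster):

```yaml
# Allow Prometheus in the monitoring namespace to scrape api-server pods.
# Namespace label and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 9090
```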