Easy Troubleshooting Executor
Boost productivity with this systematic troubleshooting command. Includes structured workflows, validation checks, and reusable patterns for documentation.
Systematically diagnose and resolve application issues using guided troubleshooting workflows and automated checks.
When to Use This Command
Run this command when you need to:
- Diagnose a production issue by running automated health checks across all system components
- Follow a structured troubleshooting workflow that checks common failure modes systematically
- Generate a diagnostic report with root cause analysis and recommended fixes
Consider alternatives when:
- You already know the root cause and just need to apply a specific fix
- The issue requires real-time debugging with breakpoints and interactive inspection
Quick Start
Configuration
name: easy-troubleshooting-executor
type: command
category: documentation
Example Invocation
claude command:run easy-troubleshooting-executor --symptom "api-timeout" --service backend
Example Output
Symptom: API requests timing out
Service: backend
Diagnostic Sequence:
[1/8] Network connectivity........PASS (latency: 2ms)
[2/8] DNS resolution..............PASS (resolved in 12ms)
[3/8] Service health endpoint.....PASS (200 OK, 45ms)
[4/8] Database connectivity.......FAIL (connection refused)
[5/8] Redis connectivity..........PASS (PONG in 1ms)
[6/8] Memory usage................WARN (78% utilized)
[7/8] CPU usage...................PASS (23% utilized)
[8/8] Disk space..................PASS (62% available)
Root Cause Identified:
Database server at db.internal:5432 is not accepting connections.
Connection pool exhausted due to max_connections limit reached.
Recommended Fix:
1. Increase max_connections from 100 to 200 in postgresql.conf
2. Restart PostgreSQL service
3. Verify application reconnects successfully
Workaround (immediate): Restart the backend service to reset connection pool.
Core Concepts
Troubleshooting System Overview
| Aspect | Details |
|---|---|
| Symptom Mapping | Maps user-reported symptoms to diagnostic check sequences |
| Health Probes | HTTP, TCP, DNS, and process-level health checks |
| Resource Monitoring | CPU, memory, disk, and network utilization assessment |
| Dependency Tracing | Tests connectivity to databases, caches, queues, and APIs |
| Root Cause Analysis | Correlates failures across checks to identify the root cause |
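The symptom-mapping aspect above can be sketched as a simple lookup table. This is an illustrative sketch, not the tool's actual internals: the symptom names come from the configuration table in this document, but the check names and the `checks_for` helper are assumptions.

```python
# Hypothetical mapping from reported symptoms to diagnostic check suites.
# Symptom names mirror this document's configuration table; check names
# mirror the example output, but the structure itself is an assumption.
CHECK_SUITES = {
    "api-timeout":     ["network", "dns", "health", "database", "redis", "memory", "cpu", "disk"],
    "high-error-rate": ["health", "database", "redis", "memory"],
    "slow-response":   ["network", "database", "cpu", "memory"],
    "crash":           ["memory", "disk", "health"],
}

def checks_for(symptom: str) -> list[str]:
    """Return the diagnostic check sequence mapped to a reported symptom."""
    try:
        return CHECK_SUITES[symptom]
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}")
```

Keeping the mapping declarative makes it easy to add a new symptom without touching the check-execution logic.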
Diagnostic Workflow
Symptom Reported
|
v
+--------------------+
| Map to Check Suite |---> Timeout -> network + DB + resources
+--------------------+
|
v
+--------------------+
| Run Checks |---> Sequential diagnostic probes
+--------------------+
|
v
+--------------------+
| Correlate Failures |---> Which failures explain symptom?
+--------------------+
|
v
+--------------------+
| Identify Root Cause|---> DB down -> pool exhausted -> timeout
+--------------------+
|
v
+--------------------+
| Recommend Fix |---> Steps to resolve + workaround
+--------------------+
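The run-then-correlate steps in the diagram above can be sketched as follows. This is a minimal sketch under assumptions: each probe is modeled as a callable returning a `(level, detail)` pair, and the PASS/WARN/FAIL levels are taken from the example output, not from a published API.

```python
# Sketch of sequential check execution followed by failure correlation.
# Probe callables and result levels (PASS/WARN/FAIL) are assumptions
# modeled on this document's example output.
from typing import Callable

def run_checks(checks: dict[str, Callable[[], tuple[str, str]]]) -> dict[str, tuple[str, str]]:
    """Run each probe sequentially, recording (level, detail) per check."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = probe()
        except Exception as exc:  # a crashing probe is itself a FAIL signal
            results[name] = ("FAIL", str(exc))
    return results

def correlate(results: dict[str, tuple[str, str]]) -> list[str]:
    """Prefer hard failures over warnings when proposing a root cause."""
    fails = [name for name, (level, _) in results.items() if level == "FAIL"]
    warns = [name for name, (level, _) in results.items() if level == "WARN"]
    return fails or warns or ["no anomaly detected"]
```

With the example-output data, `correlate` would surface the database failure rather than the memory warning, matching the prioritization described under Common Issues.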
Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| symptom | string | required | The observed problem: api-timeout, high-error-rate, slow-response, crash |
| service | string | all | Service to diagnose: backend, frontend, database, cache, all |
| depth | string | standard | Diagnostic depth: quick (critical only), standard, deep (all checks) |
| output | string | terminal | Output format: terminal, json, markdown |
| timeout_sec | integer | 10 | Timeout for each individual diagnostic check |
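The parameter table above translates naturally into a validated configuration object. A hedged sketch, assuming nothing about the tool's real parsing code; the field names and defaults come directly from the table, while the `DiagnosticConfig` class and its validation are illustrative.

```python
# Illustrative configuration object mirroring this document's parameter table.
# Field names and defaults come from the table; the class and validation
# logic are assumptions, not the tool's actual implementation.
from dataclasses import dataclass

VALID_DEPTHS = {"quick", "standard", "deep"}

@dataclass
class DiagnosticConfig:
    symptom: str                 # required: api-timeout, high-error-rate, slow-response, crash
    service: str = "all"         # backend, frontend, database, cache, all
    depth: str = "standard"      # quick (critical only), standard, deep (all checks)
    output: str = "terminal"     # terminal, json, markdown
    timeout_sec: int = 10        # per-check timeout

    def __post_init__(self):
        if self.depth not in VALID_DEPTHS:
            raise ValueError(f"depth must be one of {sorted(VALID_DEPTHS)}")
```

Validating at construction time surfaces a typo in `depth` before any checks run, rather than partway through a diagnostic sequence.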
Best Practices
- Start Broad Then Narrow - Run all diagnostic checks first to get a complete picture. Jumping to a suspected root cause without checking other components misses correlated failures.
- Check Dependencies Bottom-Up - Verify infrastructure (network, DNS, disk) before application-level checks. An application timeout caused by a full disk is easy to miss if you only check application logs.
- Save Diagnostic Reports - Store troubleshooting reports with timestamps so you can compare the current state to previous incidents. Patterns in diagnostic history reveal recurring issues that need permanent fixes.
- Include Workarounds With Fixes - Always provide an immediate workaround alongside the proper fix. The workaround gets the service running while the team implements the real solution.
- Automate Recurring Diagnostics - If the same symptom appears more than twice, automate the diagnostic checks into a monitoring alert. Proactive detection is always faster than reactive troubleshooting.
Common Issues
- Diagnostic Check Hangs - A TCP check to an unresponsive host hangs indefinitely. Set strict timeouts on every check and treat a timeout as a diagnostic signal (the service is likely unreachable or overloaded).
- False Root Cause Identification - A warning (78% memory) is flagged as root cause when the real issue is a database outage. Prioritize FAIL results over WARN results and correlate multiple failures before concluding.
- Insufficient Permissions for Checks - The diagnostic tool cannot query system metrics or database stats due to missing permissions. Document required permissions and verify access before running deep diagnostics.
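The "Diagnostic Check Hangs" issue above can be avoided by bounding every probe with a hard timeout. A minimal sketch using the standard-library `socket` module; the host, port, and return format are illustrative assumptions, not part of the tool.

```python
# Sketch of a TCP connectivity probe with a strict timeout, so a dead or
# overloaded host produces a FAIL result instead of hanging the run.
# The (level, detail) return shape is an assumption for illustration.
import socket

def tcp_check(host: str, port: int, timeout_sec: float = 10.0) -> tuple[str, str]:
    """Probe host:port over TCP; report a timeout as a FAIL signal."""
    try:
        with socket.create_connection((host, port), timeout=timeout_sec):
            return ("PASS", f"{host}:{port} reachable")
    except socket.timeout:
        return ("FAIL", f"{host}:{port} timed out after {timeout_sec}s")
    except OSError as exc:
        return ("FAIL", f"{host}:{port} unreachable: {exc}")
```

Note that the timeout is returned as a FAIL with an explanatory detail rather than raised: an unreachable host is diagnostic data, not an error in the diagnostic itself.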