Comprehensive Systematic Debugging
Production-ready skill for use when encountering a bug, test failure, or unexpected behavior. Includes structured workflows, validation checks, and reusable patterns for development.
A methodical skill for diagnosing and resolving software bugs through structured root cause analysis. Covers hypothesis-driven debugging, binary search techniques, logging strategies, and common bug patterns across different technology stacks.
When to Use This Skill
Choose this skill when:
- Facing a bug that resists quick fixes and needs systematic investigation
- Debugging issues that span multiple components or services
- Dealing with intermittent bugs that don't reproduce consistently
- Analyzing production issues from logs, traces, and metrics
- Teaching or establishing debugging processes for a team
Consider alternatives when:
- Writing tests to prevent bugs → use a testing patterns skill
- Optimizing performance → use a performance profiling skill
- Debugging a specific framework → use that framework's skill
- Setting up monitoring → use a DevOps/observability skill
Quick Start
    # Systematic Debugging Process

    ## Step 1: REPRODUCE
    - Can you trigger the bug reliably?
    - What are the exact steps to reproduce?
    - What environment (OS, browser, version)?
    - Does it happen in all environments?

    ## Step 2: CHARACTERIZE
    - When did it start? (git bisect)
    - What changed recently? (commits, configs, dependencies)
    - What are the exact error messages?
    - What's the expected vs actual behavior?

    ## Step 3: ISOLATE
    - Which component/layer is responsible?
    - Is it frontend, backend, database, or network?
    - Can you reproduce with minimal code?
    - Does it happen with hardcoded values?

    ## Step 4: HYPOTHESIZE
    - What are the possible causes?
    - Which is most likely given the evidence?
    - What experiment would confirm or refute?

    ## Step 5: FIX
    - Make the minimal change that fixes the issue
    - Verify the fix addresses the root cause
    - Add a regression test
    - Check for related bugs with the same root cause

    ## Step 6: PREVENT
    - Why wasn't this caught earlier?
    - What test/check would catch it?
    - Are there similar patterns in the codebase?
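Step 1 asks whether the bug triggers reliably. One way to answer that with a number rather than a feeling is to run the suspect operation repeatedly and count failures. The sketch below is only an illustration: `measureFailureRate`, `ReproductionReport`, and `flakyOperation` are hypothetical names, and the flaky operation is a deterministic stand-in for a real intermittent failure.

```typescript
// Hypothetical helper: run a suspect operation many times and record how often it fails.
interface ReproductionReport {
  runs: number;
  failures: number;
  failureRate: number; // failures / runs, between 0 and 1
  firstError: string | undefined; // message of the first failure observed
}

function measureFailureRate(attempt: () => void, runs: number): ReproductionReport {
  let failures = 0;
  let firstError: string | undefined;
  for (let i = 0; i < runs; i++) {
    try {
      attempt();
    } catch (err) {
      failures++;
      if (firstError === undefined) {
        firstError = err instanceof Error ? err.message : String(err);
      }
    }
  }
  return { runs, failures, failureRate: runs === 0 ? 0 : failures / runs, firstError };
}

// Stand-in for a flaky operation: fails on every 4th call (illustrative only).
let calls = 0;
function flakyOperation(): void {
  calls++;
  if (calls % 4 === 0) throw new Error(`failed on call ${calls}`);
}

const report = measureFailureRate(flakyOperation, 20);
console.log(report); // 5 failures out of 20 runs, failureRate 0.25
```

A measured failure rate also gives a baseline for Step 5: after a fix, the same loop should report zero failures over a run count large enough to have caught the bug before.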
Core Concepts
Bug Pattern Recognition
| Symptom | Likely Cause | Investigation |
|---|---|---|
| Works locally, fails in prod | Env vars, secrets, configs | Diff environment configs |
| Intermittent failures | Race conditions, timing | Add correlation logging |
| Slow degradation over time | Memory leak, connection leak | Profile memory, check pool sizes |
| Works for some users | Data-dependent, permissions | Compare user data and roles |
| Fails after deploy | New code regression | git bisect to find breaking commit |
| UI renders but data is wrong | API/transform bug | Inspect network responses |
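For the "works locally, fails in prod" row, diffing environment configs can be as simple as comparing two flat key/value maps. This is a minimal sketch under assumptions: `diffConfigs` and its types are hypothetical, the sample values are made up, and real configs are often nested or contain secrets that should be masked before printing.

```typescript
// Hypothetical helper: report keys whose values differ between two flat config maps.
type FlatConfig = Record<string, string | undefined>;

interface ConfigDifference {
  key: string;
  local: string | undefined;
  prod: string | undefined;
}

function diffConfigs(local: FlatConfig, prod: FlatConfig): ConfigDifference[] {
  // Union of keys from both sides, sorted so the report is stable.
  const keys = Array.from(new Set(Object.keys(local).concat(Object.keys(prod)))).sort();
  const differences: ConfigDifference[] = [];
  for (const key of keys) {
    if (local[key] !== prod[key]) {
      differences.push({ key, local: local[key], prod: prod[key] });
    }
  }
  return differences;
}

// Made-up values for illustration only.
const localEnv: FlatConfig = {
  NODE_ENV: "development",
  DB_POOL_SIZE: "10",
  API_URL: "http://localhost:3000",
};
const prodEnv: FlatConfig = {
  NODE_ENV: "production",
  DB_POOL_SIZE: "10",
  CACHE_TTL: "60",
};

const configDiff = diffConfigs(localEnv, prodEnv);
console.table(configDiff); // rows for API_URL, CACHE_TTL, NODE_ENV
```

Keys present on only one side show up with `undefined` on the other, which is often the interesting case: a variable that was never set in production.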
Binary Search Debugging
    // Git bisect to find the breaking commit
    // 1. Find a known good commit and the current bad commit
    // 2. Git bisect automates binary search
    //   git bisect start
    //   git bisect bad HEAD
    //   git bisect good v2.1.0
    //   # Git checks out the middle commit; test it
    //   git bisect good   # or git bisect bad
    //   # Repeat until the first bad commit is found
    // Automated bisect with test script:
    //   git bisect run npm test

    // Code-level binary search for intermittent bugs:
    function debugBinarySearch(data: any[]) {
      // Suspect: one item in `data` causes the crash
      const mid = Math.floor(data.length / 2);
      const firstHalf = data.slice(0, mid);
      const secondHalf = data.slice(mid);

      console.log('Testing first half:', firstHalf.length, 'items');
      processData(firstHalf); // crashes? → bug is in first half

      console.log('Testing second half:', secondHalf.length, 'items');
      processData(secondHalf); // crashes? → bug is in second half

      // Repeat with the crashing half until you find the single bad item
    }
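The "repeat with the crashing half" step can be automated. The sketch below assumes exactly one bad item and a deterministic failure, which real data does not always satisfy. `findFailingItem` and `crashes` are hypothetical names, and `JSON.parse` stands in for whatever processing actually crashes.

```typescript
// Hypothetical helper: binary search for the one item that makes `fails` return true.
// Assumes exactly one bad item and a deterministic failure.
function findFailingItem<T>(items: T[], fails: (subset: T[]) => boolean): T | undefined {
  if (items.length === 0 || !fails(items)) return undefined; // nothing to find
  let candidates = items;
  while (candidates.length > 1) {
    const mid = Math.floor(candidates.length / 2);
    const firstHalf = candidates.slice(0, mid);
    // If the first half fails, the bad item is there; otherwise it must be in the second half.
    candidates = fails(firstHalf) ? firstHalf : candidates.slice(mid);
  }
  return candidates[0];
}

// Stand-in for the crashing call: converts a thrown error into `true`.
function crashes(subset: string[]): boolean {
  try {
    subset.forEach((raw) => JSON.parse(raw));
    return false;
  } catch {
    return true;
  }
}

// Made-up records; the third one is truncated JSON.
const records = ['{"id":1}', '{"id":2}', '{"id":3', '{"id":4}', '{"id":5}'];
const culprit = findFailingItem(records, crashes);
console.log("Bad item:", culprit); // Bad item: {"id":3
```

Each round halves the candidate set, so a single bad record among a million takes about twenty checks rather than a million.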
Structured Logging for Debugging
    // Add contextual logging for debugging production issues
    function debuggableRequest(handler: RequestHandler): RequestHandler {
      return async (req, res, next) => {
        const debugContext = {
          requestId: crypto.randomUUID(),
          userId: req.user?.id,
          method: req.method,
          path: req.path,
          timestamp: Date.now(),
        };
        // Every log line written through req.log carries the request context
        req.log = logger.child(debugContext);
        req.log.info('Request started');

        const start = performance.now();
        try {
          await handler(req, res, next);
          req.log.info({
            status: res.statusCode,
            duration: (performance.now() - start).toFixed(2),
          }, 'Request completed');
        } catch (err) {
          // `err` is `unknown` under strict TypeScript, so narrow it before reading fields
          const error = err instanceof Error ? err : new Error(String(err));
          req.log.error({
            error: error.message,
            stack: error.stack,
            duration: (performance.now() - start).toFixed(2),
          }, 'Request failed');
          throw err;
        }
      };
    }
Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| loggingLevel | string | 'debug' | Debugging log level: debug, trace, or verbose |
| bisectStrategy | string | 'git-bisect' | Bisect: git-bisect, code-bisect, or manual |
| reproductionTimeout | number | 300 | Max seconds to attempt reproduction |
| hypothesisLimit | number | 5 | Max hypotheses to test before reassessing |
| stackTraceDepth | number | 20 | Stack trace lines to capture |
| correlationIdHeader | string | 'x-request-id' | Header for request correlation |
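As a typed view of the table above, the parameters could be modeled as follows. This is a sketch under assumptions: the source does not say how the skill reads its configuration, so `DebugConfig`, `DEFAULT_CONFIG`, and `resolveConfig` are hypothetical names, and the positive-integer check is an illustrative validation rather than documented behavior.

```typescript
// Hypothetical shape for the parameters in the table; defaults are copied from it.
interface DebugConfig {
  loggingLevel: "debug" | "trace" | "verbose";
  bisectStrategy: "git-bisect" | "code-bisect" | "manual";
  reproductionTimeout: number; // seconds
  hypothesisLimit: number;
  stackTraceDepth: number; // lines
  correlationIdHeader: string;
}

const DEFAULT_CONFIG: DebugConfig = {
  loggingLevel: "debug",
  bisectStrategy: "git-bisect",
  reproductionTimeout: 300,
  hypothesisLimit: 5,
  stackTraceDepth: 20,
  correlationIdHeader: "x-request-id",
};

// Merge overrides onto the defaults and reject nonsensical numeric values.
function resolveConfig(overrides: Partial<DebugConfig> = {}): DebugConfig {
  const config: DebugConfig = { ...DEFAULT_CONFIG, ...overrides };
  const numericKeys = ["reproductionTimeout", "hypothesisLimit", "stackTraceDepth"] as const;
  for (const key of numericKeys) {
    const value = config[key];
    if (!Number.isInteger(value) || value <= 0) {
      throw new RangeError(`${key} must be a positive integer, got ${value}`);
    }
  }
  return config;
}

const tightened = resolveConfig({ hypothesisLimit: 3, bisectStrategy: "code-bisect" });
console.log(tightened);
```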
Best Practices
- Never fix without understanding the root cause: A fix that makes the symptom disappear without explaining why it works will eventually fail. If you can't explain the bug to a colleague, you haven't found the root cause yet.
- Reproduce first, hypothesize second: Until you can reliably trigger the bug, any hypothesis is guessing. Write a test case that demonstrates the failure. If you can't reproduce it, add logging and wait for it to happen again.
- Change one thing at a time when testing hypotheses: If you change three things and the bug disappears, you don't know which change fixed it. Revert the changes one at a time, re-testing after each, until you know which one is responsible.
- Use binary search to narrow the problem space: Whether it's git bisect for commits, commenting out half the code, or testing with half the data, binary search is the fastest way to isolate the responsible component.
- Document the bug, root cause, and fix: Future developers (including yourself) will encounter similar bugs. Document what the symptoms were, why the bug existed, how you found it, and how you fixed it. This turns debugging into organizational knowledge.
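The "change one thing at a time" practice can be sketched as a loop that applies each candidate change to an untouched baseline and records whether the bug still reproduces. Everything here is hypothetical: `testChangesInIsolation`, the `Settings` shape, and the pool-size scenario are made up for illustration.

```typescript
// Hypothetical sketch: apply each candidate change alone and record whether it removes the bug.
interface CandidateChange<S> {
  name: string;
  apply: (baseline: S) => S; // returns a modified copy; the baseline is left untouched
}

function testChangesInIsolation<S>(
  baseline: S,
  changes: CandidateChange<S>[],
  bugReproduces: (state: S) => boolean,
): Record<string, boolean> {
  const fixedBy: Record<string, boolean> = {};
  for (const change of changes) {
    // true means this change alone makes the bug disappear
    fixedBy[change.name] = !bugReproduces(change.apply(baseline));
  }
  return fixedBy;
}

// Illustrative scenario: the bug appears when the pool is smaller than the concurrency.
interface Settings {
  poolSize: number;
  concurrency: number;
  retries: number;
}
const baselineSettings: Settings = { poolSize: 5, concurrency: 8, retries: 0 };

const isolationResults = testChangesInIsolation<Settings>(
  baselineSettings,
  [
    { name: "raise poolSize to 10", apply: (s) => ({ ...s, poolSize: 10 }) },
    { name: "add 3 retries", apply: (s) => ({ ...s, retries: 3 }) },
    { name: "lower concurrency to 6", apply: (s) => ({ ...s, concurrency: 6 }) },
  ],
  (s) => s.poolSize < s.concurrency,
);
console.log(isolationResults);
```

Because every change starts from the same baseline, the result attributes the fix to a single change instead of to the combination.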
Common Issues
Bug disappears when adding debug logging: This is a classic Heisenbug caused by timing changes. The logging call adds enough delay to mask a race condition without removing it. Investigate thread synchronization, shared state access, and timing-dependent logic rather than relying on the logging "fix."
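A small model of this effect: two async tasks each read a shared counter, wait on simulated I/O, and write back. When they start together, one update is lost. When the second task starts later, the race window is missed and the bug seems to vanish. This is an illustrative sketch only; `runIncrements` and `sleep` are hypothetical names, and the extra start delay merely stands in for time spent in added logging.

```typescript
// Illustrative lost-update race: two tasks read shared state, yield, then write back.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runIncrements(extraDelayMs: number): Promise<number> {
  let counter = 0; // shared state

  async function increment(startDelayMs: number): Promise<void> {
    await sleep(startDelayMs);
    const current = counter; // read
    await sleep(5); // simulated I/O between read and write
    counter = current + 1; // write based on a possibly stale read
  }

  // The second task starts `extraDelayMs` later, standing in for time spent in added logging.
  await Promise.all([increment(0), increment(extraDelayMs)]);
  return counter;
}

runIncrements(0).then((n) => console.log("no extra delay:", n)); // 1: one update is lost
runIncrements(50).then((n) => console.log("50 ms extra delay:", n)); // 2: the race is masked, not fixed
```

The delayed run returns the right answer only because of timing. A real fix removes the read-then-write gap, for example by serializing access to the shared value.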
Stack trace points to framework code, not application code: Scan the trace from the frame that threw toward its callers (top-down in Node.js, bottom-up in Python, which prints the most recent call last) until you reach the first frame in your own code. The bug is usually in that last application frame before the framework call. Check what arguments you passed to the framework function.
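Finding that frame can be automated for Node.js-style traces. The sketch below assumes V8 formatting (most recent call first, frames prefixed with `at `) and that dependency code lives under `node_modules`. `firstApplicationFrame` is a hypothetical helper and the sample trace is made up.

```typescript
// Hypothetical helper: return the first stack frame that is not framework or runtime code.
// Assumes Node.js/V8-style traces and dependencies installed under node_modules.
function firstApplicationFrame(stack: string): string | undefined {
  return stack
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("at ")) // drop the error message line
    .find((line) => !line.includes("node_modules") && !line.includes("node:internal"));
}

// Made-up trace for illustration.
const sampleStack = [
  "TypeError: Cannot read properties of undefined (reading 'id')",
  "    at serialize (/app/node_modules/some-framework/lib/response.js:42:17)",
  "    at send (/app/node_modules/some-framework/lib/response.js:88:5)",
  "    at getUser (/app/src/handlers/user.ts:27:9)",
  "    at handle (/app/node_modules/some-framework/lib/router.js:120:3)",
  "    at processTicksAndRejections (node:internal/process/task_queues:95:5)",
].join("\n");

const appFrame = firstApplicationFrame(sampleStack);
console.log(appFrame); // at getUser (/app/src/handlers/user.ts:27:9)
```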
Fix works in development but not in production: Environment differences are the usual cause, such as different database data, stricter security headers, different Node.js versions, missing environment variables, or different file permissions. Compare environment configurations systematically.