Comprehensive Systematic Debugging

Production-ready skill for handling test failures and unexpected behavior. Includes structured workflows, validation checks, and reusable patterns for development.

Skill | Cliptics | development | v1.0.0 | MIT

A methodical skill for diagnosing and resolving software bugs through structured root cause analysis. Covers hypothesis-driven debugging, binary search techniques, logging strategies, and common bug patterns across different technology stacks.

When to Use This Skill

Choose this skill when:

  • Facing a bug that resists quick fixes and needs systematic investigation
  • Debugging issues that span multiple components or services
  • Dealing with intermittent bugs that don't reproduce consistently
  • Analyzing production issues from logs, traces, and metrics
  • Teaching or establishing debugging processes for a team

Consider alternatives when:

  • Writing tests to prevent bugs → use a testing patterns skill
  • Optimizing performance → use a performance profiling skill
  • Debugging a specific framework → use that framework's skill
  • Setting up monitoring → use a DevOps/observability skill

Quick Start

    # Systematic Debugging Process

    ## Step 1: REPRODUCE
    - Can you trigger the bug reliably?
    - What are the exact steps to reproduce?
    - What environment (OS, browser, version)?
    - Does it happen in all environments?

    ## Step 2: CHARACTERIZE
    - When did it start? (git bisect)
    - What changed recently? (commits, configs, dependencies)
    - What are the exact error messages?
    - What's the expected vs actual behavior?

    ## Step 3: ISOLATE
    - Which component/layer is responsible?
    - Is it frontend, backend, database, or network?
    - Can you reproduce with minimal code?
    - Does it happen with hardcoded values?

    ## Step 4: HYPOTHESIZE
    - What are the possible causes?
    - Which is most likely given the evidence?
    - What experiment would confirm or refute?

    ## Step 5: FIX
    - Make the minimal change that fixes the issue
    - Verify the fix addresses the root cause
    - Add a regression test
    - Check for related bugs with the same root cause

    ## Step 6: PREVENT
    - Why wasn't this caught earlier?
    - What test/check would catch it?
    - Are there similar patterns in the codebase?

Core Concepts

Bug Pattern Recognition

| Symptom | Likely Cause | Investigation |
|---|---|---|
| Works locally, fails in prod | Env vars, secrets, configs | Diff environment configs |
| Intermittent failures | Race conditions, timing | Add correlation logging |
| Slow degradation over time | Memory leak, connection leak | Profile memory, check pool sizes |
| Works for some users | Data-dependent, permissions | Compare user data and roles |
| Fails after deploy | New code regression | git bisect to find breaking commit |
| UI renders but data is wrong | API/transform bug | Inspect network responses |

Binary Search Debugging

    // Git bisect to find the breaking commit
    // 1. Find a known good commit and the current bad commit
    // 2. Git bisect automates the binary search
    //
    //   git bisect start
    //   git bisect bad HEAD
    //   git bisect good v2.1.0
    //   # Git checks out a middle commit; test it, then mark it:
    //   git bisect good   # or: git bisect bad
    //   # Repeat until the first bad commit is found
    //   git bisect reset  # return to the original HEAD when done
    //
    // Automated bisect with a test script:
    //   git bisect run npm test

    // Code-level binary search for data-dependent bugs.
    // Suspect: one item in `data` makes `processData` throw.
    declare function processData(items: unknown[]): void; // the code under investigation

    function debugBinarySearch(data: unknown[]): unknown {
      let suspects = data;
      while (suspects.length > 1) {
        const mid = Math.floor(suspects.length / 2);
        const firstHalf = suspects.slice(0, mid);
        console.log('Testing first half:', firstHalf.length, 'items');
        try {
          processData(firstHalf);
          // No crash: the bad item is in the second half
          suspects = suspects.slice(mid);
        } catch {
          // Crash: the bad item is in the first half
          suspects = firstHalf;
        }
      }
      // The single bad item, assuming exactly one item triggers the crash
      return suspects[0];
    }
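One caveat with `git bisect run npm test`: a commit that fails to build is reported as bad even though the build failure may be unrelated to the bug. `git bisect run` treats exit code 0 as good, 125 as "cannot be tested, skip", and other codes from 1 to 127 as bad. The sketch below maps step results to those codes; `StepResult` and `bisectExitCode` are hypothetical names used for illustration only.

```typescript
// Outcome of one step (build or test) at the commit being bisected.
// Hypothetical shape, not part of any git or npm API.
interface StepResult {
  ok: boolean;
}

// Exit codes understood by `git bisect run`:
//   0       -> commit is good
//   125     -> commit cannot be tested, skip it
//   1..127  -> commit is bad (125 excluded)
function bisectExitCode(build: StepResult, test: StepResult): number {
  if (!build.ok) {
    return 125; // a broken build says nothing about the bug being hunted
  }
  return test.ok ? 0 : 1;
}

console.log(bisectExitCode({ ok: true }, { ok: false }));
```

In a real bisect script, the build and test steps would be run as child processes and the script would exit with the returned code.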

Structured Logging for Debugging

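    // Add contextual logging for debugging production issues.
    // Assumes an Express-style RequestHandler, a pino-style `logger` in scope,
    // and a Request type augmented with `log` and `user` properties.
    import { randomUUID } from 'node:crypto';
    import type { RequestHandler } from 'express';

    function debuggableRequest(handler: RequestHandler): RequestHandler {
      return async (req, res, next) => {
        const debugContext = {
          requestId: randomUUID(),
          userId: req.user?.id,
          method: req.method,
          path: req.path,
          timestamp: Date.now(),
        };

        // Every log line written through req.log carries the context above
        req.log = logger.child(debugContext);
        req.log.info('Request started');

        const start = performance.now();
        try {
          await handler(req, res, next);
          req.log.info({
            status: res.statusCode,
            duration: (performance.now() - start).toFixed(2), // milliseconds
          }, 'Request completed');
        } catch (err) {
          // `err` is `unknown` in TypeScript, so normalize before reading fields
          const error = err instanceof Error ? err : new Error(String(err));
          req.log.error({
            error: error.message,
            stack: error.stack,
            duration: (performance.now() - start).toFixed(2),
          }, 'Request failed');
          // Forward to the error middleware; rethrowing inside an async
          // handler can surface as an unhandled promise rejection
          next(err);
        }
      };
    }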
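When a request crosses several services, correlation only works if every service logs the same ID. A minimal sketch of that idea, assuming the upstream service sends its ID in a header such as `x-request-id`: reuse the incoming value when present and generate a new one otherwise. `resolveRequestId` and the `Headers` type are hypothetical helpers, not part of any framework.

```typescript
// Header map in the shape Node-style servers expose: lowercased names,
// values that may be absent or repeated. Hypothetical type for illustration.
type Headers = Record<string, string | string[] | undefined>;

function resolveRequestId(
  headers: Headers,
  headerName: string,
  generate: () => string,
): string {
  const raw = headers[headerName.toLowerCase()];
  // A repeated header arrives as an array; take the first value
  const value = Array.isArray(raw) ? raw[0] : raw;
  const trimmed = value ? value.trim() : '';
  // Fall back to a fresh ID when the header is missing or blank
  return trimmed !== '' ? trimmed : generate();
}

let counter = 0;
const nextId = () => `generated-${++counter}`;
console.log(resolveRequestId({ 'x-request-id': 'abc-123' }, 'X-Request-Id', nextId));
console.log(resolveRequestId({}, 'x-request-id', nextId));
```

Passing the generator in as a parameter keeps the helper independent of any particular ID scheme; in a Node service it could be `crypto.randomUUID`.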

Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| loggingLevel | string | 'debug' | Debugging log level: debug, trace, or verbose |
| bisectStrategy | string | 'git-bisect' | Bisect: git-bisect, code-bisect, or manual |
| reproductionTimeout | number | 300 | Max seconds to attempt reproduction |
| hypothesisLimit | number | 5 | Max hypotheses to test before reassessing |
| stackTraceDepth | number | 20 | Stack trace lines to capture |
| correlationIdHeader | string | 'x-request-id' | Header for request correlation |
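The parameters above could be expressed as a typed object with defaults. This is a sketch only: the source does not say how the configuration is loaded or consumed, so `DebugConfig`, `DEFAULT_DEBUG_CONFIG`, and `resolveDebugConfig` are hypothetical names.

```typescript
// Field names, allowed values, and defaults are taken from the table;
// the type and helper names are assumptions made for illustration.
interface DebugConfig {
  loggingLevel: 'debug' | 'trace' | 'verbose';
  bisectStrategy: 'git-bisect' | 'code-bisect' | 'manual';
  reproductionTimeout: number; // seconds
  hypothesisLimit: number;
  stackTraceDepth: number; // stack trace lines
  correlationIdHeader: string;
}

const DEFAULT_DEBUG_CONFIG: DebugConfig = {
  loggingLevel: 'debug',
  bisectStrategy: 'git-bisect',
  reproductionTimeout: 300,
  hypothesisLimit: 5,
  stackTraceDepth: 20,
  correlationIdHeader: 'x-request-id',
};

// Merge caller overrides on top of the defaults
function resolveDebugConfig(overrides: Partial<DebugConfig> = {}): DebugConfig {
  return { ...DEFAULT_DEBUG_CONFIG, ...overrides };
}

console.log(resolveDebugConfig({ hypothesisLimit: 3 }));
```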

Best Practices

  1. Never fix without understanding the root cause — A fix that makes the symptom disappear without explaining why it works will eventually fail. If you can't explain the bug to a colleague, you haven't found the root cause yet.

  2. Reproduce first, hypothesize second — Until you can reliably trigger the bug, any hypothesis is guessing. Write a test case that demonstrates the failure. If you can't reproduce it, add logging and wait for it to happen again.

  3. Change one thing at a time when testing hypotheses — If you change three things and the bug disappears, you don't know which change fixed it. Revert two of the three changes and verify the fix still works.

  4. Use binary search to narrow the problem space — Whether it's git bisect for commits, commenting out half the code, or testing with half the data, binary search is the fastest way to isolate the responsible component.

  5. Document the bug, root cause, and fix — Future developers (including yourself) will encounter similar bugs. Document what the symptoms were, why the bug existed, how you found it, and how you fixed it. This turns debugging into organizational knowledge.
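As an illustration of "reproduce first" and of adding a regression test after the fix, here is a minimal sketch built around an invented bug report: "the average of an empty list shows NaN". The `average` function, the choice of 0 as the expected result, and `regressionEmptyAverage` are all hypothetical.

```typescript
// Hypothetical function under investigation.
// Before the fix, an empty input computed 0 / 0, which is NaN.
function average(values: number[]): number {
  if (values.length === 0) {
    return 0; // the fix; treating "no data" as 0 is an assumed requirement
  }
  const sum = values.reduce((acc, v) => acc + v, 0);
  return sum / values.length;
}

// Regression check that encodes the exact reproduction from the bug report.
// It fails against the unfixed code and passes once the guard is in place.
function regressionEmptyAverage(): boolean {
  return average([]) === 0;
}

console.log(regressionEmptyAverage());
```

Writing the check before touching the code confirms that the reproduction is reliable, and keeping it afterwards guards against the same root cause returning.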

Common Issues

Bug disappears when adding debug logging — This is a classic Heisenbug caused by timing changes. The logging adds enough delay to avoid a race condition. Investigate thread synchronization, shared state access, and timing-dependent logic rather than relying on the logging "fix."
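The timing dependence behind such a bug can be sketched with a read-modify-write that spans an `await`. In the example below, `Counter`, `incrementUnsafe`, `incrementSafe`, and `demo` are hypothetical names; `await Promise.resolve()` stands in for any asynchronous I/O. The unsafe version loses updates whenever calls interleave, while the safe version serializes them through a promise chain, so the result no longer depends on timing.

```typescript
class Counter {
  value = 0;
  private queue: Promise<void> = Promise.resolve();

  // Unsafe: reads the value, yields, then writes based on the stale read
  async incrementUnsafe(): Promise<void> {
    const current = this.value;
    await Promise.resolve(); // stand-in for async I/O
    this.value = current + 1;
  }

  // Safe: each increment starts only after the previous one has finished
  incrementSafe(): Promise<void> {
    const run = this.queue.then(() => this.incrementUnsafe());
    // Keep the chain alive even if one increment rejects
    this.queue = run.catch(() => {});
    return run;
  }
}

async function demo(): Promise<{ unsafe: number; safe: number }> {
  const a = new Counter();
  await Promise.all([a.incrementUnsafe(), a.incrementUnsafe(), a.incrementUnsafe()]);

  const b = new Counter();
  await Promise.all([b.incrementSafe(), b.incrementSafe(), b.incrementSafe()]);

  return { unsafe: a.value, safe: b.value };
}

demo().then((result) => console.log(result));
```

Here all three unsafe calls read 0 before any of them writes, so two increments are lost. Adding a delay or a log statement in the wrong place can shift that interleaving and hide the problem without fixing it.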

Stack trace points to framework code, not application code — Read the stack trace bottom-up to find where your code enters the framework. The bug is usually in the last application frame before the framework call. Check what arguments you passed to the framework function.
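Locating that frame can be automated. The sketch below assumes a V8-style trace (as produced by Node.js), where the most recent call is listed first, so the last application frame before the framework call is the first application line when scanning from the top. It also assumes framework code lives under `node_modules` or in `node:` internals. `firstApplicationFrame` is a hypothetical helper and the sample trace is invented.

```typescript
function firstApplicationFrame(stack: string): string | undefined {
  return stack
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('at ')) // drop the error message line
    .find((line) => !line.includes('node_modules') && !line.includes('node:'));
}

// Invented trace for illustration: the error is thrown inside a library
const sampleStack = [
  'Error: invalid input',
  '    at validate (/app/node_modules/some-lib/index.js:10:11)',
  '    at parse (/app/node_modules/some-lib/index.js:55:3)',
  '    at loadUser (/app/src/users.ts:42:17)',
  '    at handler (/app/src/routes.ts:12:9)',
  '    at processTicksAndRejections (node:internal/process/task_queues:95:5)',
].join('\n');

console.log(firstApplicationFrame(sampleStack));
```

In this sample the helper returns the `loadUser` frame, which is where the arguments passed into the library should be inspected.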

Fix works in development but not in production — Environment differences cause this: different database data, stricter security headers, different Node.js versions, missing environment variables, or different file permissions. Compare environment configurations systematically.
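A systematic comparison can start with the environment variables themselves. The following sketch compares two key-value snapshots and reports only the names that differ, so secret values are never printed. `Env`, `EnvDiff`, and `diffEnv` are hypothetical names; how the snapshots are collected from each environment is left open.

```typescript
type Env = Record<string, string | undefined>;

interface EnvDiff {
  onlyInFirst: string[];
  onlyInSecond: string[];
  different: string[];
}

function diffEnv(first: Env, second: Env): EnvDiff {
  // Union of all variable names, sorted for stable output
  const keys = Array.from(
    new Set(Object.keys(first).concat(Object.keys(second))),
  ).sort();

  const diff: EnvDiff = { onlyInFirst: [], onlyInSecond: [], different: [] };
  for (const key of keys) {
    const a = first[key];
    const b = second[key];
    if (a !== undefined && b === undefined) {
      diff.onlyInFirst.push(key);
    } else if (a === undefined && b !== undefined) {
      diff.onlyInSecond.push(key);
    } else if (a !== b) {
      diff.different.push(key);
    }
  }
  return diff;
}

// Invented values for illustration
const development: Env = { API_URL: 'http://localhost:3000', LOG_LEVEL: 'debug', FEATURE_X: 'on' };
const production: Env = { API_URL: 'https://api.example.com', LOG_LEVEL: 'debug', SENTRY_DSN: 'set' };
console.log(diffEnv(development, production));
```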
