Specialist Debugger

A senior debugging specialist with expertise in diagnosing complex software issues through systematic analysis, breakpoint strategies, and root cause identification across all layers of the stack.

When to Use This Agent

Choose Specialist Debugger when:

Tracking down intermittent bugs that are hard to reproduce
Diagnosing performance regressions or memory leaks
Debugging distributed system failures across multiple services
Analyzing crash reports, stack traces, and core dumps
Investigating race conditions and concurrency bugs

Consider alternatives when:

Writing new tests for existing code (use a test engineering agent)
Fixing known issues with clear root causes (just fix it directly)
Optimizing already-working code for performance (use a performance agent)

Quick Start


# .claude/agents/specialist-debugger.yml
name: Specialist Debugger
description: Diagnose and resolve complex software bugs
model: claude-sonnet
tools:
  - Read
  - Bash
  - Glob
  - Grep
  - Edit

Example invocation:

claude "Debug why the checkout flow intermittently fails with a 'payment already processed' error — it happens roughly once per 100 transactions"

Core Concepts

Debugging Methodology

Phase	Action	Tools
1. Reproduce	Create reliable reproduction steps	Logs, test scripts
2. Isolate	Narrow the scope to specific components	Binary search, bisect
3. Inspect	Examine state at the failure point	Debugger, logging, traces
4. Hypothesize	Form a theory about the root cause	Code reading, analysis
5. Verify	Confirm the hypothesis with targeted tests	Unit tests, assertions
6. Fix	Apply the minimal correct fix	Code change + test
7. Prevent	Add guards against recurrence	Regression tests, monitoring

Systematic Isolation Technique


# Git bisect to find the commit that introduced a bug
git bisect start
git bisect bad                    # Current commit has the bug
git bisect good v2.3.0            # This version was working
# Git checks out a middle commit: test it
npm test -- --grep "checkout"
git bisect good                   # This commit works (use `git bisect bad` if the test fails)
# Repeat until Git reports the first bad commit
git bisect reset                  # Return to the commit you started from

# Binary search through code paths
# 1. Add logging at the entry and exit of each function in the flow
# 2. Identify where state diverges from expected
# 3. Narrow to the specific function
# 4. Add logging within that function
# 5. Identify the exact line where behavior diverges
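
The binary search over code paths described above can be sketched in TypeScript. This is a minimal illustration, not a prescribed tool: `Stage`, `runThrough`, `findFirstBadStage`, and the example stages are hypothetical names. It assumes the stages are deterministic, the input is valid, and the state stays invalid once it has gone wrong.

```typescript
// Hypothetical model of a call chain: each stage transforms the state.
type Stage<T> = { name: string; run: (input: T) => T };

// Apply only the first `count` stages, in order.
function runThrough<T>(stages: Stage<T>[], input: T, count: number): T {
  return stages.slice(0, count).reduce((state, stage) => stage.run(state), input);
}

// Binary search for the first stage whose output violates the invariant.
// Returns null when the final state is valid (nothing to find).
function findFirstBadStage<T>(
  stages: Stage<T>[],
  input: T,
  isValid: (state: T) => boolean,
): string | null {
  if (isValid(runThrough(stages, input, stages.length))) return null;
  let lo = 0;             // state after `lo` stages is known to be valid
  let hi = stages.length; // state after `hi` stages is known to be invalid
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (isValid(runThrough(stages, input, mid))) lo = mid;
    else hi = mid;
  }
  return stages[hi - 1].name;
}

// Example: the invariant is "the amount is never negative".
const exampleStages: Stage<number>[] = [
  { name: "parse", run: (n) => n + 1 },
  { name: "applyDiscount", run: (n) => n * 2 },
  { name: "addTax", run: (n) => n - 100 }, // deliberately broken stage
  { name: "round", run: (n) => n },
];

const culprit = findFirstBadStage(exampleStages, 1, (n) => n >= 0); // "addTax"
```

In a real investigation the `isValid` check is usually a log statement or assertion placed by hand; the sketch only shows why halving the range finds the culprit in a handful of probes.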

Common Bug Patterns


// Race condition: two concurrent requests process the same payment
// Bug: No idempotency check before processing
async function processPayment(orderId: string) {
  const order = await db.orders.findById(orderId);  // Both requests read
  if (order.status === 'pending') {                  // Both pass this check
    await stripe.charge(order.amount);               // Both charge
    await db.orders.update(orderId, { status: 'paid' }); // Both update
  }
}

// Fix: Use database-level locking so the status check and the update are atomic
// (this version replaces the buggy function above)
async function processPayment(orderId: string) {
  await db.transaction(async (tx) => {
    const order = await tx.orders.findById(orderId, { forUpdate: true });
    if (order.status !== 'pending') {
      throw new Error('Payment already processed');
    }
    await stripe.charge(order.amount);
    await tx.orders.update(orderId, { status: 'paid' });
  });
}
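
An alternative to holding a row lock is to claim the order with a single conditional update before charging. The sketch below simulates that idea with an in-memory map, so it runs without a database or payment provider. Everything in it (`orders`, `claimOrder`, `chargeCard`, `processPaymentOnce`, the `processing` status) is a hypothetical stand-in, not the API of any real library.

```typescript
// In-memory stand-in for the orders table. All names are hypothetical.
type Status = "pending" | "processing" | "paid";
const orders = new Map<string, Status>([["order-1", "pending"]]);
let chargeCount = 0;

// Simulated I/O latency.
const tick = () => new Promise<void>((resolve) => setTimeout(resolve, 1));

// Stands in for a conditional update such as:
//   UPDATE orders SET status = 'processing' WHERE id = ? AND status = 'pending'
// The check and the write run in one synchronous step, so two requests
// cannot both see 'pending'. Returns true only for the request that won.
async function claimOrder(orderId: string): Promise<boolean> {
  await tick();
  if (orders.get(orderId) !== "pending") return false;
  orders.set(orderId, "processing");
  return true;
}

// Stands in for the call to the payment provider.
async function chargeCard(): Promise<void> {
  await tick();
  chargeCount += 1;
}

async function processPaymentOnce(orderId: string): Promise<"charged" | "skipped"> {
  if (!(await claimOrder(orderId))) return "skipped"; // another request owns it
  await chargeCard();
  orders.set(orderId, "paid");
  return "charged";
}

// Two concurrent requests for the same order: only one should charge.
async function demo(): Promise<{ results: string[]; chargeCount: number }> {
  const results = await Promise.all([
    processPaymentOnce("order-1"),
    processPaymentOnce("order-1"),
  ]);
  return { results, chargeCount };
}
```

Calling `demo()` should report one `charged` result, one `skipped` result, and a single charge. A production version would also need to handle a charge that fails after the claim, for example by resetting the status or passing an idempotency key to the payment provider.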

Configuration

Parameter	Description	Default
`debug_strategy`	Primary approach (systematic, hypothesis-driven, trace)	`systematic`
`log_level`	Logging verbosity during investigation	`debug`
`trace_depth`	How many stack frames to analyze	`10`
`include_repro`	Generate reproduction scripts	`true`
`isolation_method`	Isolation technique (bisect, binary-search, diff)	`binary-search`
`fix_scope`	Fix approach (minimal, comprehensive, preventive)	`minimal`

Best Practices

Reproduce before investigating. A bug that cannot be reproduced cannot be verified as fixed. Invest time creating a reliable reproduction case — a test, a script, or documented steps. If the bug is intermittent, identify the conditions that make it more likely (load, timing, specific data). A good reproduction script is worth hours of code reading and guesswork.
Read the error message and stack trace carefully before diving into code. The stack trace often points directly to the problem. Start at the frame where the error was thrown (the top of the trace in JavaScript and Java, the bottom in Python), then follow the call chain back toward the entry point. Many debugging sessions are wasted searching for root causes that the stack trace already revealed.
Use binary search to isolate problems efficiently. When the failure could be in any of 20 functions in a call chain, do not read all 20. Add a log statement in the middle and check whether the state is correct at that point. If it is, the bug is in the later half; if not, it is in the earlier half. Four to five iterations narrow 20 functions down to one. The same idea drives git bisect when searching for the breaking commit.
Check the boundaries first: inputs, outputs, and state transitions. Most bugs occur at boundaries — between services, between layers, between functions. Check what data enters a function, what data leaves it, and whether any state transitions are invalid. Validate that the data contract between caller and callee matches. Boundary mismatches (null where non-null expected, wrong date format, missing fields) cause the majority of bugs.
Write a regression test before fixing the bug. Write a test that fails with the current code and passes with the fix. This prevents the same bug from recurring and documents the exact failure condition. The test also serves as verification that your fix is correct. If you cannot write a failing test, you may not fully understand the bug yet.
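
As an illustration of writing the regression test first, the sketch below pins a boundary bug (a missing field) with a check that fails against the buggy code and passes against the fix. The functions, the `LineItem` type, and the rule that a missing quantity means one unit are all assumptions made for this example. A real project would express the check in its own test framework.

```typescript
// Hypothetical bug report: order totals come out as NaN when an item has no quantity.
interface LineItem {
  price: number;
  quantity?: number;
}

// Buggy version: assumes quantity is always present.
function orderTotalBuggy(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity!, 0);
}

// Fixed version. Assumption for this sketch: a missing quantity means one unit.
function orderTotalFixed(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * (item.quantity ?? 1), 0);
}

// The regression test reproduces the exact failure condition from the report.
function regressionTest(orderTotal: (items: LineItem[]) => number): boolean {
  const total = orderTotal([{ price: 10, quantity: 2 }, { price: 5 }]);
  return total === 25;
}

const failsBeforeFix = !regressionTest(orderTotalBuggy); // true: the buggy total is NaN
const passesAfterFix = regressionTest(orderTotalFixed);  // true: 10 * 2 + 5 * 1 = 25
```

Seeing the test fail first is the useful part: it confirms the test exercises the reported condition and not something adjacent to it.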

Common Issues

Intermittent bugs disappear when debugging instrumentation is added. Adding log statements, breakpoints, or debugger attachment changes timing, memory layout, or execution order, causing race conditions and timing bugs to hide. Prefer less intrusive observation tools: distributed tracing (OpenTelemetry), production logging at debug level (enabled via feature flag), or record-and-replay debugging. These perturb the behavior being debugged far less than breakpoints or ad hoc log statements.

Fix addresses the symptom but not the root cause. Adding a null check prevents the crash but does not explain why the value was null in the first place. Adding a retry masks the underlying connection issue. Before implementing a fix, ask "why did this happen?" repeatedly until you reach a cause you can act on (the "five whys" technique). A symptom fix leads to related bugs surfacing later. A root cause fix prevents an entire class of similar issues.

Debugging sessions consume hours without progress. When you have been investigating for more than 30 minutes without narrowing the scope, step back and reassess your approach. Describe the problem out loud (rubber duck debugging). Switch from code reading to data inspection. Ask a colleague for a fresh perspective. Often, the assumption driving your investigation is wrong, and a fresh viewpoint identifies the flawed assumption quickly.
