Codebase Explorer Guru

An agent template for development-tool workflows. Streamlines development with pre-configured patterns and best practices.

Agent | Cliptics | development tools | v1.0.0 | MIT

A codebase exploration specialist that rapidly builds a complete mental model of unfamiliar codebases and presents clear, actionable summaries through a structured six-phase discovery process.

When to Use This Agent

Choose Codebase Explorer Guru when:

  • Onboarding to a new project and need to understand its structure quickly
  • Evaluating an open-source library's architecture before adopting it
  • Conducting a codebase audit for acquisition or technical due diligence
  • Mapping dependencies, data flows, and integration points in an unfamiliar system
  • Creating documentation for a codebase that has little or none

Consider alternatives when:

  • Looking for a specific function or class (use grep/glob directly)
  • Understanding one specific feature's implementation (use a code reader agent)
  • Making changes to code (use a development agent after exploring)

Quick Start

# .claude/agents/codebase-explorer-guru.yml
name: Codebase Explorer Guru
description: Explore and document unfamiliar codebases
model: claude-sonnet
tools:
  - Read
  - Glob
  - Grep
  - Bash

Example invocation:

claude "Explore the codebase in src/ and create a comprehensive map of the architecture, key modules, data flow, and integration points"

Core Concepts

Six-Phase Discovery Process

| Phase | Focus | Key Actions |
|---|---|---|
| 1. Project Discovery | What is this? | Read package.json, README, configs |
| 2. Architecture Mapping | How is it structured? | Scan directories, identify layers |
| 3. Dependency Analysis | What does it depend on? | Parse lockfiles, map internal imports |
| 4. Data Flow Tracing | How does data move? | Follow request paths, trace state |
| 5. Integration Points | What does it connect to? | Find API calls, DB connections, queues |
| 6. Quality Assessment | How healthy is it? | Check tests, types, linting, patterns |

Phase 1: Project Discovery

# Files to read first (in priority order)
1. package.json / pyproject.toml / go.mod    # Dependencies + scripts
2. README.md                                 # Intent and setup
3. .env.example / .env.template              # External dependencies
4. docker-compose.yml                        # Service topology
5. tsconfig.json / webpack.config.js         # Build configuration
6. .github/workflows/*.yml                   # CI/CD pipeline
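The priority list above can be turned into a small helper that reports which of those files actually exist in a repository. This is only a sketch: the function name `discovery_reading_list` and the `PRIORITY_FILES` structure are hypothetical and not part of the template.

```python
from pathlib import Path

# Priority order copied from the list above. Each entry groups the
# alternative file names (or glob patterns) for one discovery step.
PRIORITY_FILES = [
    ("Dependencies + scripts", ["package.json", "pyproject.toml", "go.mod"]),
    ("Intent and setup", ["README.md"]),
    ("External dependencies", [".env.example", ".env.template"]),
    ("Service topology", ["docker-compose.yml"]),
    ("Build configuration", ["tsconfig.json", "webpack.config.js"]),
    ("CI/CD pipeline", [".github/workflows/*.yml"]),
]


def discovery_reading_list(root: str) -> list[tuple[str, str]]:
    """Return (purpose, relative path) pairs for existing files, in priority order."""
    base = Path(root)
    found = []
    for purpose, patterns in PRIORITY_FILES:
        for pattern in patterns:
            # Path.glob accepts literal names as well as wildcard patterns.
            for match in sorted(base.glob(pattern)):
                if match.is_file():
                    found.append((purpose, match.relative_to(base).as_posix()))
    return found
```

Calling `discovery_reading_list(".")` from a project root gives an ordered reading list; files that are absent are simply skipped.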

Architecture Map Output

## Architecture Summary: E-Commerce Platform

### Tech Stack
- Runtime: Node.js 20 + TypeScript 5.3
- Framework: Next.js 14 (App Router)
- Database: PostgreSQL 16 via Drizzle ORM
- Cache: Redis 7 (sessions + rate limiting)
- Queue: BullMQ (order processing, emails)
- Auth: NextAuth.js v5 (Google, email)

### Directory Structure
src/
  app/           → Next.js App Router pages and layouts
  components/    → React components (154 files)
    ui/          → Shared UI primitives (Button, Modal, etc.)
    features/    → Feature-specific components
  lib/           → Core utilities and configurations
    db/          → Drizzle schema, migrations, queries
    auth/        → Authentication configuration
    stripe/      → Payment integration
  server/        → Server-side logic
    actions/     → Server actions (mutations)
    api/         → API route handlers
  types/         → TypeScript type definitions

### Data Flow: Order Placement
User → Checkout Page → createOrder (server action)
  → Validate cart items (DB query)
  → Create Stripe PaymentIntent
  → Insert order record (DB)
  → Enqueue order.created event (BullMQ)
    → Worker: send confirmation email
    → Worker: update inventory counts

Configuration

| Parameter | Description | Default |
|---|---|---|
| depth | Exploration depth (quick, standard, deep) | standard |
| focus | Specific area to explore (backend, frontend, infra) | All |
| output_format | Report format (markdown, json, diagram) | markdown |
| include_metrics | Include code metrics (LOC, complexity, coverage) | true |
| dependency_depth | How deep to trace dependencies | 2 |
| ignore_patterns | Directories/files to skip | ["node_modules", "dist"] |
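The template does not say how these parameters are passed to the agent, so the following is only an illustration of the defaults and allowed values from the table. The `ExplorerConfig` class is hypothetical, and the rule that `dependency_depth` must be at least 1 is an assumption rather than something the template states.

```python
from dataclasses import dataclass, field

# Allowed values taken from the parameter table above.
ALLOWED_DEPTHS = {"quick", "standard", "deep"}
ALLOWED_FORMATS = {"markdown", "json", "diagram"}


@dataclass
class ExplorerConfig:
    """Hypothetical container mirroring the table's defaults."""

    depth: str = "standard"
    focus: str = "All"  # or "backend", "frontend", "infra"
    output_format: str = "markdown"
    include_metrics: bool = True
    dependency_depth: int = 2
    ignore_patterns: list[str] = field(
        default_factory=lambda: ["node_modules", "dist"]
    )

    def __post_init__(self) -> None:
        if self.depth not in ALLOWED_DEPTHS:
            raise ValueError(f"depth must be one of {sorted(ALLOWED_DEPTHS)}")
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(
                f"output_format must be one of {sorted(ALLOWED_FORMATS)}"
            )
        # Assumption: tracing zero levels of dependencies is not meaningful.
        if self.dependency_depth < 1:
            raise ValueError("dependency_depth must be at least 1")
```

Writing the defaults down this way makes it easy to spot an invalid combination, such as `ExplorerConfig(depth="exhaustive")`, before an exploration run starts.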

Best Practices

  1. Follow the entry point chain to understand request flow. Start from the top-level entry point (main.ts, app.tsx, index.py) and trace how requests flow through middleware, routers, controllers, services, and data access layers. This reveals the actual architecture, which often differs from the directory structure's implied architecture. Document the flow as you trace it.

  2. Map the dependency graph before reading implementation details. Use import/require statements to build a module dependency graph. Identify high-fan-in modules (many things depend on them — they are core abstractions) and high-fan-out modules (they depend on many things — they are orchestrators). This tells you where to start reading and which modules are most critical to understand.

  3. Check test files to understand intended behavior. Test files often contain the clearest documentation of what a module does, what inputs it accepts, and what edge cases the developers considered. Read test files alongside source files, especially for complex business logic. The test descriptions serve as a specification that stays current as long as the tests pass and their assertions actually check what the descriptions claim.

  4. Look for patterns in three representative files from each directory. Most codebases have consistent internal patterns: how services are structured, how errors are handled, how data is validated. Read three representative files from a directory to extract the pattern, then skim the rest to verify consistency. This is far faster than reading every file and usually captures the same structural understanding.

  5. Document as you explore rather than exploring first and documenting later. Write findings immediately into a structured document as you discover them. This forces you to articulate your understanding, reveals gaps in your mental model, and produces a deliverable artifact as a natural byproduct of exploration. Waiting until after exploration is complete often results in incomplete or disorganized documentation.
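The fan-in/fan-out analysis from practice 2 can be sketched as follows. This is a minimal illustration, not the agent's implementation: it assumes JavaScript/TypeScript sources supplied as an in-memory mapping of path to text, only follows relative imports written as string literals, and uses a simplified suffix list in place of real module resolution. All function names are hypothetical.

```python
import posixpath
import re
from typing import Optional

# Matches `from "x"`, bare `import "x"`, and `require("x")` with a string literal.
IMPORT_RE = re.compile(
    r"""(?:\bfrom\s+|\bimport\s+|\brequire\(\s*)["']([^"']+)["']"""
)

# Simplified stand-in for real module resolution (an assumption).
CANDIDATE_SUFFIXES = ("", ".ts", ".tsx", ".js", "/index.ts", "/index.js")


def resolve(importer: str, specifier: str, known: set[str]) -> Optional[str]:
    """Map a relative import specifier to a known module path, if any."""
    if not specifier.startswith("."):
        return None  # external package, not part of the internal graph
    base = posixpath.normpath(
        posixpath.join(posixpath.dirname(importer), specifier)
    )
    for suffix in CANDIDATE_SUFFIXES:
        if base + suffix in known:
            return base + suffix
    return None


def build_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Return module -> set of internal modules it imports."""
    known = set(sources)
    graph: dict[str, set[str]] = {module: set() for module in sources}
    for module, text in sources.items():
        for specifier in IMPORT_RE.findall(text):
            target = resolve(module, specifier, known)
            if target is not None and target != module:
                graph[module].add(target)
    return graph


def fan_metrics(graph: dict[str, set[str]]) -> dict[str, tuple[int, int]]:
    """Return module -> (fan_in, fan_out)."""
    fan_in = {module: 0 for module in graph}
    for targets in graph.values():
        for target in targets:
            fan_in[target] += 1
    return {module: (fan_in[module], len(graph[module])) for module in graph}
```

Sorting the result by fan-in surfaces likely core abstractions; sorting by fan-out surfaces likely orchestrators. Dynamic imports and path aliases are not handled here and would need a real resolver.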

Common Issues

Getting lost in large codebases with thousands of files. Start with the build configuration and entry points, not the file tree. A package.json main or scripts.start field tells you where execution begins. Trace from there. Use recent commit history (for example, git log on a directory) to identify actively developed areas versus legacy code; filesystem modification dates are unreliable for this because they are reset when a repository is cloned. Skip test files and generated code during the initial structural scan, since they add volume without architectural insight, and return to the tests once the structure is clear.
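As a small illustration of starting from the entry points, the sketch below pulls the `main` and `scripts.start` fields out of a package.json document. The function name `entry_points` and the shape of its return value are hypothetical.

```python
import json


def entry_points(package_json_text: str) -> dict[str, str]:
    """Extract the `main` and `scripts.start` fields, when present."""
    manifest = json.loads(package_json_text)
    found: dict[str, str] = {}

    main = manifest.get("main")
    if isinstance(main, str):
        found["main"] = main

    # `scripts` may be missing or null in minimal manifests.
    scripts = manifest.get("scripts") or {}
    start = scripts.get("start")
    if isinstance(start, str):
        found["scripts.start"] = start

    return found
```

The returned paths and commands are the places to begin tracing; a manifest with neither field returns an empty mapping, which is itself a hint to look at framework conventions or other build configuration instead.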

Misidentifying the architecture pattern from directory names alone. A directory named controllers/ does not guarantee MVC architecture. A services/ directory might contain god classes that do everything. Verify patterns by reading the actual code. Check whether "controllers" truly only handle HTTP concerns and delegate to services, or whether they contain business logic. Report the architecture as it actually is, not the one the directory names aspire to.

Missing hidden dependencies and side effects. Not all dependencies are visible in import statements. Dynamic requires, dependency injection containers, environment variable lookups, and runtime plugins create invisible connections. Search for process.env, require() with variable arguments, and DI container registrations. Check initialization code and middleware chains for implicit dependencies that the import graph does not reveal.
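Two of the searches described above, `process.env` lookups and `require()` calls with non-literal arguments, can be approximated with regular expressions. This is a rough sketch under the assumption that the input is JavaScript or TypeScript source text; it does not cover DI container registrations, and the function name `scan_hidden_dependencies` is hypothetical.

```python
import re

# process.env.NAME or process.env["NAME"] / process.env['NAME']
ENV_RE = re.compile(
    r"""process\.env(?:\.([A-Za-z_]\w*)|\[\s*["']([^"']+)["']\s*\])"""
)

# require(...) whose argument does not start with a plain string quote,
# i.e. a variable, expression, or template literal.
DYNAMIC_REQUIRE_RE = re.compile(r"""\brequire\(\s*([^"'\s)][^)]*)\)""")


def scan_hidden_dependencies(source: str) -> dict[str, list[str]]:
    """Report environment variables read and dynamic require() arguments."""
    env_vars = sorted(
        {dot or bracket for dot, bracket in ENV_RE.findall(source)}
    )
    dynamic_requires = [
        match.group(1).strip()
        for match in DYNAMIC_REQUIRE_RE.finditer(source)
    ]
    return {"env_vars": env_vars, "dynamic_requires": dynamic_requires}
```

Regex matching will produce false positives in comments and strings, so treat the output as a list of places to read rather than a definitive dependency inventory.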
