Codebase Explorer Guru
An agent template for development-tool workflows. It streamlines development with preconfigured patterns and best practices.
A codebase exploration specialist that rapidly builds a complete mental model of unfamiliar codebases and presents clear, actionable summaries through a structured six-phase discovery process.
When to Use This Agent
Choose Codebase Explorer Guru when:
- Onboarding to a new project and need to understand its structure quickly
- Evaluating an open-source library's architecture before adopting it
- Conducting a codebase audit for acquisition or technical due diligence
- Mapping dependencies, data flows, and integration points in an unfamiliar system
- Creating documentation for a codebase that has little or none
Consider alternatives when:
- Looking for a specific function or class (use grep/glob directly)
- Understanding one specific feature's implementation (use a code reader agent)
- Making changes to code (use a development agent after exploring)
Quick Start
    # .claude/agents/codebase-explorer-guru.yml
    name: Codebase Explorer Guru
    description: Explore and document unfamiliar codebases
    model: claude-sonnet
    tools:
      - Read
      - Glob
      - Grep
      - Bash
Example invocation:
claude "Explore the codebase in src/ and create a comprehensive map of the architecture, key modules, data flow, and integration points"
Core Concepts
Six-Phase Discovery Process
| Phase | Focus | Key Actions |
|---|---|---|
| 1. Project Discovery | What is this? | Read package.json, README, configs |
| 2. Architecture Mapping | How is it structured? | Scan directories, identify layers |
| 3. Dependency Analysis | What does it depend on? | Parse lockfiles, map internal imports |
| 4. Data Flow Tracing | How does data move? | Follow request paths, trace state |
| 5. Integration Points | What does it connect to? | Find API calls, DB connections, queues |
| 6. Quality Assessment | How healthy is it? | Check tests, types, linting, patterns |
Phase 1: Project Discovery
    # Files to read first (in priority order)
    1. package.json / pyproject.toml / go.mod    # Dependencies + scripts
    2. README.md                                 # Intent and setup
    3. .env.example / .env.template              # External dependencies
    4. docker-compose.yml                        # Service topology
    5. tsconfig.json / webpack.config.js         # Build configuration
    6. .github/workflows/*.yml                   # CI/CD pipeline
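The priority list above can be applied mechanically once the repository's file paths are known. The following is a minimal sketch, not part of the template itself: the function name `discovery_order` and the constant `PRIORITY_PATTERNS` are hypothetical, and the input is assumed to be a list of paths relative to the repository root.

```python
from fnmatch import fnmatchcase

# Patterns copied from the priority list above; earlier entries are read first.
PRIORITY_PATTERNS = [
    "package.json", "pyproject.toml", "go.mod",   # dependencies + scripts
    "README.md",                                  # intent and setup
    ".env.example", ".env.template",              # external dependencies
    "docker-compose.yml",                         # service topology
    "tsconfig.json", "webpack.config.js",         # build configuration
    ".github/workflows/*.yml",                    # CI/CD pipeline
]


def discovery_order(relative_paths):
    """Return the paths matching a priority pattern, highest priority first.

    Paths that match no pattern (ordinary source files) are left out, since
    Phase 1 only looks at manifests, docs, and configuration.
    """
    ordered = []
    for pattern in PRIORITY_PATTERNS:
        for path in sorted(relative_paths):
            if fnmatchcase(path, pattern) and path not in ordered:
                ordered.append(path)
    return ordered


if __name__ == "__main__":
    paths = ["src/index.ts", ".github/workflows/ci.yml", "README.md", "package.json"]
    for path in discovery_order(paths):
        print(path)
```

Only root-level manifests match here; a monorepo with nested `package.json` files would need broader patterns.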
Architecture Map Output
    ## Architecture Summary: E-Commerce Platform

    ### Tech Stack
    - Runtime: Node.js 20 + TypeScript 5.3
    - Framework: Next.js 14 (App Router)
    - Database: PostgreSQL 16 via Drizzle ORM
    - Cache: Redis 7 (sessions + rate limiting)
    - Queue: BullMQ (order processing, emails)
    - Auth: NextAuth.js v5 (Google, email)

    ### Directory Structure
    src/
      app/            → Next.js App Router pages and layouts
      components/     → React components (154 files)
        ui/           → Shared UI primitives (Button, Modal, etc.)
        features/     → Feature-specific components
      lib/            → Core utilities and configurations
        db/           → Drizzle schema, migrations, queries
        auth/         → Authentication configuration
        stripe/       → Payment integration
      server/         → Server-side logic
        actions/      → Server actions (mutations)
        api/          → API route handlers
      types/          → TypeScript type definitions

    ### Data Flow: Order Placement
    User → Checkout Page → createOrder (server action)
      → Validate cart items (DB query)
      → Create Stripe PaymentIntent
      → Insert order record (DB)
      → Enqueue order.created event (BullMQ)
        → Worker: send confirmation email
        → Worker: update inventory counts
Configuration
| Parameter | Description | Default |
|---|---|---|
| depth | Exploration depth (quick, standard, deep) | standard |
| focus | Specific area to explore (backend, frontend, infra) | All |
| output_format | Report format (markdown, json, diagram) | markdown |
| include_metrics | Include code metrics (LOC, complexity, coverage) | true |
| dependency_depth | How deep to trace dependencies | 2 |
| ignore_patterns | Directories/files to skip | ["node_modules", "dist"] |
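One way to picture these parameters is as a small typed settings object with the documented defaults. This is an illustrative sketch only: the class name `ExplorerConfig` and the validation logic are assumptions, and the source does not say how the template actually reads or validates these values. The `focus` default is written as the lowercase string `"all"` here for convenience.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values taken from the Description column of the table above.
DEPTHS = ("quick", "standard", "deep")
FOCUS_AREAS = ("all", "backend", "frontend", "infra")
OUTPUT_FORMATS = ("markdown", "json", "diagram")


@dataclass
class ExplorerConfig:
    """Hypothetical container for the documented parameters and defaults."""
    depth: str = "standard"
    focus: str = "all"
    output_format: str = "markdown"
    include_metrics: bool = True
    dependency_depth: int = 2
    ignore_patterns: List[str] = field(
        default_factory=lambda: ["node_modules", "dist"]
    )

    def __post_init__(self):
        # Reject values outside the enumerations listed in the table.
        if self.depth not in DEPTHS:
            raise ValueError(f"depth must be one of {DEPTHS}")
        if self.focus not in FOCUS_AREAS:
            raise ValueError(f"focus must be one of {FOCUS_AREAS}")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")


if __name__ == "__main__":
    print(ExplorerConfig(depth="deep", focus="backend"))
```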
Best Practices
- Follow the entry point chain to understand request flow. Start from the top-level entry point (main.ts, app.tsx, index.py) and trace how requests flow through middleware, routers, controllers, services, and data access layers. This reveals the actual architecture, which often differs from the architecture the directory structure implies. Document the flow as you trace it.
- Map the dependency graph before reading implementation details. Use import/require statements to build a module dependency graph. Identify high-fan-in modules (many modules depend on them, so they are core abstractions) and high-fan-out modules (they depend on many modules, so they are orchestrators). This tells you where to start reading and which modules are most critical to understand.
- Check test files to understand intended behavior. Test files often contain the clearest documentation of what a module does, what inputs it accepts, and what edge cases the developers considered. Read test files alongside source files, especially for complex business logic. The test descriptions act as a specification that the test run keeps honest: if the tests pass, the asserted behavior still holds.
- Look for patterns in the first three files of each directory. Most codebases have consistent internal patterns: how services are structured, how errors are handled, how data is validated. Read three representative files from a directory to extract the pattern, then skim the rest to verify consistency. This is far faster than reading every file and captures most of the same structural understanding.
- Document as you explore rather than exploring first and documenting later. Write findings into a structured document as soon as you discover them. This forces you to articulate your understanding, reveals gaps in your mental model, and produces a deliverable artifact as a natural byproduct of exploration. Waiting until exploration is complete often results in incomplete or disorganized documentation.
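The fan-in and fan-out idea from the dependency-graph practice can be sketched in a few lines. This assumes the import graph has already been extracted into a mapping from each module to the modules it imports; the function name `fan_metrics` and the edges in the example are hypothetical, with module names borrowed from the sample directory structure.

```python
from collections import Counter


def fan_metrics(imports):
    """Compute (fan_in, fan_out) for a module import graph.

    `imports` maps a module name to the modules it imports.
    fan_out[m] counts the distinct modules m depends on.
    fan_in[m] counts the distinct modules that depend on m.
    """
    fan_out = {module: len(set(deps)) for module, deps in imports.items()}
    fan_in = Counter()
    for deps in imports.values():
        for dep in set(deps):
            fan_in[dep] += 1
    return dict(fan_in), fan_out


if __name__ == "__main__":
    # Hypothetical edges for illustration only.
    graph = {
        "server/actions": ["lib/db", "lib/stripe", "lib/auth"],
        "server/api": ["lib/db", "lib/auth"],
        "lib/auth": ["lib/db"],
        "lib/db": [],
    }
    fan_in, fan_out = fan_metrics(graph)
    print("core abstraction:", max(fan_in, key=fan_in.get))
    print("orchestrator:", max(fan_out, key=fan_out.get))
```

In this toy graph, `lib/db` has the highest fan-in and `server/actions` the highest fan-out, which matches the "core abstraction" and "orchestrator" roles described above.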
Common Issues
Getting lost in large codebases with thousands of files. Start with the build configuration and entry points, not the file tree. A package.json main or scripts.start field tells you where execution begins. Trace from there. Use file modification dates to identify actively developed areas versus legacy code. Ignore test files and generated code during the initial structural scan — they add volume without architectural insight.
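As a rough illustration of starting from the manifest rather than the file tree, the sketch below pulls the `main` and `scripts.start` fields out of a package.json document. The function name `entry_points` is hypothetical; it only reports what the manifest declares and does not resolve the start command to a file.

```python
import json


def entry_points(package_json_text):
    """Return the declared entry points from a package.json document.

    The result may contain the keys "main" and "scripts.start"; a key is
    omitted when the manifest does not declare that field as a string.
    """
    manifest = json.loads(package_json_text)
    found = {}
    main = manifest.get("main")
    if isinstance(main, str):
        found["main"] = main
    scripts = manifest.get("scripts")
    if isinstance(scripts, dict) and isinstance(scripts.get("start"), str):
        found["scripts.start"] = scripts["start"]
    return found


if __name__ == "__main__":
    # Hypothetical manifest content for illustration.
    text = '{"main": "dist/index.js", "scripts": {"start": "node dist/server.js"}}'
    print(entry_points(text))
```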
Misidentifying the architecture pattern from directory names alone. A directory named controllers/ does not guarantee MVC architecture. A services/ directory might contain god classes that do everything. Verify patterns by reading the actual code. Check whether "controllers" truly only handle HTTP concerns and delegate to services, or whether they contain business logic. Report the architecture the code actually implements, not the one the directory names aspire to.
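A crude first pass at this check is to flag controller files that import a data-access module directly instead of going through a service. The sketch below is a heuristic under stated assumptions, not a substitute for reading the code: the function name, the `controllers/` path test, and the default data-layer prefixes are all hypothetical, and the import map is assumed to have been extracted beforehand.

```python
def controllers_bypassing_services(imports_by_file, data_layer_prefixes=("lib/db", "db/")):
    """Flag controller files that import data-access modules directly.

    `imports_by_file` maps a file path to the module paths it imports.
    Returns a mapping from each flagged controller to the data-layer
    modules it imports. An empty result does not prove clean layering.
    """
    prefixes = tuple(data_layer_prefixes)
    flagged = {}
    for path, imported in imports_by_file.items():
        if "controllers/" not in path:
            continue
        direct = sorted(m for m in imported if m.startswith(prefixes))
        if direct:
            flagged[path] = direct
    return flagged


if __name__ == "__main__":
    # Hypothetical import map for illustration.
    sample = {
        "src/controllers/orders.ts": ["services/orders", "lib/db/queries"],
        "src/controllers/users.ts": ["services/users"],
        "src/services/orders.ts": ["lib/db/queries"],
    }
    print(controllers_bypassing_services(sample))
```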
Missing hidden dependencies and side effects. Not all dependencies are visible in import statements. Dynamic requires, dependency injection containers, environment variable lookups, and runtime plugins create invisible connections. Search for process.env, require() with variable arguments, and DI container registrations. Check initialization code and middleware chains for implicit dependencies that the import graph does not reveal.
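The searches mentioned above can be approximated with two regular expressions over JavaScript or TypeScript source text. This is a text-level heuristic with hypothetical names (`scan_hidden_dependencies`, `ENV_LOOKUP`, `DYNAMIC_REQUIRE`); it does not parse the code, so matches inside comments or string literals will show up as false positives, and DI container registrations are not covered.

```python
import re

# process.env.NAME or process.env['NAME'] / process.env["NAME"]
ENV_LOOKUP = re.compile(
    r"process\.env\.([A-Za-z_][A-Za-z0-9_]*)"
    r"|process\.env\[\s*['\"]([^'\"]+)['\"]\s*\]"
)
# require( whose first argument character is not a plain quote,
# i.e. a variable, expression, or template literal.
DYNAMIC_REQUIRE = re.compile(r"\brequire\(\s*(?!['\"\s])")


def scan_hidden_dependencies(source):
    """Return environment variable names and line numbers of dynamic requires."""
    env_vars = sorted({dotted or bracketed
                       for dotted, bracketed in ENV_LOOKUP.findall(source)})
    dynamic_lines = [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if DYNAMIC_REQUIRE.search(line)
    ]
    return {"env_vars": env_vars, "dynamic_require_lines": dynamic_lines}


if __name__ == "__main__":
    sample = "\n".join([
        'const db = require("./db");',
        "const plugin = require(pluginPath);",
        "const url = process.env.DATABASE_URL;",
    ])
    print(scan_hidden_dependencies(sample))
```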