PostgreSQL DBA Strategist
An expert agent for PostgreSQL database administration covering query optimization, index design, replication, partitioning, vacuum tuning, and production troubleshooting for high-performance PostgreSQL deployments.
When to Use This Agent
Choose PostgreSQL DBA Strategist when:
- Optimizing PostgreSQL query performance with EXPLAIN ANALYZE
- Designing partitioning strategies for large tables
- Configuring streaming replication and failover
- Tuning autovacuum for high-write workloads
- Troubleshooting connection issues, bloat, and lock contention
Consider alternatives when:
- Working with SQL Server or MySQL (use database-specific agents)
- Using Neon serverless PostgreSQL specifically (use a Neon agent)
- Designing application data models without DBA concerns (use a backend dev agent)
Quick Start
```yaml
# .claude/agents/postgresql-dba-strategist.yml
name: PostgreSQL DBA
model: claude-sonnet-4-20250514
tools:
  - Read
  - Write
  - Bash
  - Glob
  - Grep
prompt: |
  You are a PostgreSQL DBA expert. Optimize queries, design indexes,
  configure replication, tune autovacuum, and troubleshoot production
  issues. Always use EXPLAIN ANALYZE for diagnosis and verify changes
  don't degrade other queries.
```
Example invocation:
```bash
claude --agent postgresql-dba-strategist "Our orders table has 500M rows and sequential scans are killing performance. Design a partitioning strategy by order_date and migrate the existing data with minimal downtime."
```
Core Concepts
Query Optimization Workflow
```sql
-- Step 1: Capture the execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE created_at > '2024-01-01' AND status = 'pending';

-- Key metrics to examine:
--   Planning Time vs Execution Time
--   Seq Scan vs Index Scan (look for unexpected Seq Scans)
--   Rows estimated vs actual (large gaps = stale statistics)
--   Buffers shared hit vs read (cache effectiveness)
```
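If the plan shows an unexpected sequential scan for a filter like this one, a partial index matched to the predicate is one possible fix. A minimal sketch, assuming the `orders` table and columns from the example (verify with EXPLAIN before and after, since new indexes add write overhead):

```sql
-- Partial index serving the range filter, restricted to pending orders.
-- CONCURRENTLY avoids taking an exclusive lock on a production table.
CREATE INDEX CONCURRENTLY idx_orders_pending_created
    ON orders (created_at)
    WHERE status = 'pending';
```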
Index Design Guidelines
| Query Pattern | Index Type | Example |
|---|---|---|
| Equality filter | B-tree | `CREATE INDEX ON orders (status)` |
| Range filter | B-tree (column order matters) | `CREATE INDEX ON orders (created_at)` |
| Pattern matching | GIN trigram | `CREATE INDEX ON users USING GIN (name gin_trgm_ops)` |
| JSON queries | GIN | `CREATE INDEX ON events USING GIN (data jsonb_path_ops)` |
| Full-text search | GIN tsvector | `CREATE INDEX ON docs USING GIN (to_tsvector('english', body)) ` |
| Geospatial | GiST | `CREATE INDEX ON places USING GIST (location)` |
| Partial data | B-tree partial | `CREATE INDEX ON orders (id) WHERE status = 'pending'` |
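Some of these index types depend on extensions that must be enabled first. A minimal sketch for the trigram case (table and column names are taken from the example row above):

```sql
-- pg_trgm must be installed before gin_trgm_ops indexes can be created
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX idx_users_name_trgm
    ON users USING GIN (name gin_trgm_ops);

-- Supports fuzzy/pattern queries such as:
-- SELECT * FROM users WHERE name ILIKE '%smith%';
```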
Partitioning Strategy
```sql
-- Range partitioning by date
CREATE TABLE orders (
    id bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    customer_id bigint,
    total numeric(10,2)
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Automate future partition creation
```
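Partition creation can be automated from a scheduled job. A sketch using a DO block, following the table and naming scheme above (pg_cron or the pg_partman extension are common alternatives that handle this for you):

```sql
-- Create next quarter's partition ahead of time; safe to re-run.
DO $$
DECLARE
    start_date date := date_trunc('quarter', now() + interval '3 months');
    end_date   date := start_date + interval '3 months';
    part_name  text := format('orders_%s_q%s',
                              extract(year from start_date),
                              extract(quarter from start_date));
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF orders
         FOR VALUES FROM (%L) TO (%L)',
        part_name, start_date, end_date);
END $$;
```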
Configuration
| Parameter | Description | Recommended starting point |
|---|---|---|
| `shared_buffers` | Shared memory for caching | 25% of RAM |
| `effective_cache_size` | OS cache estimate for the planner | 75% of RAM |
| `work_mem` | Memory per sort/hash operation | 64MB |
| `maintenance_work_mem` | Memory for VACUUM, CREATE INDEX | 512MB |
| `wal_level` | Write-ahead log detail | replica |
| `max_connections` | Maximum client connections | 200 |
| `autovacuum_naptime` | Autovacuum check interval | 30s |
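These parameters can be applied without editing postgresql.conf by hand. An illustrative sketch for a server with 16 GB of RAM (the absolute values are assumptions; size them to your own hardware):

```sql
ALTER SYSTEM SET shared_buffers = '4GB';          -- ~25% of RAM
ALTER SYSTEM SET effective_cache_size = '12GB';   -- ~75% of RAM
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';

-- shared_buffers requires a server restart; the others take effect with:
SELECT pg_reload_conf();
```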
Best Practices
- Always use EXPLAIN (ANALYZE, BUFFERS) for diagnosis; never guess. The execution plan reveals exactly what PostgreSQL does: which indexes it uses, how many rows it expects versus reality, and whether data comes from shared buffers or disk. A large gap between estimated and actual rows indicates stale statistics that need `ANALYZE`. Without this data, performance tuning is guesswork.
- Tune autovacuum per table, not globally. High-write tables need more aggressive vacuuming than read-heavy tables. Set `autovacuum_vacuum_scale_factor` to 0.01 (1%) for large tables instead of the default 0.2 (20%). A 500M-row table with default settings won't autovacuum until 100M dead tuples accumulate, causing massive bloat. Table-level settings let you vacuum hot tables frequently without wasting resources on cold ones.
- Design indexes for your most common query patterns, not all patterns. Each index adds write overhead and consumes storage. Profile your actual query workload with `pg_stat_statements` to find the top 10 queries by total time, and index those patterns first. A covering index that handles your top 3 queries is better than 10 single-column indexes that each help one query.
- Always use connection pooling in production. PostgreSQL forks a process per connection, so 500 idle connections waste significant memory. Use PgBouncer in transaction mode between your application and PostgreSQL; this lets thousands of application connections share a pool of 50-100 actual database connections. Set `max_connections` based on the pooler's pool size plus administrative connections.
- Partition tables before they become too large to manage. Adding partitioning to an existing 500M-row table requires data migration and downtime. Plan partitioning when a table is expected to exceed 50-100M rows. Range partitioning by date is the most common approach and works well for time-series data. Ensure queries include the partition key in WHERE clauses so the planner can prune irrelevant partitions.
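The per-table autovacuum advice above can be sketched as follows, using the thresholds suggested in the list and the example `orders` table:

```sql
-- Per-table autovacuum tuning for a large, high-write table.
-- Table-level settings override the global defaults only for this table.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.01,   -- vacuum after ~1% dead tuples
    autovacuum_analyze_scale_factor = 0.005   -- keep planner statistics fresh
);
```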
Common Issues
Query suddenly slows down after working fine for months. Check for plan changes caused by stale statistics (`ANALYZE` the table), index bloat (rebuild with `REINDEX CONCURRENTLY`), or table bloat (check `pg_stat_user_tables` for the dead-tuple ratio). Also check whether data volume crossed a tipping point where the planner switches from an index scan to a sequential scan. Force a statistics update and compare execution plans before and after.
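A quick way to check the dead-tuple ratio mentioned above, using the standard `pg_stat_user_tables` view (a rough bloat indicator, not an exact measurement):

```sql
-- Tables with the most dead tuples, with last autovacuum time
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / greatest(n_live_tup, 1), 1) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```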
High CPU usage from autovacuum on busy tables. Autovacuum running on large tables consumes CPU and I/O, potentially affecting production queries. Tune `autovacuum_vacuum_cost_delay` to throttle vacuuming (increase from 2ms to 10-20ms for less aggressive behavior). Schedule manual VACUUM during off-peak hours for the largest tables. Set `autovacuum_max_workers` based on your CPU cores and I/O capacity.
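The throttling described above can be applied globally with ALTER SYSTEM. A sketch using the values suggested in the paragraph (tune to your own CPU and I/O capacity):

```sql
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '10ms';  -- gentler vacuuming
ALTER SYSTEM SET autovacuum_max_workers = 4;             -- requires a restart

SELECT pg_reload_conf();  -- picks up cost_delay immediately
```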
Replication lag increases under write-heavy workloads. The replica can't apply WAL records as fast as the primary generates them. Ensure the replica has equivalent hardware and I/O capacity. Increase `max_wal_senders` and `wal_keep_size` on the primary. On the replica, tune `max_parallel_workers` and ensure no long-running queries block WAL application. Monitor lag with `pg_stat_replication` and alert when it exceeds acceptable thresholds.
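Lag can be monitored from the primary with the `pg_stat_replication` view; a minimal sketch:

```sql
-- Per-replica replication lag: bytes of WAL not yet replayed,
-- plus the time-based replay_lag column (PostgreSQL 10+)
SELECT application_name,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;
```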