
PostgreSQL DBA Strategist

An expert agent for PostgreSQL database administration covering query optimization, index design, replication, partitioning, vacuum tuning, and production troubleshooting for high-performance PostgreSQL deployments.

When to Use This Agent

Choose PostgreSQL DBA Strategist when:

  • Optimizing PostgreSQL query performance with EXPLAIN ANALYZE
  • Designing partitioning strategies for large tables
  • Configuring streaming replication and failover
  • Tuning autovacuum for high-write workloads
  • Troubleshooting connection issues, bloat, and lock contention

Consider alternatives when:

  • Working with SQL Server or MySQL (use database-specific agents)
  • Using Neon serverless PostgreSQL specifically (use a Neon agent)
  • Designing application data models without DBA concerns (use a backend dev agent)

Quick Start

```yaml
# .claude/agents/postgresql-dba-strategist.yml
name: PostgreSQL DBA
model: claude-sonnet-4-20250514
tools:
  - Read
  - Write
  - Bash
  - Glob
  - Grep
prompt: |
  You are a PostgreSQL DBA expert. Optimize queries, design indexes,
  configure replication, tune autovacuum, and troubleshoot production
  issues. Always use EXPLAIN ANALYZE for diagnosis and verify changes
  don't degrade other queries.
```

Example invocation:

claude --agent postgresql-dba-strategist "Our orders table has 500M rows and sequential scans are killing performance. Design a partitioning strategy by order_date and migrate the existing data with minimal downtime."

Core Concepts

Query Optimization Workflow

```sql
-- Step 1: Capture the execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT *
FROM orders
WHERE created_at > '2024-01-01'
  AND status = 'pending';

-- Key metrics to examine:
--   Planning Time vs Execution Time
--   Seq Scan vs Index Scan (look for unexpected Seq Scans)
--   Rows estimated vs actual (large gaps = stale statistics)
--   Buffers: shared hit vs read (cache effectiveness)
```

Index Design Guidelines

| Query Pattern | Index Type | Example |
| --- | --- | --- |
| Equality filter | B-tree | `CREATE INDEX ON orders (status)` |
| Range filter | B-tree (column order matters) | `CREATE INDEX ON orders (created_at)` |
| Pattern matching | GIN trigram | `CREATE INDEX ON users USING GIN (name gin_trgm_ops)` |
| JSON queries | GIN | `CREATE INDEX ON events USING GIN (data jsonb_path_ops)` |
| Full-text search | GIN tsvector | `CREATE INDEX ON docs USING GIN (to_tsvector('english', body))` |
| Geospatial | GiST | `CREATE INDEX ON places USING GIST (location)` |
| Partial data | Partial B-tree | `CREATE INDEX ON orders (id) WHERE status = 'pending'` |
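As a concrete sketch of the partial-index pattern (table and index names are illustrative), create the index without blocking writes and then confirm the planner actually uses it:

```sql
-- Hypothetical example: a partial index for a hot "pending" queue.
-- CONCURRENTLY avoids locking the table against writes during the build.
CREATE INDEX CONCURRENTLY idx_orders_pending_created
    ON orders (created_at)
    WHERE status = 'pending';

-- Verify the planner picks it for the target query
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, created_at
FROM orders
WHERE status = 'pending'
  AND created_at > now() - interval '1 day';

-- Confirm the index accumulates scans before retiring older indexes
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'orders';
```

Note that the partial index only qualifies when the query's WHERE clause implies the index predicate, so keep the predicate aligned with the real query pattern.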

Partitioning Strategy

```sql
-- Range partitioning by date
CREATE TABLE orders (
    id bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    customer_id bigint,
    total numeric(10,2)
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Automate future partition creation
```
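The "automate future partition creation" step can be sketched as a scheduled `DO` block (names and scheduling are assumptions; in production the pg_partman extension is the more common choice):

```sql
-- Illustrative sketch: create next quarter's partition if it is missing.
-- Run periodically from cron or pg_cron; pg_partman automates all of this.
DO $$
DECLARE
    start_date date := date_trunc('quarter', now() + interval '3 months');
    end_date   date := start_date + interval '3 months';
    part_name  text := format('orders_%s_q%s',
                              extract(year from start_date),
                              extract(quarter from start_date));
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF orders
         FOR VALUES FROM (%L) TO (%L)',
        part_name, start_date, end_date);
END $$;
```

Creating partitions ahead of time matters because an insert with no matching partition fails outright.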

Configuration

| Parameter | Description | Recommended starting point |
| --- | --- | --- |
| shared_buffers | Shared memory for caching | 25% of RAM |
| effective_cache_size | OS cache estimate for planner | 75% of RAM |
| work_mem | Memory per sort/hash operation | 64MB |
| maintenance_work_mem | Memory for VACUUM, CREATE INDEX | 512MB |
| wal_level | Write-ahead log detail | replica |
| max_connections | Maximum client connections | 200 |
| autovacuum_naptime | Autovacuum check interval | 30s |
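One way to apply these settings without editing postgresql.conf by hand is `ALTER SYSTEM`, which writes to postgresql.auto.conf. The values below mirror the table and assume a 32GB host; treat them as starting points, not prescriptions:

```sql
ALTER SYSTEM SET shared_buffers = '8GB';          -- ~25% of a 32GB host
ALTER SYSTEM SET effective_cache_size = '24GB';   -- ~75% of RAM
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';

SELECT pg_reload_conf();  -- applies reloadable settings;
                          -- shared_buffers still requires a restart
```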

Best Practices

  1. Always use EXPLAIN (ANALYZE, BUFFERS) for diagnosis, never guess. The execution plan reveals exactly what PostgreSQL does: which indexes it uses, how many rows it expects versus reality, and whether data comes from shared buffers or disk. A large gap between estimated and actual rows indicates stale statistics that need ANALYZE. Without this data, performance tuning is guesswork.

  2. Tune autovacuum per table, not globally. High-write tables need more aggressive vacuuming than read-heavy tables. Set autovacuum_vacuum_scale_factor to 0.01 (1%) for large tables instead of the default 0.2 (20%). A 500M-row table with default settings won't autovacuum until 100M dead tuples accumulate, causing massive bloat. Table-level settings let you vacuum frequently on hot tables without wasting resources on cold ones.

  3. Design indexes for your most common query patterns, not all patterns. Each index adds write overhead and consumes storage. Profile your actual query workload using pg_stat_statements to find the top 10 queries by total time. Index those patterns first. A covering index that handles your top 3 queries is better than 10 single-column indexes that each help one query.

  4. Use connection pooling in production, always. PostgreSQL forks a process per connection, so 500 idle connections waste significant memory. Use PgBouncer in transaction mode between your application and PostgreSQL. This lets thousands of application connections share a pool of 50-100 actual database connections. Set max_connections based on the pooler's pool size plus administrative connections.

  5. Partition tables before they become too large to manage. Adding partitioning to an existing 500M-row table requires data migration and downtime. Plan partitioning when tables are expected to exceed 50-100M rows. Range partitioning by date is the most common and works well for time-series data. Ensure queries include the partition key in WHERE clauses so the planner can eliminate irrelevant partitions.
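The per-table autovacuum tuning from practice 2 translates to storage parameters on the table itself; a minimal sketch for a hypothetical high-write `orders` table (thresholds are illustrative):

```sql
-- Vacuum when ~1% of rows are dead instead of the default 20%,
-- and refresh statistics more aggressively as well.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005
);

-- Inspect per-table overrides currently in effect
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';
```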

Common Issues

Query suddenly slows down after working fine for months. Check for plan changes caused by stale statistics (ANALYZE the table), index bloat (rebuild with REINDEX CONCURRENTLY), or table bloat (check pg_stat_user_tables for dead tuple ratio). Also check if data volume crossed a tipping point where the planner switches from index scan to sequential scan. Force statistics update and compare execution plans before and after.
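The bloat and staleness checks above can be run as a single diagnostic query against pg_stat_user_tables (the LIMIT and ordering are just one reasonable choice):

```sql
-- Dead-tuple ratio and last vacuum/analyze times per table
SELECT relname,
       n_dead_tup,
       n_live_tup,
       round(n_dead_tup::numeric / nullif(n_live_tup, 0), 3) AS dead_ratio,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Refresh statistics on a suspect table, then re-run EXPLAIN ANALYZE
ANALYZE orders;
```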

High CPU usage from autovacuum on busy tables. Autovacuum running on large tables consumes CPU and I/O, potentially affecting production queries. Tune autovacuum_vacuum_cost_delay to throttle vacuuming (increase from 2ms to 10-20ms for less aggressive behavior). Schedule manual VACUUM during off-peak hours for the largest tables. Set autovacuum_max_workers based on your CPU cores and I/O capacity.

Replication lag increases under write-heavy workloads. The replica can't apply WAL records as fast as the primary generates them. Ensure the replica has equivalent hardware and I/O capacity. Increase max_wal_senders and wal_keep_size on the primary. On the replica, tune max_parallel_workers and ensure no long-running queries block WAL application. Monitor lag with pg_stat_replication and alert when it exceeds acceptable thresholds.
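Lag monitoring via pg_stat_replication can be sketched as follows (column names per PostgreSQL 10+; run this on the primary):

```sql
-- Lag per replica in bytes and wall-clock time
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;
```

Alert on both measures: byte lag shows backlog size, while `replay_lag` shows how stale reads on the replica actually are.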
