
Expert Postgres Pro

Boost productivity when you need to optimize PostgreSQL performance. Includes structured workflows, validation checks, and reusable patterns for database work.

Agent · Cliptics · database · v1.0.0 · MIT

An advanced PostgreSQL agent covering database administration, performance optimization, replication strategies, backup procedures, and advanced PostgreSQL features for achieving maximum reliability, performance, and scalability in production deployments.

When to Use This Agent

Choose Postgres Pro when:

  • Deep-diving into PostgreSQL internals for performance optimization
  • Configuring advanced features (logical replication, custom types, extensions)
  • Tuning postgresql.conf for specific workload profiles
  • Implementing advanced indexing (BRIN, GiST, GIN, expression indexes)
  • Troubleshooting complex PostgreSQL behavior (planner decisions, lock chains)

Consider alternatives when:

  • Learning PostgreSQL basics (use a general database agent)
  • Working with Neon serverless PostgreSQL specifically (use a Neon agent)
  • Designing schemas without performance focus (use a database architect agent)

Quick Start

```yaml
# .claude/agents/expert-postgres-pro.yml
name: Postgres Pro
model: claude-sonnet-4-20250514
tools:
  - Read
  - Write
  - Bash
  - Glob
  - Grep
prompt: |
  You are a senior PostgreSQL expert. Master database administration,
  performance tuning, replication, and advanced features. Use EXPLAIN
  ANALYZE for every optimization. Understand the planner's decision-making
  and know when to override it. Prioritize reliability and data integrity.
```

Example invocation:

claude --agent expert-postgres-pro "Our JSONB query on a 200M row events table takes 8 seconds despite having a GIN index. Analyze the execution plan and suggest optimizations including potentially switching to a GIN jsonb_path_ops index."

Core Concepts

Advanced Index Types

| Index      | Best For                            | Storage           | Speed                      |
|------------|-------------------------------------|-------------------|----------------------------|
| B-tree     | Equality, range, sorting            | Medium            | Fast for < 10M rows        |
| Hash       | Equality only                       | Small             | Fastest for exact match    |
| GIN        | Full-text, arrays, JSONB            | Large             | Fast search, slow update   |
| GiST       | Geometry, ranges, nearest-neighbor  | Medium            | Good for overlap queries   |
| BRIN       | Large sorted tables (time-series)   | Tiny              | Good for correlated data   |
| Expression | Computed values                     | Same as base type | Matches expression queries |
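For the two types without examples later in this guide, creation looks like this (table and column names are illustrative):

```sql
-- Hash index: equality-only lookups (WAL-logged and crash-safe since PostgreSQL 10)
CREATE INDEX idx_sessions_token ON sessions USING hash (token);

-- GiST index: range/overlap queries, e.g. on a tstzrange column
CREATE INDEX idx_bookings_period ON bookings USING gist (period);
```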

Configuration Tuning Guide

```ini
# Memory Settings (most impactful)
shared_buffers = '8GB'                # 25% of RAM
effective_cache_size = '24GB'         # 75% of RAM
work_mem = '256MB'                    # Per-operation sort/hash memory
maintenance_work_mem = '2GB'          # VACUUM, CREATE INDEX

# Write Performance
wal_buffers = '64MB'                  # WAL write buffer
checkpoint_completion_target = 0.9
max_wal_size = '4GB'

# Query Planner
random_page_cost = 1.1                # SSD storage (default 4.0 for HDD)
effective_io_concurrency = 200        # SSD concurrent reads
default_statistics_target = 500       # More accurate planner estimates

# Parallelism
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4
parallel_tuple_cost = 0.01
```

Lock Chain Analysis

```sql
-- Find blocking lock chains
SELECT
    blocked.pid AS blocked_pid,
    blocked_activity.query AS blocked_query,
    blocking.pid AS blocking_pid,
    blocking_activity.query AS blocking_query,
    blocked.locktype,
    blocked.relation::regclass
FROM pg_catalog.pg_locks blocked
JOIN pg_catalog.pg_stat_activity blocked_activity
    ON blocked.pid = blocked_activity.pid
JOIN pg_catalog.pg_locks blocking
    ON blocked.locktype = blocking.locktype
    AND blocked.relation = blocking.relation
    AND blocked.pid != blocking.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
    ON blocking.pid = blocking_activity.pid
WHERE NOT blocked.granted;
```
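Once a blocking pid is identified, it can be cancelled or, as a last resort, terminated (the pid below is a placeholder; use these with care in production):

```sql
-- Cancel the blocking query but keep its session alive
SELECT pg_cancel_backend(12345);

-- Last resort: terminate the blocking backend entirely, rolling back its transaction
SELECT pg_terminate_backend(12345);
```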

Configuration

| Parameter        | Description                     | Default            |
|------------------|---------------------------------|--------------------|
| pg_version       | Target PostgreSQL version       | 16                 |
| storage_type     | Storage medium (SSD/HDD)        | SSD                |
| total_ram        | Server total RAM                | 32 GB              |
| max_connections  | Connection limit                | 200                |
| workload_type    | Workload profile                | Mixed OLTP/OLAP    |
| replication_mode | Replication configuration      | Streaming async    |
| extensions       | Required PostgreSQL extensions  | pg_stat_statements |

Best Practices

  1. Always enable pg_stat_statements in production. This extension tracks execution statistics for all SQL statements with minimal overhead. It tells you which queries consume the most total time, which are called most frequently, and which have the worst per-execution performance. Without it, you're optimizing blind. Query pg_stat_statements by total_exec_time to find your highest-impact optimization targets.
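A minimal sketch of that workflow (the extension must also be listed in shared_preload_libraries before it can collect data):

```sql
-- Enable once per database
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 queries by cumulative execution time (column names as of PostgreSQL 13+)
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```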

  2. Use BRIN indexes for large time-series tables. B-tree indexes on a 500M-row table consume significant storage and maintenance overhead. If the table is physically ordered by timestamp (which it usually is for append-only data), a BRIN index provides effective filtering at 1/100th the storage of a B-tree. BRIN works by storing min/max values per block range, making it ideal for correlated sequential data.
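A sketch for an assumed append-only events table; pages_per_range trades index size against filtering precision (128 is the default):

```sql
-- BRIN index on the physically time-ordered column
CREATE INDEX idx_events_created_brin
    ON events USING brin (created_at) WITH (pages_per_range = 128);
```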

  3. Set random_page_cost to 1.1 for SSD storage. The default value of 4.0 assumes spinning disks where random reads are 4x slower than sequential reads. On SSDs, random and sequential reads perform similarly. Keeping the default makes the planner avoid index scans in favor of sequential scans even when indexes would be faster, leading to suboptimal query plans.
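The setting can be applied without editing postgresql.conf directly:

```sql
-- Persist the change via ALTER SYSTEM, then reload configuration
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();
```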

  4. Use expression indexes for queries on computed values. If you frequently query WHERE lower(email) = lower(input), create an index on lower(email). The index stores precomputed values and matches the query expression exactly. Without the expression index, PostgreSQL must compute lower() for every row during the scan. This pattern works for any immutable function.
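A sketch with an assumed users table:

```sql
-- Store precomputed lower(email) so the planner can match the expression exactly
CREATE INDEX idx_users_email_lower ON users (lower(email));

-- This query can now use the index instead of computing lower() per row
SELECT * FROM users WHERE lower(email) = lower('Alice@Example.com');
```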

  5. Monitor and manage table bloat proactively. PostgreSQL's MVCC creates dead tuples that autovacuum eventually cleans up. But under high-write workloads, dead tuples accumulate faster than autovacuum processes them, inflating table size and degrading scan performance. Monitor pg_stat_user_tables.n_dead_tup relative to n_live_tup. If dead tuples exceed 10% of live tuples, tune autovacuum aggressiveness for that table.
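A monitoring query along these lines surfaces the worst offenders (the ~10% threshold is the guideline above, not a hard limit):

```sql
-- Dead-tuple ratio per table; investigate anything well above ~10%
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```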

Common Issues

Planner chooses sequential scan over available index scan. The planner estimates that scanning the entire table is cheaper than using the index, which happens when: statistics are stale (run ANALYZE), the query returns a large percentage of rows (index is genuinely slower for > 10% of table), or cost parameters are wrong for your storage (set random_page_cost = 1.1 for SSDs). Don't use SET enable_seqscan = off in production—fix the root cause instead.
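The diagnostic sequence, sketched against an assumed orders table:

```sql
-- Refresh statistics first, then inspect the plan the planner actually executed
ANALYZE orders;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
```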

VACUUM can't keep up with high-write workloads. Aggressive autovacuum settings help: set autovacuum_vacuum_scale_factor = 0.01 and autovacuum_vacuum_cost_delay = 2ms on high-write tables. If autovacuum still can't keep up, check for long-running transactions that prevent dead tuple cleanup (the visibility horizon problem). A single idle-in-transaction session can prevent vacuuming the entire database.
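Both remedies can be sketched as follows (events is an assumed hot table; the per-table cost delay is in milliseconds):

```sql
-- Per-table autovacuum tuning overrides the global defaults
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_delay = 2
);

-- Find idle-in-transaction sessions holding back the visibility horizon
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
```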

JSONB queries are slow despite GIN indexes. The default GIN operator class (jsonb_ops) indexes all keys and values, supporting all JSONB operators but consuming significant space. If your queries only use containment operators (@>, ?, ?|, ?&), switch to jsonb_path_ops which is smaller and faster for containment queries. For queries filtering on specific JSON paths, consider expression indexes on the extracted values instead.
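Both alternatives, sketched against an assumed events table with a payload JSONB column:

```sql
-- Smaller, faster GIN index when only containment-style operators are needed
CREATE INDEX idx_events_payload_path
    ON events USING gin (payload jsonb_path_ops);

-- For one hot path, a plain expression index on the extracted text value
CREATE INDEX idx_events_payload_type ON events ((payload ->> 'type'));
```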
