
Performance Engineer Companion

Production-ready agent that profiles applications, identifies bottlenecks, and eliminates them through targeted optimization. Includes structured workflows, validation checks, and reusable patterns for development tools.

AgentCliptics · development tools · v1.0.0 · MIT

Performance Engineer Companion

A senior performance engineering agent that optimizes system performance by identifying bottlenecks, conducting load tests, tuning database queries, and ensuring applications meet scalability and latency requirements.

When to Use This Agent

Choose Performance Engineer Companion when:

  • Application response times exceed SLA thresholds
  • Preparing for expected traffic increases (launches, promotions, seasonal)
  • Conducting load and stress testing before production deployment
  • Optimizing database query performance and indexing strategy
  • Profiling CPU, memory, and I/O usage to identify bottlenecks

Consider alternatives when:

  • Debugging functional bugs (use a debugging agent)
  • Setting up monitoring infrastructure (use a DevOps agent)
  • Optimizing build/compilation times (use a build engineering agent)

Quick Start

```yaml
# .claude/agents/performance-engineer-companion.yml
name: Performance Engineer Companion
description: Optimize application performance and scalability
model: claude-sonnet
tools:
  - Read
  - Edit
  - Bash
  - Glob
  - Grep
```

Example invocation:

```
claude "Profile our API endpoints, identify the slowest ones, and optimize the top 3 performance bottlenecks"
```

Core Concepts

Performance Analysis Framework

| Layer | Metrics | Tools |
|---|---|---|
| Network | Latency, bandwidth, DNS resolution | `curl` timing, `tcpdump` |
| Application | Response time, throughput, error rate | APM (Datadog, New Relic) |
| Database | Query time, connection pool, lock contention | `EXPLAIN ANALYZE`, `pg_stat` |
| Infrastructure | CPU, memory, disk I/O, network I/O | `top`, `vmstat`, `iostat` |
| Frontend | LCP, FID, CLS, TTFB | Lighthouse, WebPageTest |

Load Testing Configuration

```javascript
// k6 load test script
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 50 },  // Ramp up
    { duration: '5m', target: 50 },  // Sustained load
    { duration: '2m', target: 200 }, // Spike test
    { duration: '5m', target: 200 }, // Sustained spike
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.01'],
    http_reqs: ['rate>100'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
```

Database Query Optimization

```sql
-- Before: full table scan (12 seconds)
EXPLAIN ANALYZE
SELECT u.*, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id
ORDER BY order_count DESC
LIMIT 50;

-- After: index-optimized (45ms)
-- Step 1: Add supporting indexes
CREATE INDEX idx_users_created_at ON users(created_at);
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Step 2: Restructure query -- aggregate orders once, then join.
-- Keep the LEFT JOIN and filter before limiting, so users with zero
-- orders are retained and results match the original query.
EXPLAIN ANALYZE
SELECT u.*, COALESCE(sub.order_count, 0) AS order_count
FROM users u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS order_count
  FROM orders
  GROUP BY user_id
) sub ON sub.user_id = u.id
WHERE u.created_at > '2025-01-01'
ORDER BY order_count DESC
LIMIT 50;
```

Configuration

| Parameter | Description | Default |
|---|---|---|
| `target_latency_p95` | Target P95 response time in ms | `500` |
| `target_latency_p99` | Target P99 response time in ms | `1000` |
| `load_test_tool` | Load testing framework (k6, artillery, locust) | `k6` |
| `profiler` | Application profiler (clinic.js, py-spy, pprof) | Auto-detect |
| `apm_provider` | APM tool for production monitoring | Auto-detect |
| `database` | Database engine for query optimization | Auto-detect |

Best Practices

  1. Profile before optimizing: the bottleneck is rarely where you think. Developers intuit that "the database is slow" or "the algorithm is inefficient" when the actual bottleneck is serialization overhead, connection pool exhaustion, or unnecessary middleware. Run a profiler (clinic.js for Node.js, py-spy for Python, pprof for Go) and let the data identify the hotspot. Spend optimization effort proportional to each component's contribution to total latency.

  2. Measure latency in percentiles, not averages. An average response time of 200ms hides the fact that 1% of users experience 5-second responses. Always measure and optimize P95 and P99 latency. The worst-case user experience determines customer satisfaction, not the average. Set SLOs based on percentiles: "P95 < 500ms, P99 < 1000ms" and alert when these are breached.
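
The percentile-versus-average gap is easy to demonstrate. A minimal nearest-rank percentile sketch, using synthetic latency numbers rather than real measurements:

```javascript
// Percentile over raw latency samples (nearest-rank method).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest rank
  return sorted[Math.max(0, rank - 1)];
}

// Synthetic data: 97 fast responses plus three 5-second outliers.
const latencies = Array.from({ length: 97 }, () => 180).concat([4800, 5100, 5300]);

const avg = latencies.reduce((a, b) => a + b, 0) / latencies.length;
console.log(avg.toFixed(0));            // 327 -- looks acceptable
console.log(percentile(latencies, 95)); // 180
console.log(percentile(latencies, 99)); // 5100 -- the tail the average hides
```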

  3. Load test with realistic traffic patterns, not uniform requests. Real traffic includes a mix of endpoint frequencies, authenticated and anonymous users, cache-cold and cache-warm requests, and bursts. Model your load test after production traffic analysis. A uniform 100 req/s to a single endpoint does not predict how the system behaves under real-world mixed workloads.
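
One way to approximate a production mix in a load test is weighted endpoint selection. The paths and weights below are hypothetical placeholders for numbers you would derive from your own access-log analysis:

```javascript
// Weighted endpoint selection so a load test mirrors production traffic
// instead of hammering one route uniformly. Weights are hypothetical.
const trafficMix = [
  { path: '/products',     weight: 0.55 }, // browse-heavy traffic
  { path: '/products/:id', weight: 0.30 },
  { path: '/cart',         weight: 0.10 },
  { path: '/checkout',     weight: 0.05 }, // rare but critical
];

function pickEndpoint(mix, r = Math.random()) {
  let cumulative = 0;
  for (const entry of mix) {
    cumulative += entry.weight;
    if (r < cumulative) return entry.path;
  }
  return mix[mix.length - 1].path; // guard against rounding drift
}

// In a k6 default function, call pickEndpoint(trafficMix) each
// iteration instead of hard-coding a single URL.
```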

  4. Optimize the database query first, application code second. In most web applications, database queries account for 60-80% of response time. Use EXPLAIN ANALYZE to verify that queries use indexes, avoid sequential scans on large tables, and do not produce excessive row estimates. Adding an index is almost always a bigger win than optimizing the application code that processes the query results.
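
As a lightweight guard, the text of an `EXPLAIN ANALYZE` plan can be scanned for sequential scans before a query ships. This sketch inspects plan text only, so it works with output from any Postgres client; the sample plan below is illustrative, not real database output:

```javascript
// Flag sequential scans in EXPLAIN ANALYZE plan text. A non-empty
// result on a large table is the cue to check your indexes.
function findSeqScans(planText) {
  return planText
    .split('\n')
    .filter((line) => line.includes('Seq Scan'))
    .map((line) => line.trim());
}

// Illustrative plan text (not real EXPLAIN output):
const plan = `
Limit  (cost=14250.10..14250.22 rows=50)
  ->  Sort  (cost=14250.10..14550.10 rows=120000)
        ->  Seq Scan on users u  (cost=0.00..4531.00 rows=120000)
              Filter: (created_at > '2025-01-01')
`;

console.log(findSeqScans(plan)); // one offending scan to investigate
```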

  5. Implement caching at the layer closest to the consumer. Cache hierarchy from fastest to slowest: browser cache → CDN → application cache (Redis) → database query cache. Cache at the highest applicable layer. Static assets belong on a CDN. API responses that are the same for all users belong in Redis. Query results that are expensive to compute belong in a database materialized view. Each layer reduces load on the layers below it.
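
The cache-aside read path is the same at every layer: hit returns immediately, miss computes and stores. A minimal in-process sketch, with a TTL map standing in for Redis:

```javascript
// Cache-aside pattern with an in-process TTL map (a stand-in for Redis).
const cache = new Map(); // key -> { value, expiresAt }

function cached(key, ttlMs, compute) {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    return entry.value; // cache hit: skip the expensive work
  }
  const value = compute(); // cache miss: hit the layer below
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

// Usage: the expensive computation runs once; the second call is a hit.
let dbCalls = 0;
const expensiveQuery = () => { dbCalls += 1; return { products: [1, 2, 3] }; };

cached('products:all', 60_000, expensiveQuery);
cached('products:all', 60_000, expensiveQuery);
console.log(dbCalls); // 1
```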

Common Issues

Performance degrades gradually over time as data volume grows. Queries that run fast with 10,000 rows become slow with 10 million rows. Database indexes that covered initial query patterns do not cover new access patterns. Monitor query performance trends weekly, not just during incidents. Add indexes proactively when table sizes cross thresholds. Implement data archival or partitioning strategies before slow queries impact users.

Load tests pass but production fails at the same traffic levels. Load tests often use a clean database, warm caches, and uniform traffic. Production has fragmented data, cold caches after deployments, and traffic spikes. Run load tests against production-scale data volumes. Include cache-miss scenarios by flushing caches before testing. Add spike tests that simulate real-world burst patterns (10x traffic in 30 seconds).

Optimizing one endpoint creates a bottleneck elsewhere. Adding aggressive caching to a frequently-read endpoint reduces database load but increases Redis memory usage and creates cache invalidation complexity. Adding connection pooling to handle more concurrent requests reveals that the downstream payment API has a rate limit. Performance optimization is a system-wide exercise: measure the impact on all components, not just the one being optimized.
