Expert NoSQL Specialist

Agent template for NoSQL database work. Includes structured workflows, validation checks, and reusable patterns for database design.

Agent | Cliptics | database | v1.0.0 | MIT
An agent that provides design patterns and mental models for distributed wide-column and key-value stores, specifically Apache Cassandra and Amazon DynamoDB, with a focus on query-first data modeling and partition design for high-scale applications.

When to Use This Agent

Choose NoSQL Specialist when:

  • Designing DynamoDB or Cassandra data models for high-throughput workloads
  • Implementing single-table design patterns for DynamoDB
  • Planning partition strategies to avoid hot partitions
  • Migrating from relational models to wide-column/key-value stores
  • Optimizing read/write patterns for distributed NoSQL systems

Consider alternatives when:

  • Working with document databases like MongoDB (use a MongoDB specialist)
  • Designing relational schemas (use a database architect agent)
  • Using Redis for caching without persistence concerns (use a caching agent)

Quick Start

# .claude/agents/expert-nosql-specialist.yml
name: NoSQL Specialist
model: claude-sonnet-4-20250514
tools:
  - Read
  - Write
  - Bash
  - Glob
  - Grep
prompt: |
  You are a NoSQL expert specializing in DynamoDB and Cassandra.
  Model data query-first: define access patterns before designing tables.
  Optimize partition keys for even distribution and use single-table
  design for DynamoDB. Always consider consistency trade-offs and
  capacity planning.

Example invocation:

claude --agent expert-nosql-specialist "Design a DynamoDB single-table schema for an e-commerce order system supporting: get order by ID, list orders by customer, list orders by status, and get order items. Expected: 10K orders/day, 5 items avg per order."

Core Concepts

Query-First Design Process

1. List all access patterns (read and write)
2. Identify primary key (partition + sort key)
3. Design secondary indexes for additional patterns
4. Plan item collections for related data
5. Validate: each access pattern maps to a single query
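
The five steps above can be sketched as a small checklist in code. This is only an illustration: the `AccessPattern` class, the pattern names, and the "free text search" entry are hypothetical and exist to show what a failed validation looks like.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessPattern:
    name: str
    index: str           # "table" or a GSI name; empty if nothing serves it yet
    partition_key: str   # key template, e.g. "USER#{userId}"
    sort_condition: str  # informal description of the sort-key condition

# Steps 1-4: write down each pattern and the key or index that serves it.
PATTERNS = [
    AccessPattern("Get user profile", "table", "USER#{userId}", "= PROFILE"),
    AccessPattern("List orders by user", "table", "USER#{userId}", "begins_with ORDER#"),
    AccessPattern("List order items", "table", "ORDER#{orderId}", "begins_with ITEM#"),
    AccessPattern("List orders by status", "GSI1", "STATUS#{status}", "between two dates"),
    AccessPattern("Search orders by free text", "", "", ""),  # deliberately unserved
]

def unserved(patterns):
    """Step 5: return the names of patterns that no key or index can answer."""
    return [p.name for p in patterns if not (p.index and p.partition_key)]

print(unserved(PATTERNS))
```

Any name the check returns signals that the key design needs another pass (or that the pattern belongs in a different system, such as a search index).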

DynamoDB Single-Table Design

PK                  SK                    Data
USER#123           PROFILE               {name, email, ...}
USER#123           ORDER#2024-001        {total, status, ...}
USER#123           ORDER#2024-002        {total, status, ...}
ORDER#2024-001     ITEM#PROD-A           {qty, price, ...}
ORDER#2024-001     ITEM#PROD-B           {qty, price, ...}

GSI1PK             GSI1SK                Purpose
STATUS#pending     2024-03-14T10:00      Orders by status+date
PROD#A             ORDER#2024-001        Orders containing product
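
To make the layout above concrete, here is a minimal in-memory sketch of how a partition key plus a sort-key prefix selects an item collection. It makes no AWS calls; the `query` helper is a hypothetical stand-in for a DynamoDB Query, and the attribute values are invented for illustration.

```python
# In-memory stand-in for the base table shown above.
ITEMS = [
    {"PK": "USER#123", "SK": "PROFILE", "email": "user@example.com"},
    {"PK": "USER#123", "SK": "ORDER#2024-001", "status": "pending"},
    {"PK": "USER#123", "SK": "ORDER#2024-002", "status": "shipped"},
    {"PK": "ORDER#2024-001", "SK": "ITEM#PROD-A", "qty": 2},
    {"PK": "ORDER#2024-001", "SK": "ITEM#PROD-B", "qty": 1},
]

def query(items, pk, sk_prefix=""):
    """Mimic a Query: exact partition-key match, optional sort-key prefix,
    results returned in sort-key order."""
    hits = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(hits, key=lambda i: i["SK"])

# "List orders by customer" and "get order items" each map to one query.
orders = query(ITEMS, "USER#123", "ORDER#")
order_items = query(ITEMS, "ORDER#2024-001", "ITEM#")

print([o["SK"] for o in orders])
print([i["SK"] for i in order_items])
```

Because the profile and the orders share the partition key `USER#123`, a single query without a prefix would return both, which is the point of storing related entities in one item collection.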

Partition Strategy

Pattern            Key Design                 Distribution
User-centric       USER#{userId}              Even when traffic is spread across many users
Time-series        SENSOR#{id}#2024-03        One partition per sensor per month
High-cardinality   ORDER#{orderId}            Naturally distributed
Status-based       STATUS#{status}#{date}     Date suffix spreads each status across partitions
Hierarchical       TENANT#{id}#DEPT#{dept}    Composite key isolates tenants and departments
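
The key designs in this table can be generated by small builder functions, which keeps the formats consistent across the codebase. The sketch below is one possible shape; the function names are hypothetical, and it assumes timestamps are timezone-aware and bucketed in UTC.

```python
from datetime import datetime, timezone

def sensor_pk(sensor_id: str, ts: datetime) -> str:
    """Time-series key: one partition per sensor per calendar month (UTC)."""
    return f"SENSOR#{sensor_id}#{ts.astimezone(timezone.utc):%Y-%m}"

def status_pk(status: str, ts: datetime) -> str:
    """Status key with a date suffix so one status does not pile onto one key."""
    return f"STATUS#{status}#{ts.astimezone(timezone.utc):%Y-%m-%d}"

def tenant_pk(tenant_id: str, dept: str) -> str:
    """Hierarchical key: tenant first, then department."""
    return f"TENANT#{tenant_id}#DEPT#{dept}"

ts = datetime(2024, 3, 14, 10, 0, tzinfo=timezone.utc)
print(sensor_pk("42", ts))
print(status_pk("pending", ts))
print(tenant_pk("acme", "sales"))
```

Choosing the bucket size (month, day, hour) is a trade-off: smaller buckets spread writes further but force range reads to touch more partition keys.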

Configuration

Parameter         Description                    Default
database          Target NoSQL database          DynamoDB
table_design      Single-table or multi-table    Single-table
capacity_mode     Provisioned or on-demand       On-demand
consistency       Read consistency model         Eventually consistent
gsi_count         Max global secondary indexes   5
item_size_limit   Max item size                  400 KB (DynamoDB)
ttl_enabled       Time-to-live for auto-expiry   true

Best Practices

  1. Define every access pattern before designing the first table. In NoSQL, the data model exists to serve queries—not to represent entities. Write down every query your application needs: "Get user by ID," "List orders by user sorted by date," "Find all pending orders." Each pattern must map to a single, efficient query. If you can't serve a pattern without a full table scan, redesign the keys.

  2. Use composite sort keys to support multiple query patterns. A sort key like STATUS#pending#2024-03-14 supports three queries: all items with a given status (begins_with STATUS#pending#), a status within a date range (BETWEEN two prefixed dates), and a status on an exact date (equality). Design sort keys as hierarchical paths in which each prefix level serves a useful query through begins_with.

  3. Prevent hot partitions by distributing writes evenly. If one partition key receives disproportionate traffic, that partition becomes a bottleneck regardless of total table capacity. Avoid using low-cardinality values (status, country) as partition keys. For inherently skewed data, add a shard suffix: STATUS#pending#0 through STATUS#pending#9 spreads writes across 10 partition key values. Reads must then query every shard and merge the results.

  4. Use TTL for automatic data lifecycle management. Set TTL on items that have a natural expiration: session data, temporary tokens, log entries, and notification records. TTL deletes consume no write capacity and keep the table size manageable. Deletion is not immediate, so filter out expired items on read if stale results matter. For audit purposes, enable DynamoDB Streams to capture TTL deletions and archive them to S3 before the stream records age out (they are retained for 24 hours).

  5. Denormalize aggressively: DynamoDB and Cassandra have no joins. Store related data together in the same item or item collection (same partition key). If displaying an order requires the customer name, store the customer name in the order item. Data duplication is the intended trade-off for read performance. Keep the duplicated copies in sync through DynamoDB Streams or application-level updates.
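
As an illustration of the TTL practice (item 4), the sketch below stamps an item with an expiry time. DynamoDB expects the TTL attribute to be a number holding Unix epoch time in seconds; the attribute name `expires_at` and both helper functions are assumptions made for this example, and the name must match whatever attribute TTL is enabled on for the table.

```python
import time
from typing import Optional

TTL_ATTRIBUTE = "expires_at"  # hypothetical name; must match the table's TTL setting

def with_ttl(item: dict, ttl_seconds: int, now: Optional[float] = None) -> dict:
    """Return a copy of the item with an epoch-seconds expiry attribute."""
    now = time.time() if now is None else now
    return {**item, TTL_ATTRIBUTE: int(now) + ttl_seconds}

def is_expired(item: dict, now: Optional[float] = None) -> bool:
    """Client-side check, useful because expired items can linger before deletion."""
    now = time.time() if now is None else now
    return item.get(TTL_ATTRIBUTE, float("inf")) <= now

session = with_ttl({"PK": "SESSION#abc", "SK": "META"}, 3600, now=1_700_000_000)
print(session[TTL_ATTRIBUTE])
```

Items without the attribute are treated as never expiring, which mirrors how DynamoDB ignores items that lack the TTL attribute.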

Common Issues

Queries require scanning the entire table. This means the access pattern wasn't anticipated in the table design. Every query must be satisfiable by a partition key + sort key lookup on the base table or on a secondary index. If a new access pattern emerges that doesn't fit existing keys or indexes, add a GSI with suitable keys and attribute projections rather than scanning. Design reviews should verify that all current and planned access patterns have efficient query paths.
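
For example, "list orders by status within a date range" can be served by a GSI query instead of a scan. The sketch below only builds the request parameters and does not call AWS. The index and attribute names (`GSI1`, `GSI1PK`, `GSI1SK`) follow the single-table layout earlier in this document, while the table name `app-table` and the function name are hypothetical.

```python
def orders_by_status_params(table_name: str, status: str, start_iso: str, end_iso: str) -> dict:
    """Build parameters for a low-level DynamoDB Query against GSI1."""
    return {
        "TableName": table_name,
        "IndexName": "GSI1",
        # Equality on the index partition key, range condition on its sort key.
        "KeyConditionExpression": "GSI1PK = :pk AND GSI1SK BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"STATUS#{status}"},
            ":start": {"S": start_iso},
            ":end": {"S": end_iso},
        },
    }

params = orders_by_status_params(
    "app-table", "pending", "2024-03-14T00:00", "2024-03-14T23:59"
)
print(params["KeyConditionExpression"])
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("dynamodb").query(**params)`. ISO 8601 timestamps work as sort keys here because they sort lexicographically in chronological order.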

Hot partition throttling despite high provisioned capacity. DynamoDB enforces throughput limits per partition (roughly 3,000 read capacity units and 1,000 write capacity units per second), not only per table. A partition key receiving 90% of requests can be throttled even when the table has spare capacity. Identify the hot key with CloudWatch Contributor Insights for DynamoDB, then redesign the partition key to distribute load. For time-series data, add date or shard suffixes to the partition key to spread writes across partitions.
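
One way to apply a shard suffix is to derive it from a stable hash of the item's identifier, so a given item always maps to the same sharded key. This is a sketch under assumptions: the shard count of 10, the function names, and the choice of SHA-256 are illustrative, not prescribed by DynamoDB.

```python
import hashlib

SHARD_COUNT = 10  # assumed value; size it to the expected write rate

def sharded_pk(base: str, item_id: str, shards: int = SHARD_COUNT) -> str:
    """Append a deterministic shard number derived from the item id."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    return f"{base}#{shard}"

def all_shard_pks(base: str, shards: int = SHARD_COUNT) -> list:
    """Keys a reader must query (and merge) to see every item for the base key."""
    return [f"{base}#{n}" for n in range(shards)]

pk = sharded_pk("STATUS#pending", "ORDER#2024-001")
print(pk in all_shard_pks("STATUS#pending"))
```

A hash is used instead of a random suffix so that later updates and deletes can recompute the key from the item id alone. The cost is on the read side: listing all pending orders now takes one query per shard.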

Item size exceeds the 400 KB DynamoDB limit. Store large attributes (images, documents, large JSON) in S3 and save only the S3 reference in DynamoDB. For items approaching the limit due to growing lists, split the item into multiple items using a sort key pattern: ORDER#001#DETAILS, ORDER#001#ITEMS#1, ORDER#001#ITEMS#2. This pattern lets the data grow beyond the single-item limit; note that tables with local secondary indexes cap each item collection at 10 GB.
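
The splitting pattern might look like the following sketch. It assumes the partition key is `ORDER#<id>` and chunks by a fixed number of entries; the function name and `chunk_size` are hypothetical, and a real implementation would more likely chunk by serialized byte size.

```python
def split_order(order_id: str, details: dict, line_items: list, chunk_size: int = 100) -> list:
    """Split one oversized order into a details item plus numbered chunk items."""
    pk = f"ORDER#{order_id}"
    items = [{"PK": pk, "SK": f"ORDER#{order_id}#DETAILS", **details}]
    starts = range(0, len(line_items), chunk_size)
    for n, start in enumerate(starts, start=1):
        items.append({
            "PK": pk,
            "SK": f"ORDER#{order_id}#ITEMS#{n}",
            "lines": line_items[start:start + chunk_size],
        })
    return items

parts = split_order("001", {"total": 250}, list(range(250)), chunk_size=100)
print([p["SK"] for p in parts])
```

All chunks share one partition key, so a single query with `begins_with` on `ORDER#001#` retrieves the whole order. If chunk order matters beyond nine chunks, zero-pad the chunk number, since `ITEMS#10` sorts before `ITEMS#2` as a string.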
