Semgrep Rule Creator Framework

Advanced Semgrep custom rule creation toolkit for writing, testing, and deploying application-specific static analysis rules that catch security vulnerabilities and enforce coding patterns.

When to Use This Skill

Choose Semgrep Rule Creator when:

Building custom security rules for your specific framework and patterns
Detecting project-specific anti-patterns that generic linters miss
Enforcing architectural boundaries (layer separation, API conventions)
Creating detection rules for vulnerability patterns found during audits
Automating code review checklist items as static analysis rules

Consider alternatives when:

You need standard language lint rules: use ESLint, Pylint, or Clippy
You need binary analysis: use reverse engineering tools
You need runtime detection: use RASP or WAF rules

Quick Start


# Install Semgrep
pip install semgrep

# Activate rule creator
claude skill activate elite-semgrep-rule-creator-framework

# Create a custom rule
claude "Write a Semgrep rule to detect SQL string concatenation in Python"

# Test rules against code
claude "Test my custom Semgrep rules in rules/ against the src/ directory"

Example Custom Rule


rules:
  - id: sql-string-concatenation
    patterns:
      - pattern: |
          $QUERY = $SQL + $USER_INPUT + "..."
      # Constrain the leading SQL text ($SQL), not the variable name on the left.
      - metavariable-regex:
          metavariable: $SQL
          regex: ".*(SELECT|INSERT|UPDATE|DELETE|FROM|WHERE).*"
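    message: >
      SQL query built with string concatenation using '$USER_INPUT'.
      Use a parameterized query to prevent SQL injection, for example
      cursor.execute("SELECT ... WHERE name = ?", ($USER_INPUT,)).
    severity: ERROR
    languages: [python]
    metadata:
      cwe: CWE-89
      owasp: A03:2021
      category: security
      confidence: HIGH
    # Deliberately no `fix:` key: a fixed replacement string would overwrite
    # the original query text, so the guidance lives in the message.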

  - id: no-direct-db-outside-repository
    patterns:
      - pattern: db.$METHOD(...)
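      # Assumes repository classes share a base class named Repository;
      # substitute the base class your codebase uses.
      - pattern-not-inside: |
          class $REPO(Repository):
              ...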
    paths:
      exclude:
        - "**/repositories/**"
        - "**/repos/**"
    message: Direct database access must go through repository classes
    severity: WARNING
    languages: [python]
    metadata:
      category: architecture
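
To make the first rule concrete, the sketch below puts the concatenated query shape it is written to catch next to the parameterized form the message recommends. It uses the standard-library `sqlite3` module with an in-memory database; the `users` table, the sample rows, and the function names `find_user_unsafe` and `find_user_safe` are illustrative assumptions, not part of the rule.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "viewer")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, user_input: str) -> list:
    # Shape the sql-string-concatenation rule targets: literal + variable + literal.
    query = "SELECT name FROM users WHERE name = '" + user_input + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, user_input: str) -> list:
    # Parameterized form: the driver binds user_input as data, not as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (user_input,)
    ).fetchall()


if __name__ == "__main__":
    db = make_db()
    payload = "x' OR '1'='1"
    print(find_user_unsafe(db, payload))  # the OR clause makes every row match
    print(find_user_safe(db, payload))    # no user has that literal name
```

The same payload changes the meaning of the concatenated query but is treated as an ordinary string by the parameterized one, which is the behavior the rule message points developers toward.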

Core Concepts

Rule Pattern Types

| Pattern | Description | Example |
| --- | --- | --- |
| `pattern` | Match a single code pattern | `pattern: requests.get($URL)` |
| `patterns` | AND: all child conditions must match | Combine multiple conditions |
| `pattern-either` | OR: any child pattern can match | Alternative vulnerable patterns |
| `pattern-not` | Exclude specific matches | Filter out safe usage |
| `pattern-inside` | Match must be inside a larger pattern | Within a specific function or class |
| `pattern-not-inside` | Match must NOT be inside a pattern | Not within a `try` block |
| `pattern-regex` | Regex over the source text | Complex string matching |
| `metavariable-regex` | Regex on a captured metavariable | Variable name constraints |
| `metavariable-pattern` | Pattern on a captured metavariable | Constrain what a metavariable contains |
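
As a sketch of how these operators nest, the snippet below builds one rule as a Python dictionary and prints it as JSON. The rule id `requests-call-without-timeout`, its message, and the choice of `requests` calls are illustrative assumptions. Printing JSON rather than YAML is also an assumption made to stay within the standard library, so check the output with `semgrep --validate --config <file>` before relying on it.

```python
import json

# Hypothetical rule: flag requests.get/post calls that pass no timeout.
rule = {
    "id": "requests-call-without-timeout",
    "languages": ["python"],
    "severity": "WARNING",
    "message": "requests call without a timeout can hang indefinitely.",
    "patterns": [  # AND: every entry in this list must hold
        {
            "pattern-either": [  # OR: either call shape is a candidate
                {"pattern": "requests.get(...)"},
                {"pattern": "requests.post(...)"},
            ]
        },
        # Exclude the safe usage where a timeout keyword is present.
        {"pattern-not": "requests.$METHOD(..., timeout=$T, ...)"},
    ],
}

rule_file = {"rules": [rule]}

if __name__ == "__main__":
    print(json.dumps(rule_file, indent=2))
```

The outer `patterns` list is the AND, the `pattern-either` inside it widens the set of candidates, and `pattern-not` then removes the safe ones.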

Rule Development Workflow

| Step | Action | Tool or command |
| --- | --- | --- |
| Write | Create rule YAML with pattern and metadata | Editor |
| Test | Run against known-vulnerable and known-safe code | `semgrep --test` |
| Tune | Adjust patterns to reduce false positives | Iterate pattern logic |
| Deploy | Add to CI pipeline config | `.github/workflows/semgrep.yml` |
| Monitor | Track rule performance (TP/FP rates) | Semgrep App dashboard |
| Iterate | Refine based on production findings | Version control rules |


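# Test file format for Semgrep rule testing
# test_sql_injection.py
# (semgrep --test pairs a test file with the rule file that shares its
# base name, so keep the two file names in sync)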

# ruleid: sql-string-concatenation
query = "SELECT * FROM users WHERE name = '" + user_input + "'"

# ok: sql-string-concatenation
cursor.execute("SELECT * FROM users WHERE name = %s", (user_input,))

# ruleid: sql-string-concatenation
sql = "DELETE FROM orders WHERE id = " + order_id + " AND status = 'pending'"

# ok: sql-string-concatenation
safe_query = "SELECT COUNT(*) FROM logs"
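
For the Monitor step, one way to track per-rule volume without a dashboard is to post-process Semgrep's JSON output. The sketch below assumes the report came from something like `semgrep --config rules/ --json src/` and that each entry in `results` carries a `check_id`. The helper name `count_findings_by_rule` and the sample report are made up for illustration. The counts show volume only; deciding which findings are true or false positives still takes human triage.

```python
import json
from collections import Counter


def count_findings_by_rule(report_text: str) -> Counter:
    """Count findings per rule id in a Semgrep JSON report."""
    report = json.loads(report_text)
    # Each finding is assumed to carry its rule id under "check_id"; for local
    # rule files the id may include a path-derived prefix such as "rules.".
    return Counter(finding["check_id"] for finding in report.get("results", []))


# Hand-written sample in the assumed report shape, not real scan output.
SAMPLE_REPORT = json.dumps(
    {
        "results": [
            {"check_id": "rules.sql-string-concatenation", "path": "src/orders.py"},
            {"check_id": "rules.sql-string-concatenation", "path": "src/users.py"},
            {"check_id": "rules.no-direct-db-outside-repository", "path": "src/api.py"},
        ],
        "errors": [],
    }
)

if __name__ == "__main__":
    for rule_id, total in count_findings_by_rule(SAMPLE_REPORT).most_common():
        print(f"{total:3d}  {rule_id}")
```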

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| `rules_dir` | Directory containing custom rules | `./rules/` |
| `target_languages` | Languages to create rules for | Auto-detect |
| `severity_levels` | Severities: ERROR, WARNING, INFO | `["ERROR", "WARNING"]` |
| `include_metadata` | Add CWE, OWASP references | `true` |
| `auto_fix` | Include autofix patterns | `false` |
| `test_mode` | Run in test validation mode | `false` |
| `registry_rules` | Include Semgrep registry rules | `["auto"]` |

Best Practices

Start with the vulnerability, not the pattern. First identify the specific vulnerability or anti-pattern in real code, then craft the Semgrep pattern to detect it. Working backwards from a known-bad code example produces more accurate rules than trying to imagine all possible vulnerable patterns.
Write both positive and negative test cases. Every rule needs at least two `ruleid:` (should match) and two `ok:` (should not match) test cases. Test edge cases including multiline code, different formatting, and the specific safe alternatives you recommend in the rule message.
Use metavariables for precise matching. Instead of broad regex patterns, use Semgrep's metavariables (`$X`, `$FUNC`, `$...ARGS`) to capture and constrain specific code elements. Because metavariables match against the parsed syntax tree rather than raw text, they produce fewer false positives than text matching.
Include actionable messages with code examples. The rule message should explain why the pattern is dangerous and show the correct alternative. Developers who understand the risk and see the fix are more likely to adopt the change than those who just see "rule violated."
Group rules by category and maintain a rule registry. Organize rules into files by category (security, architecture, performance). Maintain a README listing each rule's purpose, false positive rate, and last review date. Delete or disable rules that consistently produce false positives.
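
The two-and-two guideline for test cases can be checked mechanically. The sketch below counts `# ruleid:` and `# ok:` annotations per rule in a Python test file and lists the rules that fall short. The helper names `count_annotations` and `under_tested` are hypothetical, and the regex only covers the `#` comment style.

```python
import re
from collections import Counter

# Matches "# ruleid: some-rule-id" and "# ok: some-rule-id" comment annotations.
ANNOTATION = re.compile(r"#\s*(ruleid|ok):\s*([\w.-]+)")


def count_annotations(test_source: str) -> dict:
    """Return {rule_id: Counter({'ruleid': n, 'ok': m})} for one test file."""
    counts: dict = {}
    for line in test_source.splitlines():
        match = ANNOTATION.search(line)
        if match:
            kind, rule_id = match.groups()
            counts.setdefault(rule_id, Counter())[kind] += 1
    return counts


def under_tested(test_source: str, minimum: int = 2) -> list:
    """List rule ids with fewer than `minimum` ruleid or ok annotations."""
    return sorted(
        rule_id
        for rule_id, kinds in count_annotations(test_source).items()
        if kinds["ruleid"] < minimum or kinds["ok"] < minimum
    )


if __name__ == "__main__":
    sample = (
        "# ruleid: sql-string-concatenation\n"
        "query = 'SELECT ' + col\n"
        "# ok: sql-string-concatenation\n"
        "safe = 'SELECT 1'\n"
    )
    print(under_tested(sample))
```

Running a check like this in CI next to `semgrep --test` is one way to keep new rules from landing with a single happy-path test.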

Common Issues

Rule matches too broadly, flagging safe code as vulnerable. Add `pattern-not` and `pattern-not-inside` clauses to exclude known-safe contexts. Use `metavariable-regex` to constrain variable names to those that typically carry user input. Test against a large, diverse codebase to find false positive patterns before deploying.
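
Semgrep documents `metavariable-regex` as left-anchored, so a name constraint without a leading `.*` only accepts names that start with one of the alternatives. The sketch below uses Python's `re.match`, which is also left-anchored, to preview which variable names a candidate regex would accept. Semgrep's regex engine is not Python's `re`, so treat this as an approximation; the two regexes and the sample names are assumptions.

```python
import re

# Hypothetical name constraints; adjust the alternatives to your codebase.
ANCHORED_ONLY = r"(user|input|request|param)"
WITH_PREFIX = r".*(user|input|request|param)"


def name_matches(regex: str, variable_name: str) -> bool:
    """Left-anchored match, approximating how metavariable-regex applies a regex."""
    return re.match(regex, variable_name) is not None


if __name__ == "__main__":
    for name in ["user_input", "raw_user_id", "order_total", "MAX_RETRIES"]:
        print(name, name_matches(ANCHORED_ONLY, name), name_matches(WITH_PREFIX, name))
```

`raw_user_id` is the interesting case: it is rejected by the anchored regex and accepted once the `.*` prefix is added.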

The open-source engine doesn't support cross-file or cross-function taint analysis. Its taint mode (`mode: taint` with `pattern-sources` and `pattern-sinks`) tracks data only within a single function; use Semgrep Pro for data flow analysis that follows user input across function and file boundaries. Without Pro, write rules that flag potential sources and sinks separately, then manually verify connections during code review.
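
A taint-mode rule names sources, sinks, and optional sanitizers instead of one syntactic pattern. The sketch below writes such a rule as a Python dictionary and prints it as JSON. The rule id, the Flask source, and the `int(...)` sanitizer are illustrative assumptions, and the JSON output should be checked with `semgrep --validate --config <file>` before use.

```python
import json

# Hypothetical taint rule: request data reaching the query text of execute().
taint_rule = {
    "id": "request-arg-to-sql-execute",
    "mode": "taint",
    "languages": ["python"],
    "severity": "ERROR",
    "message": "Request data reaches the SQL text passed to execute().",
    # Where untrusted data enters (assumes a Flask application).
    "pattern-sources": [{"pattern": "flask.request.args.get(...)"}],
    # Where it must not arrive: only the first argument, so values passed
    # as bound parameters in later arguments are not reported.
    "pattern-sinks": [
        {
            "patterns": [
                {"pattern": "$CURSOR.execute($QUERY, ...)"},
                {"focus-metavariable": "$QUERY"},
            ]
        }
    ],
    # Calls assumed to make the data safe for this sink.
    "pattern-sanitizers": [{"pattern": "int(...)"}],
}

if __name__ == "__main__":
    print(json.dumps({"rules": [taint_rule]}, indent=2))
```

Focusing the sink on `$QUERY` is a design choice: it keeps parameterized calls, where tainted values travel in the parameter tuple, out of the findings.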

Rules work in testing but miss real vulnerable code in production. Real code often has intermediate variables, helper functions, and formatting differences that break exact pattern matches. Use the ellipsis operator `...` to match any intermediate code, `$...ARGS` for variadic arguments, and `pattern-either` to cover alternative code structures.
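
The four functions below build the same injectable query string, yet only the first is written as literal + variable + literal, the shape the example rule describes. Treat the list as a checklist of variants that may each need a `pattern-either` branch and a `ruleid:` test case. Which of them a given pattern actually matches should be confirmed with `semgrep --test` rather than assumed. The function names and the query are illustrative.

```python
def with_concatenation(name: str) -> str:
    # literal + variable + literal
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    return query


def with_f_string(name: str) -> str:
    # f-string interpolation
    query = f"SELECT * FROM users WHERE name = '{name}'"
    return query


def with_format(name: str) -> str:
    # str.format call
    query = "SELECT * FROM users WHERE name = '{}'".format(name)
    return query


def with_percent(name: str) -> str:
    # printf-style % formatting
    query = "SELECT * FROM users WHERE name = '%s'" % name
    return query


VARIANTS = [with_concatenation, with_f_string, with_format, with_percent]

if __name__ == "__main__":
    payload = "x' OR '1'='1"
    for build in VARIANTS:
        print(f"{build.__name__:20s} {build(payload)}")
```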
