Elite Semgrep Rule Creator Framework
Streamline your workflow with this skill for creating and refining security vulnerability detection rules. Built for Claude Code with best practices and real-world patterns.
Semgrep Rule Creator Framework
Advanced Semgrep custom rule creation toolkit for writing, testing, and deploying application-specific static analysis rules that catch security vulnerabilities and enforce coding patterns.
When to Use This Skill
Choose Semgrep Rule Creator when:
- Building custom security rules for your specific framework and patterns
- Detecting project-specific anti-patterns that generic linters miss
- Enforcing architectural boundaries (layer separation, API conventions)
- Creating detection rules for vulnerability patterns found during audits
- Automating code review checklist items as static analysis rules
Consider alternatives when:
- Need built-in language rules — use ESLint, pylint, or clippy
- Need binary analysis — use reverse engineering tools
- Need runtime detection — use RASP or WAF rules
Quick Start
    # Install Semgrep
    pip install semgrep

    # Activate rule creator
    claude skill activate elite-semgrep-rule-creator-framework

    # Create a custom rule
    claude "Write a Semgrep rule to detect SQL string concatenation in Python"

    # Test rules against code
    claude "Test my custom Semgrep rules in rules/ against the src/ directory"
Example Custom Rule
    rules:
      - id: sql-string-concatenation
        patterns:
          # $SQL binds to the leading string literal so the regex below can
          # inspect the SQL text ($QUERY would only hold the variable name).
          - pattern: |
              $QUERY = $SQL + $USER_INPUT + "..."
          - metavariable-regex:
              metavariable: $SQL
              regex: ".*(SELECT|INSERT|UPDATE|DELETE|FROM|WHERE).*"
        message: >
          SQL query built with string concatenation using '$USER_INPUT'.
          Use parameterized queries to prevent SQL injection.
        severity: ERROR
        languages: [python]
        metadata:
          cwe: CWE-89
          owasp: A03:2021
          category: security
          confidence: HIGH
        # Illustrative replacement only; review before running with --autofix.
        fix: |
          cursor.execute("... ? ...", ($USER_INPUT,))

      - id: no-direct-db-outside-repository
        patterns:
          - pattern: db.$METHOD(...)
          # Assumes repository classes inherit from a base class named Repository.
          - pattern-not-inside: |
              class $REPO(Repository):
                  ...
        paths:
          exclude:
            - "**/repositories/**"
            - "**/repos/**"
        message: Direct database access must go through repository classes
        severity: WARNING
        languages: [python]
        metadata:
          category: architecture
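To make the risk behind the first rule concrete, here is a minimal sketch using the standard-library `sqlite3` module. The table layout, the function names, and the payload are hypothetical and chosen only for illustration; the rule above says nothing about which database driver a project uses.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two users (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable shape: user input is concatenated into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized shape: the driver binds the value, so quotes in
    # `name` are treated as data rather than SQL syntax.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print("unsafe rows:", len(find_user_unsafe(conn, payload)))
    print("safe rows:", len(find_user_safe(conn, payload)))
```

With the payload shown, the concatenated query returns every row while the parameterized one returns none, which is the behavior the rule's message warns about.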
Core Concepts
Rule Pattern Types
| Pattern | Description | Example |
|---|---|---|
| `pattern` | Match exact code pattern | `pattern: requests.get($URL)` |
| `patterns` | AND: all must match | Combine multiple conditions |
| `pattern-either` | OR: any can match | Alternative vulnerable patterns |
| `pattern-not` | Exclude specific matches | Filter out safe usage |
| `pattern-inside` | Must be inside a larger pattern | Within specific function/class |
| `pattern-not-inside` | Must NOT be inside a pattern | Not within try/catch |
| `pattern-regex` | Regex on matched code | Complex string matching |
| `metavariable-regex` | Regex on captured variables | Variable name constraints |
| `metavariable-pattern` | Pattern on captured variables | Type-based filtering |
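The operators in the table nest inside one another. The sketch below builds one hypothetical rule as a plain Python dict to show how AND (`patterns`), OR (`pattern-either`), containment (`pattern-inside`), and exclusion (`pattern-not`) combine. The rule id, message, and the choice of `requests` calls are assumptions made for illustration, and the dict is printed as JSON only because that needs nothing beyond the standard library; a real rule file would carry the same structure in YAML.

```python
import json


def build_rule():
    """Return a rule file structure as a dict (hypothetical rule, for illustration)."""
    return {
        "rules": [
            {
                "id": "requests-call-without-timeout",
                "languages": ["python"],
                "severity": "WARNING",
                "message": "HTTP request made without a timeout",
                # `patterns` is an AND: every entry below must hold.
                "patterns": [
                    # OR: either call shape is a candidate match.
                    {
                        "pattern-either": [
                            {"pattern": "requests.get(...)"},
                            {"pattern": "requests.post(...)"},
                        ]
                    },
                    # The candidate must sit inside some function body.
                    {"pattern-inside": "def $FUNC(...):\n    ..."},
                    # Exclusion: calls that already pass timeout= are fine.
                    {"pattern-not": "requests.$METHOD(..., timeout=$T, ...)"},
                ],
            }
        ]
    }


def render(rule_file):
    """Serialize the rule structure for inspection."""
    return json.dumps(rule_file, indent=2)


if __name__ == "__main__":
    print(render(build_rule()))
```

Reading the `patterns` list top to bottom mirrors how the rule is tuned in practice: start with the broad candidates, then narrow them with containment and exclusion clauses.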
Rule Development Workflow
| Step | Action | Command |
|---|---|---|
| Write | Create rule YAML with pattern and metadata | Editor |
| Test | Run against known-vulnerable and known-safe code | semgrep --test |
| Tune | Adjust patterns to reduce false positives | Iterate pattern logic |
| Deploy | Add to CI pipeline config | .github/workflows/semgrep.yml |
| Monitor | Track rule performance (TP/FP rates) | Semgrep App dashboard |
| Iterate | Refine based on production findings | Version control rules |
    # Test file format for Semgrep rule testing
    # test_sql_injection.py

    # ruleid: sql-string-concatenation
    query = "SELECT * FROM users WHERE name = '" + user_input + "'"

    # ok: sql-string-concatenation
    cursor.execute("SELECT * FROM users WHERE name = %s", (user_input,))

    # ruleid: sql-string-concatenation
    sql = "DELETE FROM orders WHERE id = " + order_id + " AND status = 'pending'"

    # ok: sql-string-concatenation
    safe_query = "SELECT COUNT(*) FROM logs"
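For the Deploy step in the workflow table, one possible CI gate is a small script that reads a Semgrep JSON report and fails the build only on blocking severities. This is a sketch: it assumes the report has a top-level `results` list whose entries carry `check_id`, `path`, `start.line`, and `extra.severity`, and the choice of `ERROR` as the only blocking level is an assumption, not something the skill prescribes.

```python
# Assumption: only ERROR findings should block a merge.
BLOCKING_SEVERITIES = frozenset({"ERROR"})


def blocking_findings(report, blocking=BLOCKING_SEVERITIES):
    """Return (check_id, path, line) for each finding with a blocking severity."""
    findings = []
    for result in report.get("results", []):
        severity = result.get("extra", {}).get("severity", "")
        if severity in blocking:
            findings.append(
                (
                    result.get("check_id", "unknown"),
                    result.get("path", "unknown"),
                    result.get("start", {}).get("line", 0),
                )
            )
    return findings


def exit_code(report, blocking=BLOCKING_SEVERITIES):
    """Return 1 if the report contains any blocking finding, else 0."""
    return 1 if blocking_findings(report, blocking) else 0
```

A report could be produced with something like `semgrep --config rules/ --json --output report.json src/`, loaded with `json.load`, and passed to `exit_code`, whose return value the CI job then uses as its exit status.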
Configuration
| Parameter | Description | Default |
|---|---|---|
| `rules_dir` | Directory containing custom rules | `./rules/` |
| `target_languages` | Languages to create rules for | Auto-detect |
| `severity_levels` | Severities: ERROR, WARNING, INFO | `["ERROR", "WARNING"]` |
| `include_metadata` | Add CWE, OWASP references | `true` |
| `auto_fix` | Include autofix patterns | `false` |
| `test_mode` | Run in test validation mode | `false` |
| `registry_rules` | Include Semgrep registry rules | `["auto"]` |
Best Practices
- Start with the vulnerability, not the pattern: First identify the specific vulnerability or anti-pattern in real code, then craft the Semgrep pattern to detect it. Working backwards from a known-bad code example produces more accurate rules than trying to imagine all possible vulnerable patterns.
- Write both positive and negative test cases: Every rule needs at least 2 `ruleid:` (should match) and 2 `ok:` (should not match) test cases. Test edge cases including multiline code, different formatting, and the specific safe alternatives you recommend in the fix message.
- Use metavariables for precise matching: Instead of broad regex patterns, use Semgrep's metavariable system (`$X`, `$FUNC`, `$...ARGS`) to capture and constrain specific code elements. This leverages Semgrep's AST awareness and produces fewer false positives than text matching.
- Include actionable fix messages with code examples: The rule message should explain why the pattern is dangerous AND provide the correct alternative. Developers who understand the risk and see the fix are more likely to adopt the change than those who just see "rule violated."
- Group rules by category and maintain a rule registry: Organize rules into files by category (security, architecture, performance). Maintain a README listing each rule's purpose, false positive rate, and last review date. Delete or disable rules that consistently produce false positives.
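The "at least 2 of each" guideline for test cases can be checked mechanically. The sketch below counts `# ruleid:` and `# ok:` annotations per rule in a Python test file and reports rules that fall short. It is a simplification: it reads only the first rule id on each annotation line and assumes `#` comments, so test files for other languages would need a different comment prefix.

```python
import re
from collections import Counter

# Matches "# ruleid: some-rule" or "# ok: some-rule"; variants such as
# "# todoruleid:" are deliberately not counted.
ANNOTATION = re.compile(r"#\s*(ruleid|ok):\s*([\w.-]+)")


def annotation_counts(source):
    """Count annotations in a test file, keyed by (rule_id, kind)."""
    counts = Counter()
    for line in source.splitlines():
        match = ANNOTATION.search(line)
        if match:
            kind, rule_id = match.group(1), match.group(2)
            counts[(rule_id, kind)] += 1
    return counts


def under_tested(source, minimum=2):
    """Return rule ids with fewer than `minimum` ruleid or ok annotations."""
    counts = annotation_counts(source)
    rule_ids = {rule_id for rule_id, _kind in counts}
    return sorted(
        rule_id
        for rule_id in rule_ids
        if counts[(rule_id, "ruleid")] < minimum or counts[(rule_id, "ok")] < minimum
    )
```

Running `under_tested` over each file in the rules directory before committing gives a quick signal about which rules still need positive or negative cases.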
Common Issues
Rule matches too broadly, flagging safe code as vulnerable. Add pattern-not and pattern-not-inside clauses to exclude known-safe contexts. Use metavariable-regex to constrain variable names to those that typically carry user input. Test against a large, diverse codebase to find false positive patterns before deploying.
Taint analysis in custom rules does not follow data across functions or files. The open-source engine supports taint mode (`mode: taint`), but only within a single function. Use Semgrep Pro's taint mode for data flow analysis that tracks user input across function and file boundaries. With the open-source version, write taint rules scoped to a single function, or flag potential sources and sinks separately, then manually verify the connections during code review.
Rules work in testing but miss real vulnerable code in production. Real code often has intermediate variables, helper functions, and formatting differences that break exact pattern matches. Use ... (ellipsis) to match any intermediate code, $...ARGS for variadic arguments, and pattern-either to cover alternative code structures.
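As an illustration of why exact patterns miss real code, the functions below all build the same unsafe query string in different ways. The function names and the query are hypothetical. A rule written only for the first shape may or may not catch the others, depending on how much the engine resolves on its own, so each shape is worth adding as a test case and, where it is missed, as an alternative under `pattern-either`.

```python
def concat_direct(name):
    # The shape most rules are first written for.
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    return query


def concat_via_variable(name):
    # An intermediate variable hides the string literal from a literal pattern.
    prefix = "SELECT * FROM users WHERE name = '"
    query = prefix + name + "'"
    return query


def concat_fstring(name):
    # f-string interpolation instead of the + operator.
    return f"SELECT * FROM users WHERE name = '{name}'"


def concat_format(name):
    # str.format interpolation.
    return "SELECT * FROM users WHERE name = '{}'".format(name)


def concat_percent(name):
    # printf-style interpolation done by Python, not by the database driver.
    return "SELECT * FROM users WHERE name = '%s'" % name


ALL_VARIANTS = [
    concat_direct,
    concat_via_variable,
    concat_fstring,
    concat_format,
    concat_percent,
]
```

Every function in `ALL_VARIANTS` returns an identical string for the same input, so they are equally injectable even though their syntax trees differ.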