Gepetto Engine

Boost productivity with this detailed, sectioned implementation guide. Includes structured workflows, validation checks, and reusable patterns for AI research.

Skill · Cliptics · ai research · v1.0.0 · MIT

Gepetto Reverse Engineering Assistant

Overview

A skill for AI-assisted reverse engineering with Gepetto, a plugin that integrates LLMs into IDA Pro and Ghidra to speed up binary analysis. Gepetto uses GPT-4, Claude, or local models to explain decompiled functions, rename variables, flag possible vulnerabilities, and provide contextual analysis of disassembled code.

When to Use

  • Analyzing decompiled C/C++ code from binaries
  • Getting AI-powered function explanations in IDA Pro or Ghidra
  • Identifying vulnerabilities in compiled binaries
  • Renaming obfuscated variables and functions
  • Understanding malware behavior
  • CTF challenges involving binary exploitation
  • Auditing closed-source software security

Quick Start

    # IDA Pro plugin
    git clone https://github.com/JusticeRage/Gepetto
    # Copy both the loader script and the gepetto/ package folder
    cp -r Gepetto/gepetto.py Gepetto/gepetto /path/to/ida/plugins/

    # Ghidra extension
    # Install via Ghidra Extension Manager

    # Configure API key
    # In IDA: Edit → Plugin Options → Gepetto
    # Set OPENAI_API_KEY or ANTHROPIC_API_KEY

    # Or use as a standalone library
    from gepetto import analyze_function

    result = analyze_function("""
    int __fastcall sub_140001000(__int64 a1, unsigned int a2) {
        char v3[256];
        memcpy(v3, (const void *)(a1 + 8), a2);
        if (a2 > 0x100) return -1;
        return process_buffer(v3, a2);
    }
    """)

    print(result.explanation)
    print(result.renamed_variables)
    print(result.vulnerabilities)

Core Features

Function Analysis

Right-click on a function in IDA → Gepetto → Explain Function

Output:
┌──────────────────────────────────────────────────────┐
│ Function: sub_140001000                              │
│                                                      │
│ Purpose: Copies user-provided data into a local      │
│ buffer and processes it.                             │
│                                                      │
│ Parameters:                                          │
│ - a1 (struct*): Pointer to data structure with       │
│   buffer at offset +8                                │
│ - a2 (size_t): Size of data to copy                  │
│                                                      │
│ Vulnerability: Buffer overflow: memcpy copies a2     │
│ bytes into v3[256] BEFORE checking if a2 > 0x100.    │
│ An attacker can overflow the stack buffer.           │
│                                                      │
│ Suggested names:                                     │
│ - sub_140001000 → copy_and_process_data              │
│ - a1 → input_struct                                  │
│ - a2 → data_size                                     │
│ - v3 → local_buffer                                  │
└──────────────────────────────────────────────────────┘
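
The ordering problem in the report can be reproduced outside the decompiler. The following is a minimal Python sketch, not Gepetto output: `BUF_SIZE`, `copy_then_check`, and `check_then_copy` are hypothetical names, and the `frame` bytearray is a simplified stand-in for the stack, with the 256-byte buffer followed by eight bytes that should never be overwritten.

```python
BUF_SIZE = 0x100  # models char v3[256] from the example function


def copy_then_check(frame: bytearray, src: bytes, size: int) -> int:
    """Mirrors the decompiled logic: the copy runs before the size check."""
    frame[:size] = src[:size]      # memcpy(v3, src, a2)
    if size > BUF_SIZE:            # too late, the frame is already modified
        return -1
    return 0


def check_then_copy(frame: bytearray, src: bytes, size: int) -> int:
    """Corrected ordering: reject oversized input before touching the buffer."""
    if size > BUF_SIZE:
        return -1
    frame[:size] = src[:size]
    return 0
```

Both functions return -1 for an oversized input, which is why the bug is easy to miss: only the first one has already overwritten the bytes that follow the buffer by the time it returns.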

Variable Renaming

    // Before Gepetto analysis
    int __fastcall sub_4012B0(int a1, int a2, int a3) {
        int v4 = *(DWORD *)(a1 + 16);
        void *v5 = malloc(v4);
        if (a3 & 1)
            decrypt_xor(v5, *(BYTE **)(a1 + 8), v4, a2);
        return send_data(v5, v4);
    }

    // After Gepetto analysis
    int __fastcall send_encrypted_payload(
        NetworkPacket *packet,
        int encryption_key,
        int flags
    ) {
        int payload_size = packet->data_length;
        void *payload_buffer = malloc(payload_size);
        if (flags & FLAG_ENCRYPTED)
            decrypt_xor(payload_buffer, packet->data, payload_size, encryption_key);
        return send_data(payload_buffer, payload_size);
    }
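
To preview a set of suggested names before applying them in the disassembler, the mapping can be applied to the decompiled text directly. This is a sketch under the assumption that suggestions arrive as a plain old-name to new-name dictionary; `apply_renames` is a hypothetical helper, not part of Gepetto.

```python
import re


def apply_renames(code: str, renames: dict[str, str]) -> str:
    """Replace whole identifiers only, so renaming `a1` leaves `a10` alone."""
    if not renames:
        return code
    # \b anchors each alternative to identifier boundaries.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], code)
```

Matching on word boundaries matters because decompiler-generated names such as `a1`, `a10`, and `v4` share prefixes, and a plain `str.replace` would corrupt the longer ones.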

Vulnerability Detection

    # Common patterns Gepetto identifies:
    vulnerability_patterns = {
        "buffer_overflow": "memcpy/strcpy without bounds checking",
        "format_string": "printf(user_input) without format specifier",
        "use_after_free": "Pointer used after free() called",
        "integer_overflow": "Arithmetic overflow in size calculation",
        "race_condition": "TOCTOU in file operations",
        "null_deref": "Pointer dereference without null check",
        "uninitialized": "Variable used before initialization",
        "double_free": "free() called twice on same pointer",
    }
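
A cheap textual pre-filter can help decide which functions are worth sending to the model first. The sketch below is not how Gepetto detects issues; it is a naive regex triage covering only two of the categories above, and `RISKY_CALLS` and `triage` are hypothetical names. A hit means "look here", not "this is vulnerable".

```python
import re

# Heuristics only: any call to these functions is flagged for review.
RISKY_CALLS = {
    "buffer_overflow": re.compile(r"\b(?:strcpy|strcat|sprintf|gets|memcpy)\s*\("),
    # printf called with a single bare identifier instead of a format literal
    "format_string": re.compile(r"\bprintf\s*\(\s*[A-Za-z_]\w*\s*\)"),
}


def triage(code: str) -> list[str]:
    """Return the sorted names of the categories whose pattern appears in code."""
    return sorted(name for name, pattern in RISKY_CALLS.items() if pattern.search(code))
```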

Configuration

IDA Pro Setup

    # gepetto_config.py
    GEPETTO_CONFIG = {
        "model": "gpt-4",              # or "claude-3-opus", "local"
        "api_key_env": "OPENAI_API_KEY",
        "max_tokens": 2000,
        "temperature": 0.1,            # Low temperature for accuracy
        "context_functions": 3,        # Include N related functions for context
        "auto_rename": True,           # Automatically apply renamed variables
        "highlight_vulns": True,       # Highlight vulnerabilities in IDA
        "language": "en",              # Output language
    }
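
A configuration like this is easy to get subtly wrong, for example by pointing `api_key_env` at a variable that is not set. The following sketch checks a few of the fields shown above before any analysis runs. `validate_config` is a hypothetical helper, and the 0.0 to 1.0 temperature range is a conservative assumption (some providers accept higher values).

```python
def validate_config(cfg: dict, env: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 0.0 <= cfg.get("temperature", 0.0) <= 1.0:
        problems.append("temperature should be between 0.0 and 1.0")
    if cfg.get("max_tokens", 1) <= 0:
        problems.append("max_tokens must be positive")
    if cfg.get("context_functions", 0) < 0:
        problems.append("context_functions cannot be negative")
    # Cloud models need a key; the "local" model does not.
    if cfg.get("model") != "local":
        key_name = cfg.get("api_key_env", "")
        if not env.get(key_name):
            problems.append(f"environment variable {key_name!r} is not set")
    return problems
```

In practice `env` would be `os.environ`; passing it as a parameter keeps the check easy to test.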

Local Model Setup

    # Use local model via Ollama
    GEPETTO_CONFIG = {
        "model": "local",
        "local_endpoint": "http://localhost:11434/api/generate",
        "local_model": "codellama:34b",
    }
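
To confirm the local endpoint works before involving the plugin, it can be queried directly. This sketch uses only the standard library and the Ollama `/api/generate` request shape (`model`, `prompt`, `stream`); `build_ollama_request` and `query_local_model` are hypothetical helper names, and the config keys are the ones from the block above.

```python
import json
import urllib.request


def build_ollama_request(cfg: dict, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": cfg["local_model"], "prompt": prompt, "stream": False}
    return urllib.request.Request(
        cfg["local_endpoint"],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_local_model(cfg: dict, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running server)."""
    request = build_ollama_request(cfg, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Setting `stream` to `False` makes the server return one JSON object instead of a sequence of chunks, which keeps the client code short.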

Analysis Workflow

Step  Action                  Tool
1     Load binary             IDA Pro / Ghidra
2     Run auto-analysis       Built-in
3     Identify key functions  Xrefs, strings, imports
4     Explain with Gepetto    Right-click → Explain
5     Rename variables        Right-click → Rename
6     Find vulnerabilities    Right-click → Find Vulns
7     Analyze call graph      Cross-reference analysis
8     Document findings       Export annotations
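
Step 3 can be partly automated once the list of imports called by each function has been exported from the disassembler. The sketch below assumes that export already exists as a dictionary of function name to called imports; `INTERESTING_IMPORTS` and `rank_functions` are hypothetical, and the import list is only an illustrative starting point.

```python
# Imports that often mark parsing, memory handling, or I/O code paths.
INTERESTING_IMPORTS = {
    "memcpy", "strcpy", "malloc", "free", "recv", "send", "system", "CreateFileA",
}


def rank_functions(calls: dict[str, set[str]]) -> list[str]:
    """Order functions by how many interesting imports they call, most first."""
    scores = {name: len(apis & INTERESTING_IMPORTS) for name, apis in calls.items()}
    hits = [name for name, score in scores.items() if score > 0]
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(hits, key=lambda name: (-scores[name], name))
```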

Best Practices

  1. Provide context — Include related functions when analyzing complex code
  2. Use low temperature — 0.1 for analysis accuracy; higher for creative renaming
  3. Verify AI suggestions — Always validate vulnerability claims manually
  4. Start from main/entry — Work outward from entry points for better context
  5. Use struct reconstruction — Let AI suggest struct layouts from field access patterns
  6. Batch analyze — Process related functions together for better naming consistency
  7. Save annotations — Export AI-generated comments to IDB/project files
  8. Combine with dynamic analysis — Use debugger alongside AI static analysis
  9. Use local models for sensitive code — Don't send proprietary code to cloud APIs
  10. Iterate analysis — Re-analyze after renaming for improved subsequent explanations
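
Practice 5 can be bootstrapped with a simple pass over the decompiled text before asking the model for field names. The sketch below collects the offsets and access types used on one base pointer; `ACCESS` and `infer_fields` are hypothetical names, and the regex only covers the `*(TYPE *)(base + offset)` form that appears in the renaming example.

```python
import re

# Matches dereferences such as *(DWORD *)(a1 + 16) or *(BYTE **)(a1 + 0x8).
ACCESS = re.compile(
    r"\*\(\s*(\w+)\s*(\*+)\s*\)\s*\(\s*(\w+)\s*\+\s*(0[xX][0-9A-Fa-f]+|\d+)\s*\)"
)


def infer_fields(code: str, base: str) -> dict[int, str]:
    """Map each offset accessed through `base` to the type read at that offset."""
    fields: dict[int, str] = {}
    for ctype, stars, var, offset in ACCESS.findall(code):
        if var != base:
            continue
        value = int(offset, 16) if offset.lower().startswith("0x") else int(offset)
        # The cast has one more '*' than the field itself: *(BYTE **) reads a BYTE *.
        pointer = "*" * (len(stars) - 1)
        fields[value] = f"{ctype} {pointer}".strip()
    return dict(sorted(fields.items()))
```

Applied to the `sub_4012B0` example, this yields a `BYTE *` at offset 8 and a `DWORD` at offset 16, which lines up with the `data` and `data_length` fields in the renamed version.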

Troubleshooting

AI gives incorrect analysis

    # Provide more context: include caller/callee functions
    # Reduce temperature for more deterministic output
    # Specify the binary architecture and OS, for example:
    "This is an x86-64 Windows PE binary compiled with MSVC"
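
These three remedies can be combined into a single prompt when querying a model by hand. The following is a sketch of one possible layout; `build_prompt` and the section labels are hypothetical and do not reflect the prompt Gepetto itself sends.

```python
def build_prompt(target: str, related: list[str], platform: str) -> str:
    """Assemble platform details, the target function, and optional context."""
    sections = [f"Platform: {platform}", "Function to explain:", target.strip()]
    if related:
        sections.append("Related functions (callers/callees) for context:")
        sections.extend(function.strip() for function in related)
    return "\n\n".join(sections)
```

Putting the platform line first gives the model the calling convention and compiler idioms before it reads any code.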

Rate limiting on API

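    # Add a delay between function analyses
    import time
    from gepetto import analyze_function

    # functions_to_analyze: the decompiled functions selected for analysis
    for func in functions_to_analyze:
        result = analyze_function(func)
        time.sleep(1)  # Rate limit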
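
A fixed delay slows every request, even when the API is not throttling. An alternative is to retry with exponential backoff only after a rate-limit error. In this sketch `RateLimited` is a hypothetical stand-in for whatever exception the API client raises on HTTP 429, and `with_backoff` is a hypothetical helper.

```python
import time


class RateLimited(Exception):
    """Placeholder for the client's rate-limit error (an assumption)."""


def with_backoff(call, retries: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Run call(); on RateLimited wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting can be replaced in tests; normal callers can leave it at the default.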

Decompiler output too complex

    # Simplify by analyzing smaller functions first
    # Break complex functions at natural boundaries
    # Use Gepetto on individual basic blocks
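
One way to break a long function at natural boundaries is to split its text at blank lines and pack the pieces into chunks that stay under a line budget. This is a rough sketch: `split_at_blank_lines` is a hypothetical helper, blank lines are only an approximation of logical boundaries, and each chunk may still need the variable declarations prepended to be understandable on its own.

```python
def split_at_blank_lines(code: str, max_lines: int = 40) -> list[str]:
    """Group blank-line-separated blocks into chunks of at most max_lines lines.

    A single block longer than max_lines is kept whole rather than cut mid-block.
    """
    blocks = [block for block in code.split("\n\n") if block.strip()]
    chunks, current, count = [], [], 0
    for block in blocks:
        size = block.count("\n") + 1
        if current and count + size > max_lines:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(block)
        count += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```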