Prompt Engineering Complete
Advanced prompt engineering techniques for maximizing LLM performance, reliability, and controllability — covering few-shot learning, chain-of-thought, structured outputs, and systematic evaluation.
When to Use
Apply advanced techniques when:
- Standard prompts produce inconsistent or low-quality outputs
- Tasks require complex reasoning (math, code analysis, multi-step logic)
- You need reliable structured output (JSON, XML, tables)
- Building production systems that need predictable behavior
Keep prompts simple when:
- The task is basic Q&A or conversation
- The model already performs well with minimal instruction
- You are prototyping before optimizing
Quick Start
Few-Shot Learning
Classify the sentiment of each review.
Review: "The battery life is incredible, lasts all day."
Sentiment: Positive
Aspect: Battery
Review: "Screen is beautiful but the camera is disappointing."
Sentiment: Mixed
Aspect: Screen (positive), Camera (negative)
Review: "Overpriced for what you get. Build quality feels cheap."
Sentiment: Negative
Aspect: Price, Build Quality
Review: "{user_input}"
Sentiment:
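Few-shot prompts like the one above are easiest to maintain when the examples live as data. A minimal Python sketch, assuming a trimmed example set and a `build_prompt` helper of our own (neither is part of any particular API):

```python
# Hypothetical helper that assembles the few-shot sentiment prompt above.
EXAMPLES = [
    ("The battery life is incredible, lasts all day.", "Positive", "Battery"),
    ("Overpriced for what you get. Build quality feels cheap.",
     "Negative", "Price, Build Quality"),
]

def build_prompt(user_input: str) -> str:
    parts = ["Classify the sentiment of each review.\n"]
    for review, sentiment, aspect in EXAMPLES:
        parts.append(f'Review: "{review}"\nSentiment: {sentiment}\nAspect: {aspect}\n')
    parts.append(f'Review: "{user_input}"\nSentiment:')
    return "\n".join(parts)
```

Keeping examples as a list makes it straightforward to swap, extend, or randomize their order later.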
Chain-of-Thought Prompting
Solve this step by step. Show your reasoning before giving the final answer.
Problem: A store has 45 apples. They sell 12 in the morning and receive a shipment of 30 in the afternoon. A customer then buys 8. How many apples remain?
Reasoning:
1. Start: 45 apples
2. Morning sales: 45 - 12 = 33
3. Afternoon shipment: 33 + 30 = 63
4. Customer purchase: 63 - 8 = 55
Answer: 55 apples
Problem: "{user_problem}"
Reasoning:
Structured Output with XML Tags
Analyze the code and provide your review in the following format:
<review>
<summary>One-sentence overview of the code quality</summary>
<issues>
<issue severity="critical|major|minor">
<description>What the issue is</description>
<line>Line number</line>
<fix>Suggested fix</fix>
</issue>
</issues>
<score>1-10 quality score</score>
</review>
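A production caller should validate that the model actually produced this structure. A sketch using the standard library (the dict shape and the `parse_review` name are our own, not part of the template):

```python
import xml.etree.ElementTree as ET

def parse_review(xml_text: str) -> dict:
    """Parse the <review> structure above; raises if the XML is malformed."""
    root = ET.fromstring(xml_text)  # raises ParseError on invalid XML
    if root.tag != "review":
        raise ValueError("expected a <review> root element")
    issues = [
        {
            "severity": issue.get("severity"),
            "description": issue.findtext("description"),
            "line": issue.findtext("line"),
            "fix": issue.findtext("fix"),
        }
        for issue in root.iter("issue")
    ]
    return {
        "summary": root.findtext("summary"),
        "issues": issues,
        "score": int(root.findtext("score")),
    }
```

If parsing raises, re-prompt with the error message included (see Common Issues below).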
Core Concepts
Technique Selection Guide
| Technique | Best For | Accuracy Gain | Token Cost |
|---|---|---|---|
| Zero-shot | Simple tasks | Baseline | Low |
| Few-shot (2-5 examples) | Format consistency | +15-25% | Medium |
| Chain-of-thought | Reasoning tasks | +20-40% | High |
| Self-consistency | Critical decisions | +10-15% | Very high |
| Tree-of-thought | Complex planning | +25-50% | Very high |
Prompt Structure Hierarchy
System Prompt (persistent context)
├── Role definition
├── Domain knowledge
├── Behavioral constraints
└── Output format specification
User Prompt (per-request)
├── Task instruction
├── Input data (with delimiters)
├── Few-shot examples (if needed)
└── Specific output requirements
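In chat-style APIs this hierarchy maps onto a system message plus a user message. An illustrative sketch, assuming our own role text, delimiter tags, and `build_messages` name:

```python
def build_messages(task: str, input_data: str, output_spec: str) -> list[dict]:
    system = (
        "You are a meticulous technical analyst.\n"      # role definition
        "Decline requests outside your domain.\n"        # behavioral constraint
        f"Always respond in this format: {output_spec}"  # output format spec
    )
    user = (
        f"{task}\n\n"
        "<input>\n"      # delimiters keep data separate from instructions
        f"{input_data}\n"
        "</input>"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```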
Advanced Techniques
Self-Consistency (majority voting):
from collections import Counter

responses = [llm(prompt, temperature=0.7) for _ in range(5)]

# Extract answers and take the most common one
answers = [extract_answer(r) for r in responses]
final_answer = Counter(answers).most_common(1)[0][0]
Recursive Decomposition:
Step 1: Break the problem into sub-problems
Step 2: Solve each sub-problem independently
Step 3: Combine sub-solutions into the final answer
For each sub-problem, if it's still complex, repeat steps 1-3.
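The steps above can be sketched as a recursive function with the model calls injected; `solve`, `is_complex`, `decompose`, and `combine` are placeholders you would implement (typically as LLM calls), and the demo merely sums a list:

```python
def solve_recursively(problem, solve, is_complex, decompose, combine,
                      depth=0, max_depth=3):
    if depth >= max_depth or not is_complex(problem):
        return solve(problem)                      # base case: solve directly
    subs = decompose(problem)                      # step 1: break it down
    partials = [solve_recursively(p, solve, is_complex, decompose, combine,
                                  depth + 1, max_depth)
                for p in subs]                     # step 2: solve each
    return combine(partials)                       # step 3: merge results

# Toy demonstration: summing a long list by recursive halving.
result = solve_recursively(
    list(range(10)),
    solve=sum,
    is_complex=lambda xs: len(xs) > 2,
    decompose=lambda xs: [xs[:len(xs) // 2], xs[len(xs) // 2:]],
    combine=sum,
)  # → 45
```

The `max_depth` cap is the practical safeguard: without it, a model that keeps declaring sub-problems "complex" recurses forever.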
Configuration
| Technique | Temperature | Top-P | Max Tokens | Notes |
|---|---|---|---|---|
| Factual Q&A | 0.0 | 1.0 | 256 | Deterministic |
| Creative writing | 0.8-1.0 | 0.95 | 2048 | High variability |
| Code generation | 0.0-0.2 | 1.0 | 1024 | Low variability |
| Analysis/reasoning | 0.0-0.3 | 1.0 | 512 | Slightly creative |
| Self-consistency | 0.7 | 0.95 | 512 | Needs variability |
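Those rows translate directly into code. An illustrative preset table (the dict layout and key names are assumptions; ranges from the table are collapsed to single mid-range values):

```python
# Sampling presets mirroring the configuration table; adjust to taste.
SAMPLING_PRESETS = {
    "factual_qa":       {"temperature": 0.0, "top_p": 1.0,  "max_tokens": 256},
    "creative_writing": {"temperature": 0.9, "top_p": 0.95, "max_tokens": 2048},
    "code_generation":  {"temperature": 0.1, "top_p": 1.0,  "max_tokens": 1024},
    "reasoning":        {"temperature": 0.2, "top_p": 1.0,  "max_tokens": 512},
    "self_consistency": {"temperature": 0.7, "top_p": 0.95, "max_tokens": 512},
}
```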
Best Practices
- Match technique to task complexity — don't use chain-of-thought for simple classifications
- Place instructions at the end for tasks with long context — models attend more to recent tokens
- Use delimiters consistently — XML tags, triple backticks, or section headers to separate input from instructions
- Test with diverse inputs — edge cases, adversarial examples, and out-of-distribution queries
- Measure before and after — quantify accuracy gains from each technique to justify token cost
- Combine techniques judiciously — few-shot + chain-of-thought often outperforms either alone
Common Issues
Chain-of-thought produces wrong reasoning: Provide an example of correct reasoning in the prompt. Use "Let's verify: ..." at the end to self-check. For math, specify "Show your arithmetic" to make errors visible.
Few-shot examples bias the model: Ensure examples cover diverse cases, not just one pattern. Include edge cases and negative examples. Randomize example order to avoid position bias.
Structured output has formatting errors: Provide a complete example output. Add "IMPORTANT: Your response must be valid JSON/XML" as a constraint. Use output parsing with retry logic in production code.
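The retry pattern mentioned above, as a hedged sketch for JSON output (the `call_model` argument stands in for your LLM client; `get_json` is a name introduced here):

```python
import json

def get_json(call_model, prompt: str, max_retries: int = 3) -> dict:
    """Call the model until it returns valid JSON, feeding back parse errors."""
    last_error = None
    for attempt in range(max_retries):
        if attempt == 0:
            request = prompt
        else:
            # Re-prompt with the parse error so the model can self-correct.
            request = (f"{prompt}\n\nYour previous reply was not valid JSON "
                       f"({last_error}). Return ONLY valid JSON, no prose.")
        raw = call_model(request)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err
    raise ValueError(f"no valid JSON after {max_retries} attempts: {last_error}")
```

The same shape works for the XML case: swap `json.loads` for a validating parser and surface its exception text in the retry prompt.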