Prompt Engineering Complete
Advanced prompt engineering techniques for maximizing LLM performance, reliability, and controllability — covering few-shot learning, chain-of-thought, structured outputs, and systematic evaluation.
When to Use
Apply advanced techniques when:
- Standard prompts produce inconsistent or low-quality outputs
- Tasks require complex reasoning (math, code analysis, multi-step logic)
- You need reliable structured output (JSON, XML, tables)
- Building production systems that need predictable behavior
Keep prompts simple when:
- The task is basic Q&A or conversation
- The model already performs well with minimal instruction
- You are prototyping before optimizing
Quick Start
Few-Shot Learning
Classify the sentiment of each review.
Review: "The battery life is incredible, lasts all day."
Sentiment: Positive
Aspect: Battery
Review: "Screen is beautiful but the camera is disappointing."
Sentiment: Mixed
Aspect: Screen (positive), Camera (negative)
Review: "Overpriced for what you get. Build quality feels cheap."
Sentiment: Negative
Aspect: Price, Build Quality
Review: "{user_input}"
Sentiment:
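Few-shot prompts like the one above are easiest to maintain when the examples live as data. A minimal Python sketch, assuming a trimmed example set and a `build_prompt` helper of our own (neither is part of any particular API):

```python
# Hypothetical helper that assembles the few-shot sentiment prompt above.
EXAMPLES = [
    ("The battery life is incredible, lasts all day.", "Positive", "Battery"),
    ("Overpriced for what you get. Build quality feels cheap.",
     "Negative", "Price, Build Quality"),
]

def build_prompt(user_input: str) -> str:
    parts = ["Classify the sentiment of each review.\n"]
    for review, sentiment, aspect in EXAMPLES:
        parts.append(f'Review: "{review}"\nSentiment: {sentiment}\nAspect: {aspect}\n')
    parts.append(f'Review: "{user_input}"\nSentiment:')
    return "\n".join(parts)
```

Keeping examples as a list makes it straightforward to swap, extend, or randomize their order later.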
Chain-of-Thought Prompting
Solve this step by step. Show your reasoning before giving the final answer.
Problem: A store has 45 apples. They sell 12 in the morning and receive a shipment of 30 in the afternoon. A customer then buys 8. How many apples remain?
Reasoning:
1. Start: 45 apples
2. Morning sales: 45 - 12 = 33
3. Afternoon shipment: 33 + 30 = 63
4. Customer purchase: 63 - 8 = 55
Answer: 55 apples
Problem: "{user_problem}"
Reasoning:
Structured Output with XML Tags
Analyze the code and provide your review in the following format:
<review>
<summary>One-sentence overview of the code quality</summary>
<issues>
<issue severity="critical|major|minor">
<description>What the issue is</description>
<line>Line number</line>
<fix>Suggested fix</fix>
</issue>
</issues>
<score>1-10 quality score</score>
</review>
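A production caller should validate that the model actually produced this structure. A sketch using the standard library (the dict shape and the `parse_review` name are our own, not part of the template):

```python
import xml.etree.ElementTree as ET

def parse_review(xml_text: str) -> dict:
    """Parse the <review> structure above; raises if the XML is malformed."""
    root = ET.fromstring(xml_text)  # raises ParseError on invalid XML
    if root.tag != "review":
        raise ValueError("expected a <review> root element")
    issues = [
        {
            "severity": issue.get("severity"),
            "description": issue.findtext("description"),
            "line": issue.findtext("line"),
            "fix": issue.findtext("fix"),
        }
        for issue in root.iter("issue")
    ]
    return {
        "summary": root.findtext("summary"),
        "issues": issues,
        "score": int(root.findtext("score")),
    }
```

If parsing raises, re-prompt with the error message included (see Common Issues below).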
Core Concepts
Technique Selection Guide
| Technique | Best For | Accuracy Gain | Token Cost |
|---|---|---|---|
| Zero-shot | Simple tasks | Baseline | Low |
| Few-shot (2-5 examples) | Format consistency | +15-25% | Medium |
| Chain-of-thought | Reasoning tasks | +20-40% | High |
| Self-consistency | Critical decisions | +10-15% | Very high |
| Tree-of-thought | Complex planning | +25-50% | Very high |
Prompt Structure Hierarchy
System Prompt (persistent context)
├── Role definition
├── Domain knowledge
├── Behavioral constraints
└── Output format specification
User Prompt (per-request)
├── Task instruction
├── Input data (with delimiters)
├── Few-shot examples (if needed)
└── Specific output requirements
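In chat-style APIs this hierarchy maps onto a system message plus a user message. An illustrative sketch, assuming our own role text, delimiter tags, and `build_messages` name:

```python
def build_messages(task: str, input_data: str, output_spec: str) -> list[dict]:
    system = (
        "You are a meticulous technical analyst.\n"      # role definition
        "Decline requests outside your domain.\n"        # behavioral constraint
        f"Always respond in this format: {output_spec}"  # output format spec
    )
    user = (
        f"{task}\n\n"
        "<input>\n"      # delimiters keep data separate from instructions
        f"{input_data}\n"
        "</input>"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```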
Advanced Techniques
Self-Consistency (majority voting):
from collections import Counter

responses = [llm(prompt, temperature=0.7) for _ in range(5)]

# Extract answers and take the most common one
answers = [extract_answer(r) for r in responses]
final_answer = Counter(answers).most_common(1)[0][0]
Recursive Decomposition:
Step 1: Break the problem into sub-problems
Step 2: Solve each sub-problem independently
Step 3: Combine sub-solutions into the final answer
For each sub-problem, if it's still complex, repeat steps 1-3.
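The steps above can be sketched as a recursive function with the model calls injected; `solve`, `is_complex`, `decompose`, and `combine` are placeholders you would implement (typically as LLM calls), and the demo merely sums a list:

```python
def solve_recursively(problem, solve, is_complex, decompose, combine,
                      depth=0, max_depth=3):
    if depth >= max_depth or not is_complex(problem):
        return solve(problem)                      # base case: solve directly
    subs = decompose(problem)                      # step 1: break it down
    partials = [solve_recursively(p, solve, is_complex, decompose, combine,
                                  depth + 1, max_depth)
                for p in subs]                     # step 2: solve each
    return combine(partials)                       # step 3: merge results

# Toy demonstration: summing a long list by recursive halving.
result = solve_recursively(
    list(range(10)),
    solve=sum,
    is_complex=lambda xs: len(xs) > 2,
    decompose=lambda xs: [xs[:len(xs) // 2], xs[len(xs) // 2:]],
    combine=sum,
)  # → 45
```

The `max_depth` cap is the practical safeguard: without it, a model that keeps declaring sub-problems "complex" recurses forever.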
Configuration
| Technique | Temperature | Top-P | Max Tokens | Notes |
|---|---|---|---|---|
| Factual Q&A | 0.0 | 1.0 | 256 | Deterministic |
| Creative writing | 0.8-1.0 | 0.95 | 2048 | High variability |
| Code generation | 0.0-0.2 | 1.0 | 1024 | Low variability |
| Analysis/reasoning | 0.0-0.3 | 1.0 | 512 | Slightly creative |
| Self-consistency | 0.7 | 0.95 | 512 | Needs variability |
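Those rows translate directly into code. An illustrative preset table (the dict layout and key names are assumptions; ranges from the table are collapsed to single mid-range values):

```python
# Sampling presets mirroring the configuration table; adjust to taste.
SAMPLING_PRESETS = {
    "factual_qa":       {"temperature": 0.0, "top_p": 1.0,  "max_tokens": 256},
    "creative_writing": {"temperature": 0.9, "top_p": 0.95, "max_tokens": 2048},
    "code_generation":  {"temperature": 0.1, "top_p": 1.0,  "max_tokens": 1024},
    "reasoning":        {"temperature": 0.2, "top_p": 1.0,  "max_tokens": 512},
    "self_consistency": {"temperature": 0.7, "top_p": 0.95, "max_tokens": 512},
}
```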
Best Practices
- Match technique to task complexity — don't use chain-of-thought for simple classifications
- Place instructions at the end for tasks with long context — models attend more to recent tokens
- Use delimiters consistently — XML tags, triple backticks, or section headers to separate input from instructions
- Test with diverse inputs — edge cases, adversarial examples, and out-of-distribution queries
- Measure before and after — quantify accuracy gains from each technique to justify token cost
- Combine techniques judiciously — few-shot + chain-of-thought often outperforms either alone
Common Issues
Chain-of-thought produces wrong reasoning: Provide an example of correct reasoning in the prompt. Use "Let's verify: ..." at the end to self-check. For math, specify "Show your arithmetic" to make errors visible.
Few-shot examples bias the model: Ensure examples cover diverse cases, not just one pattern. Include edge cases and negative examples. Randomize example order to avoid position bias.
Structured output has formatting errors: Provide a complete example output. Add "IMPORTANT: Your response must be valid JSON/XML" as a constraint. Use output parsing with retry logic in production code.
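The retry pattern mentioned above, as a hedged sketch for JSON output (the `call_model` argument stands in for your LLM client; `get_json` is a name introduced here):

```python
import json

def get_json(call_model, prompt: str, max_retries: int = 3) -> dict:
    """Call the model until it returns valid JSON, feeding back parse errors."""
    last_error = None
    for attempt in range(max_retries):
        if attempt == 0:
            request = prompt
        else:
            # Re-prompt with the parse error so the model can self-correct.
            request = (f"{prompt}\n\nYour previous reply was not valid JSON "
                       f"({last_error}). Return ONLY valid JSON, no prose.")
        raw = call_model(request)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err
    raise ValueError(f"no valid JSON after {max_retries} attempts: {last_error}")
```

The same shape works for the XML case: swap `json.loads` for a validating parser and surface its exception text in the retry prompt.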