Markdown Syntax Strategist

Intelligent markdown formatting agent that transforms raw OCR output and unstructured text into clean, specification-compliant markdown with proper heading hierarchy, list syntax, and code blocks.

When to Use This Agent

Choose this agent when you need to:

Convert OCR-extracted text with visual formatting cues into proper CommonMark syntax
Fix heading hierarchy violations where levels are skipped or inconsistently applied
Normalize list markers, indentation, and nested structure across large documents
Add language identifiers to code blocks and properly format inline code references

Consider alternatives when:

Your text has OCR character-level errors like "rn" misread as "m" (use Specialist OCR Grammar Fixer first)
You need to analyze document layout before text extraction (use Document Structure Analyzer Companion)

Quick Start

Configuration


name: markdown-syntax-strategist
type: agent
category: ocr-extraction-team

Example Invocation


claude agent:invoke markdown-syntax-strategist "Format the OCR output from technical-manual.txt into clean markdown"

Example Output

=== Markdown Formatting Report ===
Input: technical-manual.txt (4,200 words)

TRANSFORMATIONS APPLIED:
  Headings fixed:       14 (ALL CAPS → proper # syntax)
  Lists normalized:     8 blocks (mixed •/*/- → consistent -)
  Code blocks added:    6 (with language identifiers: python, bash, yaml)
  Inline code wrapped:  23 technical terms
  Emphasis corrected:   11 instances
  Line spacing fixed:   34 locations

HEADING HIERARCHY:
  # Installation Guide
  ## Prerequisites
  ### System Requirements
  ## Configuration
  ### Database Setup
  ### Environment Variables

VALIDATION: All syntax passes CommonMark spec check

Core Concepts

Markdown Element Priorities Overview

Aspect	Details
Heading Hierarchy	Strict H1-H6 progression with no level skipping allowed
List Consistency	Uniform markers (- for unordered, 1. for ordered) with 2-space nesting
Code Formatting	Triple backticks with language hints for blocks, single backticks inline
Emphasis Rules	Double asterisks for bold, single for italic, never underscores
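
The heading hierarchy rule in the table above can be illustrated with a small check. This is a minimal sketch, not the agent's actual implementation: the function name `find_skipped_levels` is hypothetical, and it assumes ATX-style (`#`) headings and backtick fences only.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then some text.
HEADING_RE = re.compile(r"^(#{1,6})\s+\S")


def find_skipped_levels(markdown_text):
    """Return (line_number, previous_level, level) for each heading that skips a level."""
    problems = []
    previous_level = 0  # 0 means "no heading seen yet"
    in_fence = False
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        # Ignore '#' lines inside fenced code blocks (e.g. shell comments).
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        # Going deeper by more than one level is a skip; going shallower is fine.
        if level > previous_level + 1:
            problems.append((line_number, previous_level, level))
        previous_level = level
    return problems


sample = "# Guide\n\n### Requirements\n\n## Configuration\n"
print(find_skipped_levels(sample))  # [(3, 1, 3)]
```

Under this sketch, a document whose first heading is deeper than H1 is also reported as a skip, since the previous level starts at 0.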

Formatting Pipeline Architecture

┌─────────────┐     ┌─────────────┐
│  Raw Text    │────▶│  Heading    │
│  Ingestion   │     │  Normalizer │
└─────────────┘     └─────────────┘
        │                   │
        ▼                   ▼
┌─────────────┐     ┌─────────────┐
│  List & Code │────▶│  Validation │
│  Formatter   │     │  & Output   │
└─────────────┘     └─────────────┘
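
One way to read the diagram is as a chain of text-to-text stages followed by a validation step. The sketch below assumes that reading; the stage functions (`ingest`, `normalize_headings`, `format_lists`, `validate`) are hypothetical names, and each does only a token amount of work to show the shape of the pipeline. It ignores fenced code and thematic breaks such as `* * *`.

```python
import re


def ingest(text):
    """Normalize line endings and strip trailing whitespace from every line."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines)


def normalize_headings(text):
    """Collapse the whitespace after a heading's '#' run to a single space."""
    return re.sub(r"^(#{1,6})[ \t]+", r"\1 ", text, flags=re.MULTILINE)


def format_lists(text):
    """Rewrite unordered markers (•, *, +) to '-' while keeping indentation."""
    return re.sub(r"^([ \t]*)[•*+][ \t]+", r"\1- ", text, flags=re.MULTILINE)


def validate(text):
    """Return 1-based line numbers that still use a non-'-' unordered marker."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if re.match(r"^[ \t]*[•*+][ \t]+", line)
    ]


def run_pipeline(text, stages):
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        text = stage(text)
    return text


raw = "##   Setup  \r\n• first\r\n* second\n"
formatted = run_pipeline(raw, [ingest, normalize_headings, format_lists])
print(formatted)
print(validate(formatted))
```

Keeping each stage as a plain function makes it easy to reorder stages or run a single one in isolation when debugging a formatting decision.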

Configuration

Parameter	Type	Default	Description
spec_compliance	string	"commonmark"	Target markdown specification: commonmark or gfm (GitHub Flavored)
list_marker	string	"-"	Default unordered list marker character to normalize to
nest_indent	integer	2	Number of spaces per indentation level for nested list items
auto_language_detect	boolean	true	Attempt to detect programming language for unlabeled code blocks
preserve_html	boolean	false	Whether to keep inline HTML tags or convert them to markdown equivalents
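
The parameters and defaults in the table above can be modeled as a small validated structure. This is only a sketch: the class name `FormatterConfig` is hypothetical, and how the agent actually reads these settings is not documented here. The allowed `list_marker` values are an assumption based on the three bullet markers CommonMark accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatterConfig:
    """Mirror of the documented parameters, with their documented defaults."""

    spec_compliance: str = "commonmark"
    list_marker: str = "-"
    nest_indent: int = 2
    auto_language_detect: bool = True
    preserve_html: bool = False

    def __post_init__(self):
        # The table lists exactly two target specifications.
        if self.spec_compliance not in ("commonmark", "gfm"):
            raise ValueError(f"unknown spec_compliance: {self.spec_compliance!r}")
        # Assumption: restrict to the bullet markers CommonMark recognizes.
        if self.list_marker not in ("-", "*", "+"):
            raise ValueError(f"unsupported list_marker: {self.list_marker!r}")
        if self.nest_indent < 1:
            raise ValueError("nest_indent must be at least 1")


config = FormatterConfig(spec_compliance="gfm")
print(config)
```

Validating at construction time means a typo such as `spec_compliance="github"` fails immediately instead of silently falling back to a default.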

Best Practices

Process Grammar Fixes Before Formatting: OCR text often contains character-level errors that affect formatting decisions. Running the grammar fixer first ensures the markdown strategist works with clean text, preventing misidentification of headings or list items.
Use GFM Mode for Technical Documentation: GitHub Flavored Markdown supports tables, task lists, and strikethrough syntax that CommonMark does not. Set spec_compliance to "gfm" when processing technical documents that likely contain these elements.
Validate Output Against a Markdown Linter: After formatting, run the output through a markdown linter like markdownlint to catch edge cases the agent may miss, such as trailing spaces, missing blank lines around headings, or inconsistent emphasis markers.
Preserve Intentional Formatting Exceptions: Some documents use non-standard formatting deliberately, such as ALL CAPS for legal disclaimers. Add a marker comment above sections that should not be transformed.
Handle Nested Structures Incrementally: Deeply nested lists within lists within blockquotes are error-prone to format in a single pass. Process the outermost structure first, then refine inner nesting in subsequent passes for more reliable results.
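
As a rough illustration of the edge cases mentioned under linter validation, the sketch below flags trailing whitespace and headings that lack a surrounding blank line. It is not a substitute for markdownlint and does not reproduce its rules; the function name `quick_lint` is hypothetical, and it flags all trailing whitespace, including the two-space hard line break that Markdown permits.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then some text.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+\S")


def quick_lint(markdown_text):
    """Return (line_number, message) pairs for a few easy-to-miss issues."""
    issues = []
    lines = markdown_text.split("\n")
    in_fence = False
    for i, line in enumerate(lines):
        number = i + 1
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
        # Track backtick fences so '#' comments in code are not treated as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence or not HEADING_RE.match(line):
            continue
        if i > 0 and lines[i - 1].strip():
            issues.append((number, "missing blank line before heading"))
        if i + 1 < len(lines) and lines[i + 1].strip():
            issues.append((number, "missing blank line after heading"))
    return issues


sample = "# Title\nIntro text \n\n## Next\n"
for issue in quick_lint(sample):
    print(issue)
```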

Common Issues

ALL CAPS text incorrectly converted to headings: Not all uppercase text represents headings. The agent uses heuristics like line position, surrounding whitespace, and text length to distinguish headings from emphasized text, but short ALL CAPS phrases in body text may be misidentified. Add those lines to an exclusion list.
Code blocks missing language identifiers after formatting: Auto-detection relies on syntax patterns and keywords. If a code block uses an uncommon language or is too short for reliable detection, manually specify the language by adding a hint comment above the block.
Ordered list numbering resets unexpectedly: When a paragraph or other block element interrupts an ordered list, CommonMark treats the subsequent items as a new list. Keep unindented paragraphs out from between consecutive ordered list items; if content must appear between them, indent it to align with the item's text so that it stays part of the preceding list item.
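
The ALL CAPS heuristics described above (surrounding whitespace, text length, and an exclusion list) might look roughly like the following. This is a sketch under stated assumptions, not the agent's actual logic: the function name `looks_like_heading`, the `max_words` threshold of 8, and the exact-match exclusion list are all hypothetical.

```python
def looks_like_heading(lines, index, exclusions=(), max_words=8):
    """Guess whether the ALL CAPS line at `index` is a heading rather than body text."""
    line = lines[index].strip()
    if not line or line in exclusions:
        return False
    letters = [ch for ch in line if ch.isalpha()]
    # Must contain letters, and every letter must be uppercase.
    if not letters or not all(ch.isupper() for ch in letters):
        return False
    # Long uppercase runs are more likely disclaimers than headings.
    if len(line.split()) > max_words:
        return False
    # Headings usually stand alone: blank (or missing) lines on both sides.
    blank_before = index == 0 or not lines[index - 1].strip()
    blank_after = index == len(lines) - 1 or not lines[index + 1].strip()
    return blank_before and blank_after


document = [
    "INSTALLATION GUIDE",
    "",
    "Run the installer.",
    "WARNING",
    "Do not unplug the device.",
    "",
    "NO WARRANTY",
    "",
]
for i, text in enumerate(document):
    if looks_like_heading(document, i, exclusions={"NO WARRANTY"}):
        print(f"heading candidate on line {i + 1}: {text}")
```

Here `WARNING` is rejected because it sits directly between body lines, and `NO WARRANTY` is rejected only because it appears in the exclusion list.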
