
OCR Quality Assurance Agent

Enterprise-grade validation specialist for the OCR extraction pipeline. Includes structured workflows, validation checks, and reusable patterns for the ocr-extraction-team.

Agent · Cliptics · ocr-extraction-team · v1.0.0 · MIT


Final-gate validator that cross-references corrected OCR text against original source images to ensure absolute fidelity before production release.

When to Use This Agent

Choose this agent when you need to:

  • Validate the output of a multi-stage OCR correction pipeline before committing text to production
  • Cross-reference corrected documents against source images for pixel-level accuracy
  • Audit markdown rendering integrity after automated formatting passes
  • Generate structured validation reports with approval status and flagged uncertainties

Consider alternatives when:

  • You need the initial OCR extraction itself (use the Visual Analysis Consultant instead)
  • Your workflow only involves plain text without any markdown or structured formatting requirements

Quick Start

Configuration

name: ocr-quality-assurance-agent
type: agent
category: ocr-extraction-team

Example Invocation

claude agent:invoke ocr-quality-assurance-agent "Validate corrected OCR output for invoice-batch-2026Q1 against source scans"

Example Output

Validation Report - invoice-batch-2026Q1
Overall Status: APPROVED WITH NOTES
Content Integrity: 98.7% - 2 minor omissions flagged
Correction Accuracy: All 14 corrections verified against source
Markdown Validation: Tables render correctly; 1 escaped pipe fixed
Flagged Issues:
  [REVIEW NEEDED] Page 3, line 42: Ambiguous character - could be "l" or "1"
  [REVIEW NEEDED] Page 7, footer: Faded watermark text partially legible
Recommendations: Resolve 2 flagged items before final approval

Core Concepts

OCR Validation Pipeline Overview

| Aspect | Details |
| --- | --- |
| Pipeline Position | Stage 5 of 5: final gatekeeper after extraction, comparison, grammar, and formatting |
| Input Artifacts | Original source image, raw OCR text, corrected text with changelog |
| Validation Scope | Character-level accuracy, structural fidelity, markdown syntax, content completeness |
| Output Artifact | Structured validation report with APPROVED, APPROVED WITH NOTES, or REQUIRES HUMAN REVIEW status |
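The output artifact can be modeled as a small data structure. A minimal sketch, assuming the escalation rules described in the Configuration section; the `ValidationReport` and `Flag` names are illustrative, not part of the agent's published interface:

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    APPROVED = "APPROVED"
    APPROVED_WITH_NOTES = "APPROVED WITH NOTES"
    REQUIRES_HUMAN_REVIEW = "REQUIRES HUMAN REVIEW"

@dataclass
class Flag:
    location: str     # e.g. "Page 3, line 42"
    description: str  # ambiguity or legibility issue for a human to resolve

@dataclass
class ValidationReport:
    batch_id: str
    content_integrity: float  # fraction of source content verified
    corrections_verified: int
    flags: list[Flag] = field(default_factory=list)

    def status(self) -> Status:
        # Mirrors the defaults: too many ambiguities force human review,
        # and under strict mode any single flag downgrades the verdict.
        if len(self.flags) > 5:
            return Status.REQUIRES_HUMAN_REVIEW
        if self.flags:
            return Status.APPROVED_WITH_NOTES
        return Status.APPROVED
```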

Quality Gate Architecture

┌─────────────────┐     ┌─────────────────┐
│  Source Image   │────▶│  Section-by-    │
│  (ground truth) │     │  Section Scan   │
└─────────────────┘     └─────────────────┘
        │                       │
        ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│  Correction Log │────▶│  Validation     │
│  (agent diffs)  │     │  Report Builder │
└─────────────────┘     └─────────────────┘
        │                       │
        ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│  Markdown Render│────▶│  Final Verdict  │
│  Verification   │     │  & Flag Report  │
└─────────────────┘     └─────────────────┘

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| confidence_threshold | float | 0.95 | Minimum character-level confidence to auto-approve without flagging |
| flag_marker | string | [REVIEW NEEDED] | Prefix used for items requiring human attention |
| strict_mode | boolean | true | When true, any single flagged item downgrades status to APPROVED WITH NOTES |
| markdown_validate | boolean | true | Enable markdown syntax and rendering verification pass |
| max_ambiguity_count | integer | 5 | Maximum flagged ambiguities before auto-escalating to REQUIRES HUMAN REVIEW |
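A hedged sketch of how these parameters might be represented in tooling around the agent; the `QAConfig` class and `from_dict` loader are hypothetical, with defaults taken from the table:

```python
from dataclasses import dataclass

@dataclass
class QAConfig:
    confidence_threshold: float = 0.95    # below this, a character is flagged
    flag_marker: str = "[REVIEW NEEDED]"  # prefix for human-attention items
    strict_mode: bool = True              # one flag downgrades the verdict
    markdown_validate: bool = True        # run the markdown rendering pass
    max_ambiguity_count: int = 5          # escalate to human review past this

    @classmethod
    def from_dict(cls, raw: dict) -> "QAConfig":
        # Accept only known keys so typos in a config file fail loudly.
        unknown = set(raw) - set(cls.__dataclass_fields__)
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return cls(**raw)
```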

Best Practices

  1. Always Preserve the Source Image as Ground Truth. Never modify or enhance the original scan before validation. The unaltered source image is the single authoritative reference. Any preprocessing (contrast adjustment, deskewing) should be documented separately so reviewers understand what the validator compared against.

  2. Validate Section by Section, Not All at Once. Breaking the document into logical sections (headers, body paragraphs, tables, footnotes) and validating each independently reduces cognitive load and catches errors that a full-document sweep might miss, especially in long or complex documents.

  3. Use Consistent Flag Markers for Downstream Tooling. Standardizing on a single marker format like [REVIEW NEEDED: description] allows downstream scripts and human reviewers to programmatically extract and triage every uncertainty. Inconsistent markers lead to missed items and erode trust in the pipeline.

  4. Test Markdown Rendering in the Target Environment. Markdown engines differ: a table that renders perfectly in GitHub-flavored markdown may break in a static-site generator. Always verify rendering against the actual target platform, not just a generic preview.

  5. Trace Every Correction Back to Visual Evidence. Each correction made by upstream agents should be traceable to a specific region in the source image. This audit trail is essential for debugging pipeline regressions and for building confidence in automated correction accuracy over time.
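Practice 3 pays off immediately in tooling: with one standardized marker format, a few lines of code can triage every flagged item. A minimal sketch (the `extract_flags` helper is illustrative):

```python
import re

# Matches the standardized marker, capturing the description inside it.
FLAG_PATTERN = re.compile(r"\[REVIEW NEEDED:\s*(?P<desc>[^\]]+)\]")

def extract_flags(text: str) -> list[str]:
    """Return the description of every review marker in validated output."""
    return [m.group("desc").strip() for m in FLAG_PATTERN.finditer(text)]
```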

Common Issues

  1. Ambiguous characters in degraded scans (l vs 1, O vs 0). Low-resolution or aged documents frequently produce characters that are visually indistinguishable. Rather than guessing, flag these with the exact bounding region and both candidate interpretations so a human reviewer can resolve them with domain context.

  2. Markdown table columns misalign after correction. When upstream agents insert or remove characters in table cells, pipe-delimited columns can shift. Re-validate column alignment by counting delimiters per row and comparing against the header separator row. Automated reformatting scripts can fix this before the final pass.

  3. Content silently omitted from multi-column layouts. Multi-column source documents are a common failure point because OCR engines sometimes merge or skip columns. Compare total word count between source and corrected text as a fast sanity check, then drill into any section where the count diverges by more than two percent.
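The second and third issues both reduce to cheap structural checks before the final pass. A minimal sketch, assuming pipe-delimited markdown tables and the two-percent divergence threshold mentioned above (both helper names are illustrative):

```python
def table_rows_aligned(table_lines: list[str]) -> bool:
    """Check that every row has the same number of unescaped pipes."""
    def pipe_count(line: str) -> int:
        # Drop escaped pipes (\|) first; those are literal cell content.
        return line.replace(r"\|", "").count("|")
    return len({pipe_count(line) for line in table_lines}) <= 1

def word_counts_diverge(source_text: str, corrected_text: str,
                        tolerance: float = 0.02) -> bool:
    """Flag sections whose word count drifts beyond the tolerance."""
    src = len(source_text.split())
    cor = len(corrected_text.split())
    if src == 0:
        return cor != 0
    return abs(src - cor) / src > tolerance
```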
