Document Structure Analyzer Companion

Layout analysis and semantic mapping agent that deconstructs document structures into labeled regions, reading orders, and hierarchical content schemas for OCR preprocessing.

When to Use This Agent

Choose this agent when you need to:

Analyze complex multi-column layouts before running OCR to improve extraction accuracy
Map document hierarchies (headers, subheaders, body text, captions) for structured output
Identify and classify visual elements like tables, forms, figures, and sidebars
Determine correct reading order for documents with non-linear content flow

Consider alternatives when:

Your documents are simple single-column text with no complex layout elements
You need post-OCR grammar correction rather than pre-OCR structural analysis (use Specialist OCR Grammar Fixer)

Quick Start

Configuration


name: document-structure-analyzer-companion
type: agent
category: ocr-extraction-team

Example Invocation


claude agent:invoke document-structure-analyzer-companion "Analyze the structure of invoice-batch-042.pdf"

Example Output

=== Document Structure Analysis ===
File: invoice-batch-042.pdf (3 pages)

PAGE 1 REGIONS:
  [HEADER]     Logo + Company Name (confidence: 0.97)
  [TABLE]      Line items table, 5 columns x 12 rows (confidence: 0.94)
  [SIDEBAR]    Payment terms block, right margin (confidence: 0.89)
  [FOOTER]     Page number + legal disclaimer (confidence: 0.96)

READING ORDER: Header → Table → Sidebar → Footer
HIERARCHY: H1(Invoice #) → H2(Bill To, Ship To) → Body(line items)
TEMPLATE MATCH: Standard commercial invoice (92% match)

OCR RECOMMENDATIONS:
  - Process table region with grid-aware extraction
  - Treat sidebar as independent text block
  - Apply deskew correction (2.1° detected)

Core Concepts

Document Region Types Overview

Region Type	Description
Content Blocks	Continuous text regions like paragraphs, headings, and captions
Tabular Regions	Structured grid areas including tables, forms, and ledgers
Visual Elements	Non-text regions such as images, charts, logos, and diagrams
Navigation Markers	Page numbers, headers, footers, and section dividers
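
The four region types in the table can be modeled as an enumeration with a lookup from element kind to type. This is a sketch only: the element names are taken from the table, while `RegionType`, `ELEMENT_TO_TYPE`, and `classify_element` are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Optional


class RegionType(Enum):
    CONTENT_BLOCK = "Content Blocks"
    TABULAR_REGION = "Tabular Regions"
    VISUAL_ELEMENT = "Visual Elements"
    NAVIGATION_MARKER = "Navigation Markers"


# Element kinds listed in the table, grouped by their region type.
ELEMENT_TO_TYPE = {
    "paragraph": RegionType.CONTENT_BLOCK,
    "heading": RegionType.CONTENT_BLOCK,
    "caption": RegionType.CONTENT_BLOCK,
    "table": RegionType.TABULAR_REGION,
    "form": RegionType.TABULAR_REGION,
    "ledger": RegionType.TABULAR_REGION,
    "image": RegionType.VISUAL_ELEMENT,
    "chart": RegionType.VISUAL_ELEMENT,
    "logo": RegionType.VISUAL_ELEMENT,
    "diagram": RegionType.VISUAL_ELEMENT,
    "page_number": RegionType.NAVIGATION_MARKER,
    "header": RegionType.NAVIGATION_MARKER,
    "footer": RegionType.NAVIGATION_MARKER,
    "section_divider": RegionType.NAVIGATION_MARKER,
}


def classify_element(kind: str) -> Optional[RegionType]:
    """Return the region type for an element kind, or None if it is unknown."""
    return ELEMENT_TO_TYPE.get(kind.strip().lower())


if __name__ == "__main__":
    print(classify_element("Table"))
    print(classify_element("watermark"))
```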

Structure Analysis Pipeline Architecture

┌─────────────┐     ┌─────────────┐
│  Page Image  │────▶│  Region     │
│  Ingestion   │     │  Segmenter  │
└─────────────┘     └─────────────┘
        │                   │
        ▼                   ▼
┌─────────────┐     ┌─────────────┐
│  Reading     │────▶│  Hierarchy  │
│  Order Engine│     │  Mapper     │
└─────────────┘     └─────────────┘
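
One possible reading of the diagram is a chain of four stages, each feeding the next. The sketch below uses toy stand-ins for every stage: the function names, the `Region` fields, and the top-to-bottom, left-to-right ordering rule are all assumptions made for illustration and do not describe the agent's actual internals.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:
    label: str
    x: int  # left edge in pixels (hypothetical field)
    y: int  # top edge in pixels (hypothetical field)


def ingest(page: Dict) -> Dict:
    """Page Image Ingestion: a pass-through placeholder for loading a page."""
    return page


def segment(page: Dict) -> List[Region]:
    """Region Segmenter: here it just reads pre-labeled toy regions."""
    return [Region(**raw) for raw in page["regions"]]


def order_regions(regions: List[Region]) -> List[Region]:
    """Reading Order Engine: naive top-to-bottom, then left-to-right sort."""
    return sorted(regions, key=lambda r: (r.y, r.x))


def map_hierarchy(ordered: List[Region]) -> Dict[str, List[str]]:
    """Hierarchy Mapper: a flat placeholder tree over the ordered labels."""
    return {"page": [r.label for r in ordered]}


def analyze(page: Dict) -> Dict[str, List[str]]:
    """Run the four stages in sequence."""
    return map_hierarchy(order_regions(segment(ingest(page))))


if __name__ == "__main__":
    toy_page = {
        "regions": [
            {"label": "FOOTER", "x": 0, "y": 700},
            {"label": "SIDEBAR", "x": 400, "y": 100},
            {"label": "HEADER", "x": 0, "y": 0},
            {"label": "TABLE", "x": 0, "y": 100},
        ]
    }
    print(analyze(toy_page))
```

A coordinate sort like this is only adequate for simple pages; multi-column layouts are exactly where a dedicated reading order stage is needed.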

Configuration

Parameter	Type	Default	Description
min_confidence	float	0.80	Minimum confidence score for a region classification to be included
deskew_correction	boolean	true	Automatically detect and correct page rotation before analysis
table_detection_mode	string	"grid-aware"	Table detection strategy: grid-aware, line-based, or whitespace
max_pages	integer	50	Maximum number of pages to analyze in a single invocation
output_format	string	"json"	Output format for structure maps: json, yaml, or markdown
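
When these parameters are assembled in code, validating them up front catches typos before a long run. The following sketch mirrors the defaults and allowed values from the table; the `AnalyzerConfig` class itself and its validation rules (for example, bounding `min_confidence` to the range 0 to 1) are assumptions, not part of the agent.

```python
from dataclasses import dataclass

TABLE_DETECTION_MODES = {"grid-aware", "line-based", "whitespace"}
OUTPUT_FORMATS = {"json", "yaml", "markdown"}


@dataclass(frozen=True)
class AnalyzerConfig:
    # Defaults copied from the parameter table.
    min_confidence: float = 0.80
    deskew_correction: bool = True
    table_detection_mode: str = "grid-aware"
    max_pages: int = 50
    output_format: str = "json"

    def __post_init__(self) -> None:
        # Assumed rule: confidence scores lie between 0 and 1.
        if not 0.0 <= self.min_confidence <= 1.0:
            raise ValueError("min_confidence must be between 0 and 1")
        if self.table_detection_mode not in TABLE_DETECTION_MODES:
            raise ValueError(
                f"unknown table_detection_mode: {self.table_detection_mode!r}"
            )
        if self.max_pages < 1:
            raise ValueError("max_pages must be at least 1")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output_format: {self.output_format!r}")


if __name__ == "__main__":
    print(AnalyzerConfig())
    print(AnalyzerConfig(table_detection_mode="whitespace", min_confidence=0.65))
```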

Best Practices

Run Structure Analysis Before OCR Extraction: Feeding region boundaries and reading order to the OCR engine dramatically improves extraction accuracy. Without structure analysis, OCR processes text linearly and mangles multi-column layouts, tables, and sidebars.
Calibrate Confidence Thresholds Per Document Type: Scanned handwritten documents produce lower confidence scores than clean digital PDFs. Lower min_confidence to 0.65 for handwritten or degraded inputs to avoid discarding valid but uncertain region detections.
Use Template Matching for Recurring Document Types: If you process the same form or invoice layout repeatedly, save the structure analysis as a template. Future documents matching that template skip the full analysis pipeline and process significantly faster.
Verify Reading Order on Complex Layouts: Multi-column academic papers, magazine spreads, and brochures have non-obvious reading orders. Always review the suggested reading order for complex layouts, as incorrect ordering produces incoherent OCR output.
Separate Table Regions for Dedicated Processing: Tables require grid-aware extraction that differs fundamentally from paragraph text OCR. The structure analyzer marks table boundaries so downstream processors can apply specialized table extraction algorithms.
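
The per-document-type calibration described above can be expressed as a small lookup. This sketch uses the 0.80 default and the 0.65 value for handwritten or degraded inputs from this page; the document type names and the helper functions are hypothetical.

```python
from typing import Dict, List, Tuple

DEFAULT_MIN_CONFIDENCE = 0.80

# Hypothetical document type names; 0.65 is the value suggested above
# for handwritten or degraded inputs.
MIN_CONFIDENCE_BY_TYPE: Dict[str, float] = {
    "handwritten": 0.65,
    "degraded_scan": 0.65,
}

Detection = Tuple[str, float]  # (region label, confidence score)


def min_confidence_for(doc_type: str) -> float:
    """Return the threshold for a document type, falling back to the default."""
    return MIN_CONFIDENCE_BY_TYPE.get(doc_type, DEFAULT_MIN_CONFIDENCE)


def keep_regions(detections: List[Detection], doc_type: str) -> List[Detection]:
    """Drop detections whose confidence is below the type's threshold."""
    threshold = min_confidence_for(doc_type)
    return [d for d in detections if d[1] >= threshold]


if __name__ == "__main__":
    detections = [("TABLE", 0.94), ("SIDEBAR", 0.70)]
    print(keep_regions(detections, "digital_pdf"))
    print(keep_regions(detections, "handwritten"))
```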

Common Issues

Sidebar text merged with main body content in reading order: Sidebars positioned close to the main text column may be incorrectly merged with it. Lower the column_gap_threshold parameter so that a narrower whitespace gap is enough for adjacent regions to be treated as separate columns.
Table detection fails on borderless tables: Tables without visible gridlines require whitespace-based detection. Switch table_detection_mode from "grid-aware" to "whitespace" for documents that use spacing rather than borders to delineate table cells.
Rotated or skewed pages produce misaligned region boundaries: Even with deskew_correction enabled, pages rotated more than 5 degrees may not be fully corrected. Pre-process heavily skewed documents with a dedicated image rotation tool before submitting them for structure analysis.
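
How a gap threshold interacts with sidebar merging can be illustrated with a toy model. It assumes that column_gap_threshold is the minimum whitespace gap at which two adjacent regions count as separate columns; `group_columns` and the pixel values are hypothetical and are not the agent's implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (left, right) horizontal extent in pixels


def group_columns(spans: List[Span], column_gap_threshold: int) -> List[List[Span]]:
    """Group spans left to right; a gap below the threshold merges neighbors."""
    groups: List[List[Span]] = []
    for span in sorted(spans):
        if groups:
            right_edge = max(right for _, right in groups[-1])
            if span[0] - right_edge < column_gap_threshold:
                groups[-1].append(span)
                continue
        groups.append([span])
    return groups


if __name__ == "__main__":
    body = (50, 400)
    sidebar = (430, 560)  # 30 px of whitespace after the body column

    # Threshold below the gap: body and sidebar stay separate.
    print(group_columns([body, sidebar], column_gap_threshold=20))
    # Threshold above the gap: the sidebar is merged into the body column.
    print(group_columns([body, sidebar], column_gap_threshold=40))
```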
