Dynamic Word Document Processing Studio
A comprehensive skill that enables create, edit, and analyze documents with tracked changes. Built for Claude Code with best practices and real-world patterns.
Word Document Processing
A document automation skill for programmatically creating, reading, modifying, and converting Microsoft Word documents using libraries like docx, python-docx, and mammoth.
When to Use
Choose Word Document Processing when:
- Generating reports, invoices, or contracts programmatically from templates
- Extracting text and metadata from Word documents for processing
- Converting Word documents to PDF, HTML, or plain text
- Automating mail merge and document assembly workflows
Consider alternatives when:
- Creating simple text documents — use plain text or Markdown
- Building PDF-first documents — use a PDF generation library directly
- Interactive document editing — use Google Docs API or Office 365 API
Quick Start
# Python pip install python-docx mammoth # Node.js npm install docx mammoth
from docx import Document from docx.shared import Inches, Pt, RGBColor from docx.enum.text import WD_ALIGN_PARAGRAPH def create_report(data): doc = Document() # Title title = doc.add_heading('Monthly Report', level=0) title.alignment = WD_ALIGN_PARAGRAPH.CENTER # Subtitle with styling subtitle = doc.add_paragraph() run = subtitle.add_run(f"Period: {data['month']} {data['year']}") run.font.size = Pt(14) run.font.color.rgb = RGBColor(100, 100, 100) subtitle.alignment = WD_ALIGN_PARAGRAPH.CENTER doc.add_paragraph() # Spacer # Summary section doc.add_heading('Executive Summary', level=1) doc.add_paragraph(data['summary']) # Metrics table doc.add_heading('Key Metrics', level=1) table = doc.add_table(rows=1, cols=3) table.style = 'Light Grid Accent 1' header_cells = table.rows[0].cells header_cells[0].text = 'Metric' header_cells[1].text = 'Value' header_cells[2].text = 'Change' for metric in data['metrics']: row = table.add_row().cells row[0].text = metric['name'] row[1].text = str(metric['value']) row[2].text = f"{metric['change']:+.1f}%" # Chart placeholder image if data.get('chart_path'): doc.add_heading('Trend Analysis', level=1) doc.add_picture(data['chart_path'], width=Inches(6)) # Save filename = f"report_{data['month']}_{data['year']}.docx" doc.save(filename) return filename
Core Concepts
Document Structure
| Element | Method | Usage |
|---|---|---|
| Heading | add_heading(text, level) | Section headers (0-9) |
| Paragraph | add_paragraph(text) | Body text |
| Table | add_table(rows, cols) | Tabular data |
| Image | add_picture(path, width) | Embedded images |
| Page Break | add_page_break() | Force new page |
| List | add_paragraph(style='List') | Bulleted/numbered lists |
| Header/Footer | section.header/footer | Page headers/footers |
Template-Based Generation
from docx import Document import re class DocTemplate: def __init__(self, template_path): self.doc = Document(template_path) def fill(self, data): """Replace {{placeholders}} with data values""" for paragraph in self.doc.paragraphs: self._replace_in_paragraph(paragraph, data) for table in self.doc.tables: for row in table.rows: for cell in row.cells: for paragraph in cell.paragraphs: self._replace_in_paragraph(paragraph, data) def _replace_in_paragraph(self, paragraph, data): full_text = paragraph.text for key, value in data.items(): placeholder = f'{{{{{key}}}}}' if placeholder in full_text: for run in paragraph.runs: if placeholder in run.text: run.text = run.text.replace(placeholder, str(value)) def add_dynamic_table(self, bookmark, headers, rows): """Insert a table at a bookmark location""" table = self.doc.add_table(rows=1, cols=len(headers)) table.style = 'Light Grid Accent 1' for i, header in enumerate(headers): table.rows[0].cells[i].text = header for row_data in rows: row = table.add_row().cells for i, value in enumerate(row_data): row[i].text = str(value) def save(self, output_path): self.doc.save(output_path)
Configuration
| Option | Description | Default |
|---|---|---|
template_path | Path to Word template file | None |
output_format | Output format: docx, pdf, html | "docx" |
page_size | Page dimensions: letter, A4 | "letter" |
margins | Page margins in inches | 1.0 all sides |
default_font | Default font family | "Calibri" |
font_size | Default font size in points | 11 |
line_spacing | Paragraph line spacing | 1.15 |
include_toc | Generate table of contents | false |
Best Practices
- Use template documents with placeholder markers instead of building documents from scratch — templates preserve formatting, headers, footers, and styles that are tedious to recreate programmatically
- Preserve run-level formatting when replacing text by replacing within individual runs rather than reconstructing paragraphs, so bold, italic, and color formatting from the template is maintained
- Use consistent placeholder syntax like
{{variable_name}}that is unlikely to appear in normal document text and is easy to find with regex - Handle images with proper aspect ratios by always specifying either width or height (not both) when inserting images to prevent distortion
- Convert to PDF for distribution using LibreOffice in headless mode (
libreoffice --headless --convert-to pdf) rather than distributing editable .docx files when the recipient should not modify the content
Common Issues
Placeholder text split across multiple runs: Word splits text into separate XML runs for formatting reasons, so {{name}} might become {{, na, me}} in three runs. Join all run text in a paragraph, replace placeholders in the joined string, then redistribute text back into runs preserving formatting.
Table formatting lost after adding rows: Dynamically added rows do not inherit the table style's cell formatting. Apply formatting explicitly to each cell in new rows, or copy formatting from the template row before adding data.
Complex layouts breaking with content length changes: Long replacement text can overflow table cells or push content to unexpected pages. Set table column widths explicitly, enable cell auto-fit, and test templates with both minimum and maximum expected content lengths.
Reviews
No reviews yet. Be the first to review this template!
Similar Templates
Full-Stack Code Reviewer
Comprehensive code review skill that checks for security vulnerabilities, performance issues, accessibility, and best practices across frontend and backend code.
Test Suite Generator
Generates comprehensive test suites with unit tests, integration tests, and edge cases. Supports Jest, Vitest, Pytest, and Go testing.
Pro Architecture Workspace
Battle-tested skill for architectural, decision, making, framework. Includes structured workflows, validation checks, and reusable patterns for development.