D

Dynamic Word Document Processing Studio

A comprehensive skill that enables create, edit, and analyze documents with tracked changes. Built for Claude Code with best practices and real-world patterns.

SkillCommunitydevelopmentv1.0.0MIT
0 views0 copies

Word Document Processing

A document automation skill for programmatically creating, reading, modifying, and converting Microsoft Word documents using libraries like docx, python-docx, and mammoth.

When to Use

Choose Word Document Processing when:

  • Generating reports, invoices, or contracts programmatically from templates
  • Extracting text and metadata from Word documents for processing
  • Converting Word documents to PDF, HTML, or plain text
  • Automating mail merge and document assembly workflows

Consider alternatives when:

  • Creating simple text documents — use plain text or Markdown
  • Building PDF-first documents — use a PDF generation library directly
  • Interactive document editing — use Google Docs API or Office 365 API

Quick Start

# Python pip install python-docx mammoth # Node.js npm install docx mammoth
from docx import Document from docx.shared import Inches, Pt, RGBColor from docx.enum.text import WD_ALIGN_PARAGRAPH def create_report(data): doc = Document() # Title title = doc.add_heading('Monthly Report', level=0) title.alignment = WD_ALIGN_PARAGRAPH.CENTER # Subtitle with styling subtitle = doc.add_paragraph() run = subtitle.add_run(f"Period: {data['month']} {data['year']}") run.font.size = Pt(14) run.font.color.rgb = RGBColor(100, 100, 100) subtitle.alignment = WD_ALIGN_PARAGRAPH.CENTER doc.add_paragraph() # Spacer # Summary section doc.add_heading('Executive Summary', level=1) doc.add_paragraph(data['summary']) # Metrics table doc.add_heading('Key Metrics', level=1) table = doc.add_table(rows=1, cols=3) table.style = 'Light Grid Accent 1' header_cells = table.rows[0].cells header_cells[0].text = 'Metric' header_cells[1].text = 'Value' header_cells[2].text = 'Change' for metric in data['metrics']: row = table.add_row().cells row[0].text = metric['name'] row[1].text = str(metric['value']) row[2].text = f"{metric['change']:+.1f}%" # Chart placeholder image if data.get('chart_path'): doc.add_heading('Trend Analysis', level=1) doc.add_picture(data['chart_path'], width=Inches(6)) # Save filename = f"report_{data['month']}_{data['year']}.docx" doc.save(filename) return filename

Core Concepts

Document Structure

ElementMethodUsage
Headingadd_heading(text, level)Section headers (0-9)
Paragraphadd_paragraph(text)Body text
Tableadd_table(rows, cols)Tabular data
Imageadd_picture(path, width)Embedded images
Page Breakadd_page_break()Force new page
Listadd_paragraph(style='List')Bulleted/numbered lists
Header/Footersection.header/footerPage headers/footers

Template-Based Generation

from docx import Document import re class DocTemplate: def __init__(self, template_path): self.doc = Document(template_path) def fill(self, data): """Replace {{placeholders}} with data values""" for paragraph in self.doc.paragraphs: self._replace_in_paragraph(paragraph, data) for table in self.doc.tables: for row in table.rows: for cell in row.cells: for paragraph in cell.paragraphs: self._replace_in_paragraph(paragraph, data) def _replace_in_paragraph(self, paragraph, data): full_text = paragraph.text for key, value in data.items(): placeholder = f'{{{{{key}}}}}' if placeholder in full_text: for run in paragraph.runs: if placeholder in run.text: run.text = run.text.replace(placeholder, str(value)) def add_dynamic_table(self, bookmark, headers, rows): """Insert a table at a bookmark location""" table = self.doc.add_table(rows=1, cols=len(headers)) table.style = 'Light Grid Accent 1' for i, header in enumerate(headers): table.rows[0].cells[i].text = header for row_data in rows: row = table.add_row().cells for i, value in enumerate(row_data): row[i].text = str(value) def save(self, output_path): self.doc.save(output_path)

Configuration

OptionDescriptionDefault
template_pathPath to Word template fileNone
output_formatOutput format: docx, pdf, html"docx"
page_sizePage dimensions: letter, A4"letter"
marginsPage margins in inches1.0 all sides
default_fontDefault font family"Calibri"
font_sizeDefault font size in points11
line_spacingParagraph line spacing1.15
include_tocGenerate table of contentsfalse

Best Practices

  1. Use template documents with placeholder markers instead of building documents from scratch — templates preserve formatting, headers, footers, and styles that are tedious to recreate programmatically
  2. Preserve run-level formatting when replacing text by replacing within individual runs rather than reconstructing paragraphs, so bold, italic, and color formatting from the template is maintained
  3. Use consistent placeholder syntax like {{variable_name}} that is unlikely to appear in normal document text and is easy to find with regex
  4. Handle images with proper aspect ratios by always specifying either width or height (not both) when inserting images to prevent distortion
  5. Convert to PDF for distribution using LibreOffice in headless mode (libreoffice --headless --convert-to pdf) rather than distributing editable .docx files when the recipient should not modify the content

Common Issues

Placeholder text split across multiple runs: Word splits text into separate XML runs for formatting reasons, so {{name}} might become {{, na, me}} in three runs. Join all run text in a paragraph, replace placeholders in the joined string, then redistribute text back into runs preserving formatting.

Table formatting lost after adding rows: Dynamically added rows do not inherit the table style's cell formatting. Apply formatting explicitly to each cell in new rows, or copy formatting from the template row before adding data.

Complex layouts breaking with content length changes: Long replacement text can overflow table cells or push content to unexpected pages. Set table column widths explicitly, enable cell auto-fit, and test templates with both minimum and maximum expected content lengths.

Community

Reviews

Write a review

No reviews yet. Be the first to review this template!

Similar Templates