Comprehensive RAG Module
Full-featured Retrieval-Augmented Generation module covering document processing, multi-modal retrieval, advanced chunking, query understanding, answer synthesis, and production monitoring — designed for enterprise RAG deployments.
When to Use
Deploy this module when:
- Building enterprise RAG systems with multiple document types (PDF, HTML, code, images)
- Need advanced features: multi-hop reasoning, query decomposition, citation tracking
- Require production monitoring of retrieval quality and answer accuracy
- Managing complex document pipelines with versioning and access control
Use simpler RAG when:
- Single document type, small collection → basic vector search
- Prototyping → LangChain with default settings
- Already have a working pipeline that meets quality targets
Quick Start
Multi-Stage RAG Pipeline
```python
from rag_module import RAGPipeline, DocumentProcessor, Retriever, Generator

# 1. Document processing with type-aware chunking
processor = DocumentProcessor(
    chunking_strategy="adaptive",  # Auto-selects based on doc type
    chunk_size=1000,
    chunk_overlap=200,
    extract_tables=True,
    extract_images=True,
    preserve_hierarchy=True,
)
chunks = processor.process_directory("./documents/")

# 2. Multi-strategy retrieval
retriever = Retriever(
    primary="semantic",
    secondary="bm25",
    reranker="cross-encoder",
    top_k=10,
    rerank_top_k=5,
)

# 3. Answer generation with citations
generator = Generator(
    model="claude-sonnet-4-20250514",
    citation_mode="inline",
    max_context_tokens=8000,
    faithfulness_check=True,
)

# 4. Assemble pipeline
pipeline = RAGPipeline(
    processor=processor,
    retriever=retriever,
    generator=generator,
)

answer = pipeline.query("What are the compliance requirements for Q3?")
print(answer.text)
print(answer.citations)
print(answer.confidence)
```
Query Understanding
```python
from rag_module import QueryAnalyzer

analyzer = QueryAnalyzer()

# Automatically decomposes complex queries
result = analyzer.analyze("Compare the Q1 and Q2 revenue for Product A and Product B")
# result.type = "comparative"
# result.sub_queries = [
#     "Q1 revenue for Product A",
#     "Q2 revenue for Product A",
#     "Q1 revenue for Product B",
#     "Q2 revenue for Product B",
# ]
# result.aggregation = "comparison_table"
```
Core Concepts
Pipeline Architecture
```
Query → Query Understanding → Sub-Query Decomposition
          |
  [parallel retrieval]
          |
Context Assembly → Dedup + Rerank
          |
Answer Generation → Citation Extraction
          |
Quality Check → Confidence Score
```
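The "Dedup + Rerank" stage can be sketched as reciprocal rank fusion (RRF) over the parallel retrieval lists. RRF is a common merge strategy, but not necessarily what rag_module uses internally; `rrf_merge` and its signature are illustrative:

```python
def rrf_merge(result_lists, k=60):
    """Merge ranked lists of chunk IDs with reciprocal rank fusion.

    Duplicates across lists collapse into one entry whose score is the sum
    of 1 / (k + rank) contributions, so chunks returned by several
    retrievers float to the top.
    """
    scores = {}
    for results in result_lists:
        for rank, chunk_id in enumerate(results, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "a" and "b" appear in both the semantic and BM25 lists,
# so they outrank the single-list chunks "c" and "d".
merged = rrf_merge([["a", "b", "c"], ["b", "d", "a"]])
```

Here `k=60` is the damping constant commonly used with RRF; larger values flatten the influence of rank position.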
Document Processing
| Document Type | Processor | Chunking | Special Handling |
|---|---|---|---|
| PDF | PyMuPDF | Section-aware | Table extraction, OCR |
| HTML | BeautifulSoup | Tag-based | Strip navigation, ads |
| Markdown | Custom | Header-based | Preserve code blocks |
| Code | Tree-sitter | AST-based | Function/class units |
| Images | Vision model | Description | Caption generation |
| Spreadsheets | Pandas | Row-group | Schema preservation |
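The header-based strategy listed for Markdown can be sketched in a few lines. This is a deliberately simplified illustration, not rag_module's implementation: it splits on ATX headers only and would need extra logic to skip `#` lines inside fenced code blocks, per the "preserve code blocks" note in the table:

```python
import re

def chunk_markdown(text):
    """Split markdown on ATX headers, keeping each header with its body.

    Simplified sketch: does not yet skip headers inside fenced code blocks.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Intro\nHello.\n## Setup\npip install x\n"
sections = chunk_markdown(doc)  # one chunk per header section
```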
Query Types and Strategies
| Query Type | Retrieval Strategy | Generation Approach |
|---|---|---|
| Factual | Direct retrieval, top 3-5 | Extract and cite |
| Analytical | Broad retrieval, top 10 | Synthesize and compare |
| Multi-hop | Iterative retrieval | Chain reasoning steps |
| Comparative | Parallel retrieval per entity | Structured table output |
| Temporal | Date-filtered retrieval | Timeline-aware synthesis |
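The routing in the table above implies a classify-then-dispatch step. A keyword heuristic like the one below is only a placeholder (a production QueryAnalyzer would use an LLM or trained classifier); the strategy dict mirrors the table's retrieval columns:

```python
def classify_query(query):
    """Heuristic query-type classifier (illustrative only)."""
    q = query.lower()
    if any(w in q for w in ("compare", "versus", " vs ")):
        return "comparative"
    if any(w in q for w in ("why", "how did", "analyze")):
        return "analytical"
    if any(w in q for w in ("before", "after", "since", "timeline")):
        return "temporal"
    return "factual"

# Retrieval strategy per query type, following the table above
STRATEGY = {
    "factual":     {"top_k": 5,  "mode": "direct"},
    "analytical":  {"top_k": 10, "mode": "broad"},
    "comparative": {"top_k": 5,  "mode": "parallel_per_entity"},
    "temporal":    {"top_k": 10, "mode": "date_filtered"},
}

plan = STRATEGY[classify_query("Compare Q1 and Q2 revenue")]
```

Multi-hop detection is omitted here because it usually requires inspecting whether one sub-answer feeds the next, which keywords alone cannot capture.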
Configuration
| Component | Parameter | Default | Description |
|---|---|---|---|
| Processor | chunk_size | 1000 | Characters per chunk |
| Processor | extract_tables | True | Parse tables from PDFs |
| Retriever | primary | "semantic" | Main retrieval method |
| Retriever | reranker | "cross-encoder" | Reranking model |
| Retriever | top_k | 10 | Initial retrieval count |
| Generator | citation_mode | "inline" | Citation format |
| Generator | faithfulness_check | True | Verify grounding |
| Pipeline | confidence_threshold | 0.7 | Minimum confidence |
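The defaults above can be collected into one override-friendly config. The nested-dict shape and `merged_config` helper are assumptions about how overrides might be applied, not rag_module's documented API; parameter names and defaults come from the table:

```python
DEFAULT_CONFIG = {
    "processor": {"chunk_size": 1000, "extract_tables": True},
    "retriever": {"primary": "semantic", "reranker": "cross-encoder", "top_k": 10},
    "generator": {"citation_mode": "inline", "faithfulness_check": True},
    "pipeline":  {"confidence_threshold": 0.7},
}

def merged_config(overrides):
    """Shallow-merge user overrides onto the defaults, per component."""
    cfg = {component: dict(params) for component, params in DEFAULT_CONFIG.items()}
    for component, params in overrides.items():
        cfg.setdefault(component, {}).update(params)
    return cfg

cfg = merged_config({"retriever": {"top_k": 20}})  # other defaults untouched
```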
Best Practices
- Type-aware chunking — PDFs, code, and markdown each need different chunking strategies
- Decompose complex queries — multi-hop and comparative queries need sub-query decomposition
- Always rerank — initial retrieval gets recall, reranking gets precision
- Track citations — every claim should map to a source chunk for verifiability
- Monitor continuously — retrieval quality degrades as document collections change
- Handle multi-modal content — extract and index tables, images, and structured data separately
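The "monitor continuously" practice can be as simple as a rolling mean over top retrieval scores that flags sustained drops. The class, window size, and floor below are illustrative defaults, not part of rag_module:

```python
from collections import deque

class RetrievalMonitor:
    """Track a rolling mean of top retrieval scores and flag degradation."""

    def __init__(self, window=100, floor=0.5):
        self.scores = deque(maxlen=window)  # oldest scores fall off automatically
        self.floor = floor

    def record(self, top_score):
        self.scores.append(top_score)

    def degraded(self):
        if not self.scores:
            return False
        return sum(self.scores) / len(self.scores) < self.floor

m = RetrievalMonitor(window=3, floor=0.5)
for s in (0.9, 0.4, 0.3):
    m.record(s)
m.degraded()  # mean ≈ 0.53, still above the floor
```

In production you would record the score per query and alert on `degraded()`, ideally alongside answer-level metrics such as faithfulness-check failure rate.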
Common Issues
Multi-hop queries return incomplete answers: Enable query decomposition. Increase retrieval top_k for broader coverage. Use iterative retrieval where later queries build on earlier results.
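The iterative-retrieval fix can be sketched as a loop where each hop's results may produce a follow-up query. `retrieve` and `extract_followup` are caller-supplied callables here (in practice the follow-up would come from an LLM reading the retrieved chunks); none of this is rag_module's actual API:

```python
def iterative_retrieve(query, retrieve, extract_followup, max_hops=3):
    """Multi-hop retrieval: accumulate context, letting each hop's results
    generate the next query until no follow-up remains or max_hops is hit."""
    context, q = [], query
    for _ in range(max_hops):
        results = retrieve(q)
        context.extend(results)
        q = extract_followup(q, results)
        if q is None:
            break
    return context
```

A toy run with stubbed callables: the first hop finds the founder, the second resolves the follow-up about her birthplace.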
Table data not retrieved correctly: Ensure table extraction is enabled in document processing. Index table data with schema context (column headers). Use structured retrieval for table queries.
Confidence scores unreliable: Calibrate against human judgments on a labeled dataset. Combine multiple signals: retrieval score, generation perplexity, and source overlap.
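Combining the three signals can start as a weighted blend like the sketch below; the weights and the perplexity mapping are placeholders that should be fit against your labeled judgments, not values the module prescribes:

```python
def combined_confidence(retrieval_score, perplexity, source_overlap,
                        weights=(0.5, 0.2, 0.3)):
    """Blend three signals into one confidence in [0, 1].

    Perplexity is mapped via 1 / (1 + ppl) so lower perplexity raises
    confidence. Weights are illustrative; calibrate them on labeled data.
    """
    ppl_signal = 1.0 / (1.0 + perplexity)
    w_r, w_p, w_o = weights
    score = w_r * retrieval_score + w_p * ppl_signal + w_o * source_overlap
    return max(0.0, min(1.0, score))
```

A logistic regression or isotonic regression over the same three features is the usual next step once labels exist, since it calibrates the output to actual correctness rates.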