Ultimate RAG Qdrant
All-in-one skill covering high-performance vector similarity search. Includes structured workflows, validation checks, and reusable patterns for AI research.
Ultimate RAG with Qdrant
Build production RAG systems using Qdrant, a high-performance vector database written in Rust — with rich filtering, horizontal scaling, and on-premise deployment options.
When to Use
Choose Qdrant when you:
- Need high-performance vector search with rich metadata filtering
- Want on-premise deployment with full data control
- Require horizontal scaling with sharding and replication
- Are building production systems that need both cloud and self-hosted options
- Need advanced filtering (geo, range, nested objects)
Consider alternatives when you:
- Want a fully managed service with zero ops → Pinecone
- Need pure vector search at billion scale → FAISS
- Already use Postgres → pgvector
- Want multi-modal search with GraphQL → Weaviate
Quick Start
Installation
    # Python client
    pip install qdrant-client

    # Run Qdrant locally (Docker)
    docker run -p 6333:6333 qdrant/qdrant
Create Collection and Insert
    from qdrant_client import QdrantClient
    from qdrant_client.models import VectorParams, Distance, PointStruct

    client = QdrantClient("http://localhost:6333")

    # Create collection; size must equal the dimension of your embedding model
    client.create_collection(
        collection_name="documents",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

    # Insert vectors with payload (metadata).
    # embedding_vector is a placeholder: a list of 1536 floats from your embedding model.
    points = [
        PointStruct(
            id=1,
            vector=embedding_vector,
            payload={
                "source": "handbook.pdf",
                "page": 12,
                "section": "refund-policy",
                "text": "Refunds are processed within 5-7 business days...",
                "tags": ["policy", "refunds"],
                "date": "2024-01-15",
            },
        ),
        # ... more points
    ]

    client.upsert(collection_name="documents", points=points)
Search with Filtering
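    from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

    # query_embedding is a placeholder: the embedding of the user query,
    # produced by the same model used for the stored vectors.
    # Note: recent qdrant-client releases deprecate client.search in favor of client.query_points.
    results = client.search(
        collection_name="documents",
        query_vector=query_embedding,
        limit=5,
        query_filter=Filter(
            must=[
                FieldCondition(key="source", match=MatchValue(value="handbook.pdf")),
                FieldCondition(key="page", range=Range(gte=10, lte=50)),
            ]
        ),
    )

    for point in results:
        print(f"Score: {point.score:.4f}")
        print(f"Text: {point.payload['text']}")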
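To make the semantics of the filtered search above concrete, here is a minimal brute-force sketch in plain Python. It is not how Qdrant executes queries (Qdrant uses an HNSW index); it only illustrates what two `must` conditions plus cosine ranking mean. The helper names `cosine`, `matches`, and `brute_force_search` and the toy 2-dimensional vectors are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def matches(payload, source, page_gte, page_lte):
    """Mirror of must=[source == value, page_gte <= page <= page_lte]."""
    if payload.get("source") != source:
        return False
    return "page" in payload and page_gte <= payload["page"] <= page_lte

def brute_force_search(points, query, limit, **conditions):
    """Keep points that satisfy every condition, then rank by cosine similarity."""
    hits = [
        (cosine(p["vector"], query), p)
        for p in points
        if matches(p["payload"], **conditions)
    ]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:limit]

# Toy data: only ids 1 and 2 satisfy both conditions.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"source": "handbook.pdf", "page": 12}},
    {"id": 2, "vector": [0.0, 1.0], "payload": {"source": "handbook.pdf", "page": 30}},
    {"id": 3, "vector": [1.0, 0.0], "payload": {"source": "handbook.pdf", "page": 80}},
    {"id": 4, "vector": [1.0, 0.0], "payload": {"source": "faq.md", "page": 12}},
]

results = brute_force_search(
    points, [1.0, 0.1], limit=5,
    source="handbook.pdf", page_gte=10, page_lte=50,
)
for score, point in results:
    print(f"id={point['id']} score={score:.4f}")
```

Points 3 and 4 are closer to the query than point 2, but they fail a `must` condition and never reach the ranking step.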
Core Concepts
Payload Filtering
Qdrant supports rich filtering on payload (metadata) fields:
| Filter Type | Example | Use Case |
|---|---|---|
| Match | MatchValue(value="finance") | Exact match |
| Range | Range(gte=10, lte=50) | Numeric ranges |
| Geo | GeoBoundingBox(...) | Location-based |
| Full-text | MatchText(text="refund") | Keyword in text |
| Nested | NestedCondition(nested=Nested(key="tags", filter=...)) | Arrays of objects |
| Is Empty | IsEmptyCondition(is_empty=PayloadField(key="notes")) | Missing, null, or empty field |
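The filter types in the table can be combined under `must` and `must_not` clauses. As a sketch, the same conditions can be written in the JSON form accepted by Qdrant's REST API. The field names `source`, `page`, and `tags` follow the example payload from the Quick Start; the helper `build_filter` is hypothetical.

```python
import json

def build_filter(source, min_page, max_page, excluded_tags):
    """Build a REST-style filter: exact match AND numeric range, excluding some tags."""
    return {
        "must": [
            {"key": "source", "match": {"value": source}},
            {"key": "page", "range": {"gte": min_page, "lte": max_page}},
        ],
        "must_not": [
            # "any" matches when the field contains at least one of the listed values
            {"key": "tags", "match": {"any": excluded_tags}},
        ],
    }

query_filter = build_filter("handbook.pdf", 10, 50, ["draft", "archived"])
print(json.dumps(query_filter, indent=2))
```

Every condition under `must` has to hold, and no condition under `must_not` may hold, for a point to be returned.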
Index Optimization
    from qdrant_client.models import HnswConfigDiff

    # Create payload indexes for the fields used in filters
    client.create_payload_index(
        collection_name="documents",
        field_name="source",
        field_schema="keyword",
    )
    client.create_payload_index(
        collection_name="documents",
        field_name="date",
        field_schema="datetime",
    )

    # HNSW parameters: higher values improve recall at the cost of memory and build time
    client.update_collection(
        collection_name="documents",
        hnsw_config=HnswConfigDiff(m=16, ef_construct=200),
    )
Horizontal Scaling
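    # Qdrant cluster configuration
    storage:
      storage_path: /data/qdrant

    cluster:
      enabled: true
      consensus:
        tick_period_ms: 100

    # Per-collection settings (illustrative; shard_number and replication_factor
    # are normally passed when the collection is created)
    collections:
      documents:
        shard_number: 6        # Distribute across nodes
        replication_factor: 2  # Data redundancy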
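In practice, `shard_number` and `replication_factor` are usually chosen when a collection is created. The sketch below builds the corresponding REST request body without sending it. The host `http://localhost:6333` comes from the Quick Start; the helper name `create_collection_request` is an assumption.

```python
import json
import urllib.request

def create_collection_request(base_url, name, size, shard_number, replication_factor):
    """Build (but do not send) a PUT /collections/{name} request."""
    body = {
        "vectors": {"size": size, "distance": "Cosine"},
        "shard_number": shard_number,
        "replication_factor": replication_factor,
    }
    return urllib.request.Request(
        url=f"{base_url}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = create_collection_request("http://localhost:6333", "documents", 1536, 6, 2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(req)
```

With the Python client, the same two settings can be passed as the `shard_number` and `replication_factor` keyword arguments of `client.create_collection`.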
Configuration
| Parameter | Default | Description |
|---|---|---|
| size | — | Vector dimension (required) |
| distance | — | Similarity metric (Cosine, Euclid, Dot, Manhattan); required |
| hnsw_m | 16 | HNSW graph connectivity |
| hnsw_ef_construct | 100 | HNSW build quality |
| shard_number | 1 | Data partitions |
| replication_factor | 1 | Data copies |
| on_disk | False | Store vectors on disk |
Best Practices
- Create payload indexes for fields you filter on — dramatically improves filtered search
- Use quantization for large collections — scalar quantization reduces memory by 4x
- Set shard_number to at least the number of nodes for even distribution
- Use on_disk mode for collections larger than available RAM
- Batch inserts — upsert in batches of 100-500 for optimal throughput
- Monitor with Qdrant dashboard — built-in UI at port 6333/dashboard
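The batching advice above can be sketched as a small helper. It assumes only that the client exposes `upsert(collection_name=..., points=...)`, as in the Quick Start. The names `batched`, `upsert_in_batches`, and `RecordingClient` are hypothetical, and `RecordingClient` is a stand-in so the example runs without a server.

```python
from itertools import islice

def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def upsert_in_batches(client, collection_name, points, batch_size=256):
    """Upsert points in fixed-size batches; returns the number of points sent."""
    total = 0
    for batch in batched(points, batch_size):
        client.upsert(collection_name=collection_name, points=batch)
        total += len(batch)
    return total

class RecordingClient:
    """Stand-in for QdrantClient that only records batch sizes (illustration only)."""
    def __init__(self):
        self.batch_sizes = []

    def upsert(self, collection_name, points):
        self.batch_sizes.append(len(points))

demo_client = RecordingClient()
sent = upsert_in_batches(demo_client, "documents", list(range(1000)), batch_size=256)
print(sent, demo_client.batch_sizes)
```

Passing a real `QdrantClient` and a list of `PointStruct` objects in place of the stand-ins should work the same way, since the helper relies only on the `upsert` call.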
Common Issues
Slow filtered search:
Create payload indexes for filtered fields. Pre-filter narrows the search space dramatically. Check that HNSW ef_construct is high enough for your dataset size.
High memory usage:
Enable scalar quantization to reduce memory by 4x. Use on_disk storage for vectors. Reduce HNSW m parameter (trades accuracy for memory).
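The 4x figure follows from storing each vector component as one byte (int8) instead of four (float32). A rough back-of-the-envelope sketch, assuming one million 1536-dimensional vectors and counting only raw vector storage (the HNSW graph, payloads, and any retained original vectors are excluded); the function name `vector_memory_bytes` is hypothetical:

```python
def vector_memory_bytes(num_vectors, dim, bytes_per_component):
    """Raw storage for the vectors alone, ignoring index and payload overhead."""
    return num_vectors * dim * bytes_per_component

num_vectors, dim = 1_000_000, 1536

float32_bytes = vector_memory_bytes(num_vectors, dim, 4)  # 4 bytes per float32
int8_bytes = vector_memory_bytes(num_vectors, dim, 1)     # 1 byte per int8

print(f"float32: {float32_bytes / 1e9:.2f} GB")
print(f"int8:    {int8_bytes / 1e9:.2f} GB")
print(f"ratio:   {float32_bytes / int8_bytes:.0f}x")
```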
Cluster node failure:
Set replication_factor >= 2 for fault tolerance. Qdrant automatically handles node recovery. Monitor shard distribution to ensure even data spread.
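When checking shard distribution, a quick estimate of how many shard replicas each node should hold can help spot imbalance. This sketch assumes even placement and that replicas of the same shard live on different nodes; the function name `replicas_per_node` is hypothetical.

```python
import math

def replicas_per_node(shard_number, replication_factor, nodes):
    """Return (total shard replicas, max replicas per node under even placement)."""
    if replication_factor > nodes:
        # Assumption: each replica of a shard needs its own node
        raise ValueError("replication_factor cannot exceed the number of nodes")
    total = shard_number * replication_factor
    return total, math.ceil(total / nodes)

total, per_node = replicas_per_node(shard_number=6, replication_factor=2, nodes=3)
print(f"{total} shard replicas in total, about {per_node} per node")
```

A node holding noticeably more replicas than this estimate is a candidate for rebalancing.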