
Ultimate RAG with Qdrant

Build production RAG systems using Qdrant, a high-performance vector database written in Rust — with rich filtering, horizontal scaling, and on-premise deployment options.

When to Use

Choose Qdrant when:

  • Need high-performance vector search with rich metadata filtering
  • Want on-premise deployment with full data control
  • Require horizontal scaling with sharding and replication
  • Building production systems that need both cloud and self-hosted options
  • Need advanced filtering (geo, range, nested objects)

Consider alternatives when:

  • Want fully managed with zero ops → Pinecone
  • Pure vector search at billion scale → FAISS
  • Already using Postgres → pgvector
  • Multi-modal with GraphQL → Weaviate

Quick Start

Installation

```bash
# Python client
pip install qdrant-client

# Run Qdrant locally (Docker)
docker run -p 6333:6333 qdrant/qdrant
```

Create Collection and Insert

```python
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

client = QdrantClient("http://localhost:6333")

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert vectors with payload (metadata)
points = [
    PointStruct(
        id=1,
        vector=embedding_vector,
        payload={
            "source": "handbook.pdf",
            "page": 12,
            "section": "refund-policy",
            "text": "Refunds are processed within 5-7 business days...",
            "tags": ["policy", "refunds"],
            "date": "2024-01-15",
        },
    ),
    # ... more points
]

client.upsert(collection_name="documents", points=points)
```

Search with Filtering

```python
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5,
    query_filter=Filter(
        must=[
            FieldCondition(key="source", match=MatchValue(value="handbook.pdf")),
            FieldCondition(key="page", range=Range(gte=10, lte=50)),
        ]
    ),
)

for point in results:
    print(f"Score: {point.score:.4f}")
    print(f"Text: {point.payload['text']}")
```

Core Concepts

Payload Filtering

Qdrant supports rich filtering on payload (metadata) fields:

| Filter Type | Example | Use Case |
|---|---|---|
| Match | `MatchValue(value="finance")` | Exact match |
| Range | `Range(gte=10, lte=50)` | Numeric ranges |
| Geo | `GeoBoundingBox(...)` | Location-based |
| Full-text | `MatchText(text="refund")` | Keyword in text |
| Nested | `Nested(key="tags", ...)` | Array/object fields |
| Is Empty | `IsEmptyCondition(is_empty=...)` | Null check |

Index Optimization

```python
# Create payload indexes for filtered fields
client.create_payload_index(
    collection_name="documents",
    field_name="source",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="documents",
    field_name="date",
    field_schema="datetime",
)

# HNSW parameters
client.update_collection(
    collection_name="documents",
    hnsw_config={"m": 16, "ef_construct": 200},
)
```

Horizontal Scaling

```yaml
# Qdrant cluster configuration
storage:
  storage_path: /data/qdrant

cluster:
  enabled: true
  consensus:
    tick_period_ms: 100

collections:
  documents:
    shard_number: 6         # Distribute across nodes
    replication_factor: 2   # Data redundancy
```

Configuration

| Parameter | Default | Description |
|---|---|---|
| `size` | (required) | Vector dimension |
| `distance` | `COSINE` | Similarity metric (Cosine, Euclid, Dot) |
| `hnsw_m` | 16 | HNSW graph connectivity |
| `hnsw_ef_construct` | 100 | HNSW build quality |
| `shard_number` | 1 | Data partitions |
| `replication_factor` | 1 | Data copies |
| `on_disk` | False | Store vectors on disk |

Best Practices

  1. Create payload indexes for fields you filter on — dramatically improves filtered search
  2. Use quantization for large collections — scalar quantization reduces memory by 4x
  3. Set shard_number to at least the number of nodes for even distribution
  4. Use on_disk mode for collections larger than available RAM
  5. Batch inserts — upsert in batches of 100-500 for optimal throughput
  6. Monitor with Qdrant dashboard — built-in UI at port 6333/dashboard
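Practice 5 (batched inserts) can be sketched with a plain chunking helper; the batch size of 256 and the `documents` collection name are illustrative, and the actual `upsert` call assumes a running Qdrant instance:

```python
# Split a list of points into fixed-size batches for upserting.
def batched(items, batch_size=256):
    """Yield successive fixed-size chunks from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

# Example: 1000 points split into batches of up to 256
sizes = [len(b) for b in batched(list(range(1000)))]
print(sizes)  # [256, 256, 256, 232]

# With a live client (not executed here):
# for batch in batched(points, 500):
#     client.upsert(collection_name="documents", points=batch)
```

Keeping batches in the 100-500 range avoids oversized request payloads while still amortizing per-request overhead.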

Common Issues

Slow filtered search: Create payload indexes for filtered fields. Pre-filter narrows the search space dramatically. Check that HNSW ef_construct is high enough for your dataset size.

High memory usage: Enable scalar quantization to reduce memory by 4x. Use on_disk storage for vectors. Reduce HNSW m parameter (trades accuracy for memory).

Cluster node failure: Set replication_factor >= 2 for fault tolerance. Qdrant automatically handles node recovery. Monitor shard distribution to ensure even data spread.
