Search Specialist Companion

An autonomous agent that designs and optimizes search systems — implementing full-text search, semantic search, faceted filtering, and hybrid retrieval with relevance tuning across Elasticsearch, Typesense, Meilisearch, and vector databases.

When to Use This Agent

Choose Search Specialist Companion when:

You need to implement search functionality with relevant, fast results
Existing search returns irrelevant results and needs relevance tuning
You want to add semantic search or hybrid search capabilities
You need faceted filtering, autocomplete, or search analytics

Consider alternatives when:

You need a simple SQL LIKE query on a small dataset (use a basic database query)
Your search is purely vector-based RAG retrieval (use an LLM architect agent)
You need web scraping or crawling (use a web scraping tool)

Quick Start


# .claude/agents/search-specialist.yml
name: search-specialist-companion
description: Design and optimize search systems
agent_prompt: |
  You are a Search Specialist. When building search systems:

  1. Analyze the data schema, query patterns, and user expectations
  2. Choose the right search engine for the use case
  3. Design index schema with proper analyzers and mappings
  4. Implement relevance tuning (boost fields, decay functions, synonyms)
  5. Add user-facing features (autocomplete, facets, highlights)
  6. Set up search analytics to measure and improve quality

  Search quality priorities: relevance > speed > features.

Example invocation:


claude "Our product search returns irrelevant results. Users search for 'red shoes size 9' and get blue handbags. Fix the search relevance."

Sample output:

Search Relevance Audit — Product Search
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Current Issues:
  1. All fields weighted equally (title = description = tags)
  2. No attribute filtering (color, size treated as text search)
  3. No synonym handling ("sneakers" ≠ "tennis shoes")
  4. BM25 scoring without field boosting

Fixes Applied:
  1. Field boosting: title^3, category^2, tags^1.5, description^1
  2. Structured filters: color, size, brand as faceted filters
  3. Synonym dictionary: sneakers↔tennis shoes, purse↔handbag
  4. Query parsing: "red shoes size 9" →
       filter: {color: "red", size: "9"}
       text: "shoes"

Before: "red shoes size 9" → 45% relevant in top 10
After:  "red shoes size 9" → 92% relevant in top 10
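
The query-parsing fix in the sample output can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: the `KNOWN_COLORS` list, the `ParsedQuery` shape, and the `parseProductQuery` function are all hypothetical names, and a real system would likely load attribute vocabularies from the product catalog.

```typescript
// Hypothetical colour vocabulary; assumed here only for illustration.
const KNOWN_COLORS = ["red", "blue", "black", "white", "green"];

interface ParsedQuery {
  filters: { color?: string; size?: string };
  text: string;
}

// Split a raw query into exact-match filters and leftover free text.
function parseProductQuery(raw: string): ParsedQuery {
  const filters: ParsedQuery["filters"] = {};
  const remaining: string[] = [];
  const tokens = raw.toLowerCase().split(/\s+/).filter((t) => t.length > 0);

  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    const next = tokens[i + 1];
    if (token === "size" && next !== undefined && /^\d+(\.\d+)?$/.test(next)) {
      filters.size = next; // "size 9" becomes a structured filter
      i++; // skip the number that was just consumed
    } else if (KNOWN_COLORS.indexOf(token) !== -1 && filters.color === undefined) {
      filters.color = token; // first recognised colour becomes a filter
    } else {
      remaining.push(token); // everything else stays in the text query
    }
  }
  return { filters, text: remaining.join(" ") };
}

const parsedExample = parseProductQuery("red shoes size 9");
console.log(JSON.stringify(parsedExample));
// {"filters":{"color":"red","size":"9"},"text":"shoes"}
```

Only the leftover text ("shoes") would then go to the full-text query, while the filters are applied as exact matches.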

Core Concepts

Search Engine Selection

| Engine | Best For | Scale | Hosting |
| --- | --- | --- | --- |
| Elasticsearch | Enterprise, complex queries | Billions of docs | Self-hosted / Elastic Cloud |
| Typesense | Speed + simplicity | Millions of docs | Self-hosted / Typesense Cloud |
| Meilisearch | Developer experience, typo tolerance | Millions of docs | Self-hosted / Meilisearch Cloud |
| pgvector + tsvector | Existing Postgres, hybrid | Millions of docs | Existing Postgres |
| Algolia | Instant search, SaaS | Millions of records | Fully managed |

Hybrid Search Architecture


// Combine keyword + semantic search for best relevance
class HybridSearchEngine {
  constructor(
    private textSearch: TextSearchClient,  // Elasticsearch/Typesense
    private vectorSearch: VectorSearchClient,  // Pinecone/pgvector
    private embedder: EmbeddingModel
  ) {}

  async search(query: string, options: SearchOptions) {
    // 1. Run both searches in parallel
    const [textResults, vectorResults] = await Promise.all([
      this.textSearch.search(query, {
        fields: options.searchFields,
        filters: options.filters,
        limit: 50
      }),
      this.vectorSearch.search(
        await this.embedder.embed(query),
        { limit: 50, filter: options.filters }
      )
    ]);

    // 2. Reciprocal Rank Fusion (RRF) to merge results
    const merged = this.reciprocalRankFusion(textResults, vectorResults, {
      textWeight: 0.6,
      vectorWeight: 0.4
    });

    return merged.slice(0, options.limit || 20);
  }

  private reciprocalRankFusion(
    textResults: { id: string }[],
    vectorResults: { id: string }[],
    weights: { textWeight: number; vectorWeight: number }
  ) {
    const scores = new Map<string, number>();
    const k = 60; // RRF smoothing constant

    // forEach indexes from 0; RRF ranks start at 1, so add 1
    textResults.forEach((doc, index) => {
      const score = weights.textWeight * (1 / (k + index + 1));
      scores.set(doc.id, (scores.get(doc.id) || 0) + score);
    });

    vectorResults.forEach((doc, index) => {
      const score = weights.vectorWeight * (1 / (k + index + 1));
      scores.set(doc.id, (scores.get(doc.id) || 0) + score);
    });

    return [...scores.entries()]
      .sort((a, b) => b[1] - a[1])
      .map(([id, score]) => ({ id, score }));
  }
}
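
To make the fusion step concrete, here is a small standalone sketch that runs weighted RRF on two short rankings. It assumes 1-based ranks, k = 60, and the 0.6 / 0.4 weights used above. The function name `fuseRankings` and the document IDs are hypothetical.

```typescript
// Standalone weighted RRF: each list contributes weight / (k + rank) per document.
function fuseRankings(
  textIds: string[],
  vectorIds: string[],
  textWeight = 0.6,
  vectorWeight = 0.4,
  k = 60
): { id: string; score: number }[] {
  const scores: Record<string, number> = {};
  const accumulate = (ids: string[], weight: number) => {
    ids.forEach((id, index) => {
      const rank = index + 1; // array index 0 corresponds to rank 1
      scores[id] = (scores[id] || 0) + weight / (k + rank);
    });
  };
  accumulate(textIds, textWeight);
  accumulate(vectorIds, vectorWeight);

  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

// Keyword search ranks A, B, C; vector search ranks C, A, D.
const fusedExample = fuseRankings(["A", "B", "C"], ["C", "A", "D"]);
console.log(fusedExample.map((d) => d.id).join(", ")); // A, C, B, D
```

Documents A and C appear in both lists, so they outrank B and D, which appear in only one. Only rank positions matter here, so the raw scores of the two engines never need to be on the same scale.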

Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `engine` | string | `"elasticsearch"` | Search engine: elasticsearch, typesense, meilisearch |
| `hybridSearch` | boolean | `false` | Enable semantic + keyword hybrid search |
| `autocomplete` | boolean | `true` | Enable search-as-you-type suggestions |
| `facets` | string[] | `[]` | Fields to use as faceted filters |
| `synonyms` | object | `{}` | Synonym dictionary for query expansion |
| `analyticsEnabled` | boolean | `true` | Track search queries and click-through rates |
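
For readers who prefer types to tables, the options above could be modelled as shown below. This is a sketch under assumptions: the names `SearchAgentConfig`, `DEFAULT_CONFIG`, and `resolveConfig` are hypothetical. The table describes `synonyms` only as an object, so the shape used here (a term mapped to a list of equivalents) is a guess.

```typescript
type SearchEngineName = "elasticsearch" | "typesense" | "meilisearch";

interface SearchAgentConfig {
  engine: SearchEngineName;
  hybridSearch: boolean;
  autocomplete: boolean;
  facets: string[];
  // Assumed shape: term -> list of equivalent terms.
  synonyms: Record<string, string[]>;
  analyticsEnabled: boolean;
}

// Defaults copied from the configuration table.
const DEFAULT_CONFIG: SearchAgentConfig = {
  engine: "elasticsearch",
  hybridSearch: false,
  autocomplete: true,
  facets: [],
  synonyms: {},
  analyticsEnabled: true,
};

// Merge caller overrides on top of the defaults.
function resolveConfig(overrides: Partial<SearchAgentConfig> = {}): SearchAgentConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}

const productSearchConfig = resolveConfig({
  hybridSearch: true,
  facets: ["color", "size", "brand"],
  synonyms: { sneakers: ["tennis shoes"], purse: ["handbag"] },
});
console.log(productSearchConfig.engine); // elasticsearch
```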

Best Practices

Weight title matches 3x higher than description matches — Users expect results where the title matches their query to appear first. If a product is titled "Red Running Shoes" and the query is "red shoes," this should outrank a product whose description mentions "red shoes" but is titled "Athletic Footwear."
Separate structured attributes from text search — Color, size, price, and category should be filters (exact match), not text search fields. When a user searches "red shoes size 9," parse the structured attributes into filters and only text-search the remaining terms. This eliminates the "blue handbags" problem.
Implement search analytics from day one — Track every search query, the results shown, and what users click. Queries with no clicks indicate poor results. Queries with clicks on result #8 indicate ranking problems. This data is essential for relevance tuning and cannot be recreated retroactively.
Add typo tolerance and stemming — "runnng shoes" and "running shoe" should return the same results. Defaults vary by engine: Typesense and Meilisearch tolerate typos out of the box, while Elasticsearch needs fuzziness set on the query and a stemming analyzer set in the mapping. Check and configure both early to avoid user frustration.
Use reciprocal rank fusion for hybrid search — When combining keyword and semantic search results, RRF produces better results than simple score averaging because it is robust to score scale differences between systems. Weight keyword results slightly higher (0.6) for e-commerce and factual queries.
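
The analytics practice above can be illustrated with a small aggregation over logged search events. This is a sketch only: the `SearchEvent` shape, the `summarizeQueries` function, and the sample events are hypothetical, and a production pipeline would more likely run this in a warehouse or analytics tool.

```typescript
interface SearchEvent {
  query: string;
  // 1-based rank of the clicked result, or null when nothing was clicked.
  clickedPosition: number | null;
}

interface QueryStats {
  query: string;
  searches: number;
  clickThroughRate: number;
  meanClickPosition: number | null;
}

// Group events by normalised query and compute simple quality signals.
function summarizeQueries(events: SearchEvent[]): QueryStats[] {
  const grouped: Record<string, SearchEvent[]> = {};
  events.forEach((e) => {
    const key = e.query.trim().toLowerCase();
    (grouped[key] = grouped[key] || []).push(e);
  });

  return Object.keys(grouped).map((query) => {
    const group = grouped[query];
    const positions = group
      .map((e) => e.clickedPosition)
      .filter((p): p is number => p !== null);
    const total = positions.reduce((acc, p) => acc + p, 0);
    return {
      query,
      searches: group.length,
      clickThroughRate: positions.length / group.length,
      meanClickPosition: positions.length > 0 ? total / positions.length : null,
    };
  });
}

// Made-up events for illustration.
const sampleEvents: SearchEvent[] = [
  { query: "red shoes", clickedPosition: 8 },
  { query: "Red Shoes", clickedPosition: 7 },
  { query: "red shoes", clickedPosition: null },
  { query: "blue handbag", clickedPosition: null },
];
console.log(JSON.stringify(summarizeQueries(sampleEvents), null, 2));
```

In this made-up data, "red shoes" gets clicks only around positions 7 and 8, which points to a ranking problem, while "blue handbag" has a click-through rate of zero, which points to poor results.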

Common Issues

Search returns too many irrelevant results — A match query with the default OR operator returns any document containing at least one query term, and BM25 only orders them, so loosely related items flood the results. Raise the minimum match threshold (e.g., require 75% of query terms to match), add field boosting to prioritize title matches, and re-rank on business metrics such as popularity and recency.
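
As one possible shape for that fix in Elasticsearch, the sketch below builds a request body that combines `multi_match` field boosts, `minimum_should_match`, and a `function_score` popularity signal. The `buildProductQuery` helper and the numeric `popularity` field are assumptions for illustration. How the 75% threshold is applied also depends on the `multi_match` type, so treat the values as a starting point to tune.

```typescript
// Build an Elasticsearch search body as a plain object (no client library needed).
function buildProductQuery(userText: string) {
  return {
    query: {
      function_score: {
        query: {
          multi_match: {
            query: userText,
            // Boosts mirror the sample audit: title^3, category^2, tags^1.5, description^1.
            fields: ["title^3", "category^2", "tags^1.5", "description"],
            // Drop documents that match too few of the query terms.
            minimum_should_match: "75%",
          },
        },
        // Assumes documents carry a numeric `popularity` field (hypothetical).
        field_value_factor: { field: "popularity", modifier: "log1p", missing: 0 },
        // Add the popularity factor to the text score rather than multiplying,
        // so documents with zero popularity keep their text relevance.
        boost_mode: "sum",
      },
    },
  };
}

console.log(JSON.stringify(buildProductQuery("running shoes"), null, 2));
```

The resulting object can be sent as the body of a `_search` request. Typesense and Meilisearch expose different knobs for the same ideas.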

Autocomplete suggestions are slow or irrelevant — The autocomplete queries the full index, which is slow and returns the same results as regular search. Create a dedicated autocomplete index with edge-ngram tokenization on titles and popular query logs. This index should be optimized for prefix matching speed, not full-text relevance.
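
To show what edge-ngram tokenization produces, the sketch below generates the prefixes for one term and pairs it with an Elasticsearch index body that uses the `edge_ngram` tokenizer. The `edgeNgrams` helper and the analyzer names (`autocomplete_index`, `autocomplete_search`) are hypothetical, and the gram lengths are assumptions to adjust for your data.

```typescript
// Produce the prefixes an edge n-gram tokenizer would index for a single term.
function edgeNgrams(term: string, minGram: number, maxGram: number): string[] {
  const grams: string[] = [];
  const upper = Math.min(maxGram, term.length);
  for (let len = minGram; len <= upper; len++) {
    grams.push(term.slice(0, len));
  }
  return grams;
}

console.log(edgeNgrams("shoes", 2, 10).join(" ")); // sh sho shoe shoes

// Index body for a dedicated autocomplete index (Elasticsearch syntax).
const autocompleteIndexBody = {
  settings: {
    analysis: {
      tokenizer: {
        autocomplete_tokenizer: {
          type: "edge_ngram",
          min_gram: 2,
          max_gram: 10,
          token_chars: ["letter", "digit"],
        },
      },
      analyzer: {
        // Index time: expand each word into its prefixes.
        autocomplete_index: {
          type: "custom",
          tokenizer: "autocomplete_tokenizer",
          filter: ["lowercase"],
        },
        // Search time: keep the typed prefix whole so it matches an indexed gram.
        autocomplete_search: {
          type: "custom",
          tokenizer: "standard",
          filter: ["lowercase"],
        },
      },
    },
  },
  mappings: {
    properties: {
      title: {
        type: "text",
        analyzer: "autocomplete_index",
        search_analyzer: "autocomplete_search",
      },
    },
  },
};
```

Using a separate search analyzer keeps the user's partial input from being split into n-grams itself, which would otherwise match far too many titles.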

Search performance degrades as index grows — Queries that took 50ms on 100K documents now take 2 seconds on 10M documents. Add index sharding, implement result caching for popular queries (60-80% of searches are repeated queries), and use filter-first strategies that narrow the document set before applying expensive text scoring.
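
The result-caching idea can be sketched as a small in-memory cache with a time-to-live. The `QueryCache` class below is hypothetical and deliberately minimal: it keys on the normalised query text only, whereas a real cache key would also need to cover filters, pagination, and anything else that changes the results, and it has no size limit.

```typescript
// Minimal TTL cache for search results, keyed by normalised query text.
class QueryCache<T> {
  private entries: Record<string, { value: T; expiresAt: number }> = {};
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Treat "Red  Shoes" and "red shoes" as the same query.
  private keyFor(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(query: string): T | undefined {
    const key = this.keyFor(query);
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      delete this.entries[key]; // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(query: string, value: T): void {
    this.entries[this.keyFor(query)] = {
      value,
      expiresAt: this.now() + this.ttlMs,
    };
  }
}

const resultCache = new QueryCache<string[]>(60_000);
resultCache.set("Red  Shoes", ["sku-1", "sku-2"]);
console.log(resultCache.get("red shoes")); // hit: same normalised key
```

A short TTL keeps popular queries fast while bounding how stale results can get after the index changes.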
