Search Specialist Companion

An agent for expert-level search engineering. Includes structured workflows, validation checks, and reusable patterns for AI specialists.

Agent | Cliptics | ai specialists | v1.0.0 | MIT

An autonomous agent that designs and optimizes search systems. It implements full-text search, semantic search, faceted filtering, and hybrid retrieval with relevance tuning across Elasticsearch, Typesense, Meilisearch, and vector databases.

When to Use This Agent

Choose Search Specialist Companion when:

  • You need to implement search functionality with relevant, fast results
  • Existing search returns irrelevant results and needs relevance tuning
  • You want to add semantic search or hybrid search capabilities
  • You need faceted filtering, autocomplete, or search analytics

Consider alternatives when:

  • You need a simple SQL LIKE query on a small dataset (use a basic database query)
  • Your search is purely vector-based RAG retrieval (use an LLM architect agent)
  • You need web scraping or crawling (use a web scraping tool)

Quick Start

# .claude/agents/search-specialist.yml
name: search-specialist-companion
description: Design and optimize search systems
agent_prompt: |
  You are a Search Specialist. When building search systems:
  1. Analyze the data schema, query patterns, and user expectations
  2. Choose the right search engine for the use case
  3. Design index schema with proper analyzers and mappings
  4. Implement relevance tuning (boost fields, decay functions, synonyms)
  5. Add user-facing features (autocomplete, facets, highlights)
  6. Set up search analytics to measure and improve quality

  Search quality priorities: relevance > speed > features.

Example invocation:

claude "Our product search returns irrelevant results. Users search for 'red shoes size 9' and get blue handbags. Fix the search relevance."

Sample output:

Search Relevance Audit: Product Search
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Current Issues:
  1. All fields weighted equally (title = description = tags)
  2. No attribute filtering (color, size treated as text search)
  3. No synonym handling ("sneakers" ≠ "tennis shoes")
  4. BM25 scoring without field boosting

Fixes Applied:
  1. Field boosting: title^3, category^2, tags^1.5, description^1
  2. Structured filters: color, size, brand as faceted filters
  3. Synonym dictionary: sneakers↔tennis shoes, purse↔handbag
  4. Query parsing: "red shoes size 9" →
       filter: {color: "red", size: "9"}
       text: "shoes"

Before: "red shoes size 9" → 45% relevant in top 10
After:  "red shoes size 9" → 92% relevant in top 10
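The query parsing step in the sample output could be sketched roughly as follows. This is a minimal illustration, not the agent's actual parser: the `KNOWN_COLORS` vocabulary, the `ParsedQuery` shape, and the `parseProductQuery` function are all hypothetical names introduced here, and a real system would load attribute values from the catalog's facets.

```typescript
// Hypothetical vocabulary; in practice this would come from the catalog's facet values.
const KNOWN_COLORS = new Set(["red", "blue", "black", "white", "green"]);

interface ParsedQuery {
  filters: { color?: string; size?: string };
  text: string;
}

// Split a raw query into exact-match filters and the remaining free text.
function parseProductQuery(raw: string): ParsedQuery {
  const filters: ParsedQuery["filters"] = {};
  const remaining: string[] = [];
  const tokens = raw.toLowerCase().split(/\s+/).filter((t) => t.length > 0);

  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i] as string;
    const next: string | undefined = tokens[i + 1];
    if (token === "size" && next !== undefined && /^\d+(\.\d+)?$/.test(next)) {
      filters.size = next; // "size 9" becomes a structured filter
      i++;                 // skip the number that was just consumed
    } else if (KNOWN_COLORS.has(token) && filters.color === undefined) {
      filters.color = token; // first recognized color becomes a filter
    } else {
      remaining.push(token); // everything else stays in the text query
    }
  }
  return { filters, text: remaining.join(" ") };
}

const parsed = parseProductQuery("red shoes size 9");
console.log(parsed); // filters: color "red", size "9"; text: "shoes"
```

A dictionary lookup like this only handles attributes with a known, closed vocabulary. Ambiguous terms (for example a brand that is also a color) would need extra rules that are out of scope for this sketch.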

Core Concepts

Search Engine Selection

| Engine | Best For | Scale | Hosting |
|---|---|---|---|
| Elasticsearch | Enterprise, complex queries | Billions of docs | Self-hosted / Elastic Cloud |
| Typesense | Speed + simplicity | Millions of docs | Self-hosted / Typesense Cloud |
| Meilisearch | Developer experience, typo tolerance | Millions of docs | Self-hosted / Meilisearch Cloud |
| pgvector + tsvector | Existing Postgres, hybrid | Millions of docs | Existing Postgres |
| Algolia | Instant search, SaaS | Millions of records | Fully managed |

Hybrid Search Architecture

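// Combine keyword + semantic search for best relevance.
// The interfaces below are assumed minimal shapes; adapt them to your actual clients.
interface SearchHit { id: string }
interface SearchFilters { [field: string]: unknown }
interface SearchOptions { searchFields?: string[]; filters?: SearchFilters; limit?: number }
interface TextSearchClient {
  search(query: string, opts: { fields?: string[]; filters?: SearchFilters; limit: number }): Promise<SearchHit[]>;
}
interface VectorSearchClient {
  search(vector: number[], opts: { limit: number; filter?: SearchFilters }): Promise<SearchHit[]>;
}
interface EmbeddingModel { embed(text: string): Promise<number[]> }
interface FusionWeights { textWeight: number; vectorWeight: number }

class HybridSearchEngine {
  constructor(
    private textSearch: TextSearchClient,     // Elasticsearch/Typesense
    private vectorSearch: VectorSearchClient, // Pinecone/pgvector
    private embedder: EmbeddingModel
  ) {}

  async search(query: string, options: SearchOptions) {
    // 1. Run both searches in parallel
    const [textResults, vectorResults] = await Promise.all([
      this.textSearch.search(query, {
        fields: options.searchFields,
        filters: options.filters,
        limit: 50
      }),
      this.vectorSearch.search(
        await this.embedder.embed(query),
        { limit: 50, filter: options.filters }
      )
    ]);

    // 2. Reciprocal Rank Fusion (RRF) to merge results
    const merged = this.reciprocalRankFusion(textResults, vectorResults, {
      textWeight: 0.6,
      vectorWeight: 0.4
    });

    return merged.slice(0, options.limit || 20);
  }

  private reciprocalRankFusion(
    textResults: SearchHit[],
    vectorResults: SearchHit[],
    weights: FusionWeights
  ) {
    const scores = new Map<string, number>();
    const k = 60; // RRF constant

    // forEach indexes from 0, but RRF ranks start at 1, hence index + 1.
    textResults.forEach((doc, index) => {
      const score = weights.textWeight * (1 / (k + index + 1));
      scores.set(doc.id, (scores.get(doc.id) || 0) + score);
    });

    vectorResults.forEach((doc, index) => {
      const score = weights.vectorWeight * (1 / (k + index + 1));
      scores.set(doc.id, (scores.get(doc.id) || 0) + score);
    });

    // Highest fused score first
    return Array.from(scores.entries())
      .sort((a, b) => b[1] - a[1])
      .map(([id, score]) => ({ id, score }));
  }
}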
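To make the fusion step concrete, here is a small worked example with two three-item result lists. It is a standalone sketch: `weightedRrf`, `Ranked`, and the document IDs are illustrative names introduced here, and it assumes 1-based ranks with the same k = 60 and 0.6/0.4 weights used above.

```typescript
type Ranked = { id: string }[];

// Weighted reciprocal rank fusion: each list contributes weight / (k + rank) per document.
function weightedRrf(
  lists: { results: Ranked; weight: number }[],
  k = 60
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const { results, weight } of lists) {
    results.forEach((doc, index) => {
      const rank = index + 1; // ranks start at 1
      scores.set(doc.id, (scores.get(doc.id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

const keywordHits: Ranked = [{ id: "A" }, { id: "B" }, { id: "C" }];
const semanticHits: Ranked = [{ id: "B" }, { id: "D" }, { id: "A" }];

const fused = weightedRrf([
  { results: keywordHits, weight: 0.6 },
  { results: semanticHits, weight: 0.4 },
]);

// A: 0.6/61 + 0.4/63 ≈ 0.01619, B: 0.6/62 + 0.4/61 ≈ 0.01623,
// C: 0.6/63 ≈ 0.00952, D: 0.4/62 ≈ 0.00645
console.log(fused.map((r) => r.id)); // [ 'B', 'A', 'C', 'D' ]
```

Documents that appear in both lists (A and B) outrank those found by only one system, and only rank positions matter, so the raw BM25 and cosine scores never need to be normalized against each other.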

Configuration

| Option | Type | Default | Description |
|---|---|---|---|
| engine | string | "elasticsearch" | Search engine: elasticsearch, typesense, meilisearch |
| hybridSearch | boolean | false | Enable semantic + keyword hybrid search |
| autocomplete | boolean | true | Enable search-as-you-type suggestions |
| facets | string[] | [] | Fields to use as faceted filters |
| synonyms | object | {} | Synonym dictionary for query expansion |
| analyticsEnabled | boolean | true | Track search queries and click-through rates |
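As an illustration of how these options fit together, the sketch below types them and merges user overrides onto the documented defaults. The `SearchAgentConfig` interface, the `resolveConfig` helper, and the shape of `synonyms` (term mapped to a list of equivalents) are assumptions made for this example; the template does not specify how the agent actually reads its configuration.

```typescript
// Hypothetical typing of the options table; the synonyms shape is an assumption.
interface SearchAgentConfig {
  engine: "elasticsearch" | "typesense" | "meilisearch";
  hybridSearch: boolean;
  autocomplete: boolean;
  facets: string[];
  synonyms: Record<string, string[]>;
  analyticsEnabled: boolean;
}

// Defaults taken from the table above.
const DEFAULT_CONFIG: SearchAgentConfig = {
  engine: "elasticsearch",
  hybridSearch: false,
  autocomplete: true,
  facets: [],
  synonyms: {},
  analyticsEnabled: true,
};

// Shallow-merge overrides onto the defaults.
function resolveConfig(overrides: Partial<SearchAgentConfig>): SearchAgentConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}

const config = resolveConfig({
  hybridSearch: true,
  facets: ["color", "size", "brand"],
  synonyms: { sneakers: ["tennis shoes"], purse: ["handbag"] },
});
console.log(config.engine, config.hybridSearch, config.facets);
```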

Best Practices

  1. Weight title matches 3x higher than description matches. Users expect results where the title matches their query to appear first. If a product is titled "Red Running Shoes" and the query is "red shoes," this should outrank a product whose description mentions "red shoes" but is titled "Athletic Footwear."

  2. Separate structured attributes from text search. Color, size, price, and category should be filters (exact match), not text search fields. When a user searches "red shoes size 9," parse the structured attributes into filters and only text-search the remaining terms. This eliminates the "blue handbags" problem.

  3. Implement search analytics from day one. Track every search query, the results shown, and what users click. Queries with no clicks indicate poor results. Queries with clicks on result #8 indicate ranking problems. This data is essential for relevance tuning and cannot be recreated retroactively.

  4. Add typo tolerance and stemming. "runnng shoes" and "running shoe" should return the same results. Most search engines support fuzzy matching and stemming out of the box, but they are often not enabled by default. Configure them early to avoid user frustration.

  5. Use reciprocal rank fusion for hybrid search. When combining keyword and semantic search results, RRF produces better results than simple score averaging because it is robust to score scale differences between systems. Weight keyword results slightly higher (0.6) for e-commerce and factual queries.
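The analytics practice (item 3) can be illustrated with a small aggregation over logged search events. This is a sketch under assumptions: the `SearchEvent` and `QueryStats` shapes and the `summarize` function are hypothetical, and a production pipeline would likely store events in a database or analytics service rather than an in-memory array.

```typescript
interface SearchEvent {
  query: string;
  clickedPosition: number | null; // 1-based rank of the clicked result, null if nothing was clicked
}

interface QueryStats {
  searches: number;
  clicks: number;
  clickThroughRate: number;
  meanClickPosition: number | null; // null when the query never received a click
}

// Aggregate events per normalized query string.
function summarize(events: SearchEvent[]): Map<string, QueryStats> {
  const acc = new Map<string, { searches: number; clicks: number; positionSum: number }>();
  for (const e of events) {
    const key = e.query.trim().toLowerCase();
    const entry = acc.get(key) ?? { searches: 0, clicks: 0, positionSum: 0 };
    entry.searches += 1;
    if (e.clickedPosition !== null) {
      entry.clicks += 1;
      entry.positionSum += e.clickedPosition;
    }
    acc.set(key, entry);
  }

  const stats = new Map<string, QueryStats>();
  acc.forEach((v, query) => {
    stats.set(query, {
      searches: v.searches,
      clicks: v.clicks,
      clickThroughRate: v.clicks / v.searches,
      meanClickPosition: v.clicks > 0 ? v.positionSum / v.clicks : null,
    });
  });
  return stats;
}

const stats = summarize([
  { query: "red shoes", clickedPosition: 8 },
  { query: "Red Shoes", clickedPosition: 6 },
  { query: "blue handbag", clickedPosition: null },
]);
console.log(stats.get("red shoes"));    // clicked, but deep in the list: a ranking problem
console.log(stats.get("blue handbag")); // never clicked: likely poor results
```

Queries with a click-through rate of zero and queries with a high mean click position are the two signals the practice describes, so sorting the map by either value gives a first list of queries to tune.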

Common Issues

Search returns too many irrelevant results. With the default OR matching, a query returns every document that contains any query term, and BM25 only ranks them, so results are flooded with loosely related items. Increase the minimum match threshold (e.g., require 75% of query terms to match), add field boosting to prioritize title matches, and implement re-ranking based on business metrics (popularity, recency).
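For Elasticsearch, the first two fixes might look like the request body below. It is a sketch, not a tuned configuration: `buildProductQuery` is a hypothetical helper, the field names and boosts are borrowed from the sample output earlier in this template, and the `term` filters assume those fields are mapped as `keyword`. Re-ranking on business metrics is not shown.

```typescript
// Build an Elasticsearch Query DSL body: boosted multi_match plus exact-match filters.
function buildProductQuery(text: string, filters: Record<string, string>) {
  return {
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query: text,
              // Title matches count three times as much as description matches.
              fields: ["title^3", "category^2", "tags^1.5", "description"],
              // Require at least 75% of the query terms to match.
              minimum_should_match: "75%",
            },
          },
        ],
        // Filters narrow the candidate set without affecting the relevance score.
        filter: Object.keys(filters).map((field) => ({
          term: { [field]: filters[field] },
        })),
      },
    },
  };
}

const body = buildProductQuery("shoes", { color: "red", size: "9" });
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be sent as the body of a `_search` request. Typesense and Meilisearch expose comparable controls under different names, so this shape applies to Elasticsearch only.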

Autocomplete suggestions are slow or irrelevant. Autocomplete that queries the full index is slow and returns the same results as regular search. Create a dedicated autocomplete index with edge-ngram tokenization on titles and popular query logs. This index should be optimized for prefix matching speed, not full-text relevance.
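To show what edge-ngram tokenization does, the sketch below generates the prefixes of each word and stores them in a small in-memory lookup. The `edgeNgrams` function and `PrefixIndex` class are illustrative stand-ins written for this example. In a real deployment the search engine's own edge n-gram tokenizer would build the index, and the gram lengths here (2 to 10) are arbitrary assumptions.

```typescript
// Produce the leading substrings of a term, e.g. "shoes" -> "sh", "sho", "shoe", "shoes".
function edgeNgrams(term: string, minGram = 2, maxGram = 10): string[] {
  const grams: string[] = [];
  const upper = Math.min(maxGram, term.length);
  for (let len = minGram; len <= upper; len++) {
    grams.push(term.slice(0, len));
  }
  return grams;
}

// Toy prefix index: every edge n-gram of every word points at the full suggestion.
class PrefixIndex {
  private index = new Map<string, Set<string>>();

  add(suggestion: string): void {
    for (const word of suggestion.toLowerCase().split(/\s+/)) {
      for (const gram of edgeNgrams(word)) {
        const bucket = this.index.get(gram) ?? new Set<string>();
        bucket.add(suggestion);
        this.index.set(gram, bucket);
      }
    }
  }

  // A lookup is a single exact-key read, which is why prefix matching stays fast.
  suggest(prefix: string, limit = 5): string[] {
    const bucket = this.index.get(prefix.toLowerCase());
    return bucket ? Array.from(bucket).slice(0, limit) : [];
  }
}

const autocompleteIndex = new PrefixIndex();
["Red Running Shoes", "Running Socks", "Rain Jacket"].forEach((s) => autocompleteIndex.add(s));

console.log(edgeNgrams("shoes"));              // [ 'sh', 'sho', 'shoe', 'shoes' ]
console.log(autocompleteIndex.suggest("run")); // [ 'Red Running Shoes', 'Running Socks' ]
```

The cost moves to index time: each word is stored once per prefix length, which is acceptable for short fields such as titles and popular queries but wasteful for long descriptions.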

Search performance degrades as index grows. Queries that took 50ms on 100K documents now take 2 seconds on 10M documents. Add index sharding, implement result caching for popular queries (60-80% of searches are repeated queries), and use filter-first strategies that narrow the document set before applying expensive text scoring.
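Result caching for popular queries could be sketched as a small in-process cache with a time-to-live and least-recently-used eviction. The `QueryCache` class, its default sizes, and the key normalization are assumptions made for illustration. A multi-node deployment would more likely use a shared cache, and entries would need invalidating when the index changes.

```typescript
// In-memory query result cache with TTL expiry and LRU eviction.
class QueryCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();

  constructor(private maxEntries = 1000, private ttlMs = 60000) {}

  // Normalize so that "Red  Shoes" and "red shoes" share one entry.
  private key(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(query: string, now = Date.now()): T | undefined {
    const k = this.key(query);
    const hit = this.entries.get(k);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(k); // expired
      return undefined;
    }
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.entries.delete(k);
    this.entries.set(k, hit);
    return hit.value;
  }

  set(query: string, value: T, now = Date.now()): void {
    const k = this.key(query);
    this.entries.delete(k);
    if (this.entries.size >= this.maxEntries) {
      // The first key in insertion order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(k, { value, expiresAt: now + this.ttlMs });
  }
}

const cache = new QueryCache<string[]>(1000, 60000);
cache.set("Red  Shoes", ["sku-1", "sku-2"]);
console.log(cache.get("red shoes")); // served from cache, no search engine round trip
```

The optional `now` parameter exists only to make expiry testable without waiting. Callers would normally omit it.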
