
Tavily Web Complete

Boost productivity with AI-optimized web search, content extraction, and crawling. Includes structured workflows, validation checks, and reusable patterns for web development.


Tavily Web Search

A web search and research skill leveraging the Tavily API for AI-optimized web search, content extraction, and research automation with structured results designed for LLM consumption.

When to Use

Choose Tavily Web Search when:

  • Performing AI-optimized web searches that return LLM-ready content
  • Building research agents that need factual, up-to-date web information
  • Automating competitive analysis and market research with web data
  • Extracting clean content from search results without scraping infrastructure

Consider alternatives when:

  • Simple keyword search is enough — use a standard search API
  • Searching your own knowledge base — use a vector database
  • Monitoring specific URLs for changes — use a dedicated monitoring tool

Quick Start

```bash
# Install the Tavily SDK
pip install tavily-python
# or, for JavaScript/TypeScript
npm install @tavily/core
```
```python
from tavily import TavilyClient

client = TavilyClient(api_key="YOUR_TAVILY_API_KEY")

# Basic search
results = client.search(
    query="best practices for microservice architecture 2024",
    search_depth="advanced",
    max_results=10
)

for result in results["results"]:
    print(f"Title: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Content: {result['content'][:200]}")
    print()

# Search with content extraction
detailed = client.search(
    query="React server components tutorial",
    search_depth="advanced",
    include_raw_content=True,
    max_results=5
)

# Get search context for AI (summarized)
context = client.get_search_context(
    query="What are the latest developments in quantum computing?",
    max_tokens=4000
)
print(context)  # Pre-formatted context ready for LLM prompts

# Quick answer (AI-generated from search results)
answer = client.qna_search(
    query="What is the current market cap of NVIDIA?"
)
print(answer)
```
```typescript
import { tavily } from '@tavily/core';

const client = tavily({ apiKey: process.env.TAVILY_API_KEY });

async function researchTopic(topic: string) {
  const response = await client.search(topic, {
    searchDepth: 'advanced',
    maxResults: 10,
    includeAnswer: true,
    includeDomains: ['arxiv.org', 'github.com', 'medium.com']
  });

  return {
    answer: response.answer,
    sources: response.results.map(r => ({
      title: r.title,
      url: r.url,
      snippet: r.content
    }))
  };
}
```

Core Concepts

Search Modes

| Mode | Method | Use Case | Cost |
|------|--------|----------|------|
| Basic | `search()` | Quick factual lookups | Low |
| Advanced | `search(search_depth="advanced")` | In-depth research | Medium |
| Context | `get_search_context()` | LLM prompt augmentation | Medium |
| QnA | `qna_search()` | Direct answer extraction | Medium |

Research Agent Pattern

```python
class ResearchAgent:
    def __init__(self, tavily_client):
        self.client = tavily_client
        self.research_data = []

    def multi_query_research(self, topic, num_queries=5):
        """Generate and execute multiple search queries for thorough research."""
        queries = self._generate_queries(topic)
        for query in queries[:num_queries]:
            results = self.client.search(
                query=query,
                search_depth="advanced",
                max_results=5,
                include_raw_content=True
            )
            for r in results.get("results", []):
                self.research_data.append({
                    "query": query,
                    "title": r["title"],
                    "url": r["url"],
                    "content": r.get("raw_content", r["content"]),
                    "relevance_score": r.get("score", 0)
                })

        # Deduplicate by URL
        seen = set()
        unique = []
        for item in self.research_data:
            if item["url"] not in seen:
                seen.add(item["url"])
                unique.append(item)
        self.research_data = unique
        return self.research_data

    def _generate_queries(self, topic):
        """Generate diverse search queries for a topic."""
        return [
            topic,
            f"{topic} best practices 2024",
            f"{topic} challenges and solutions",
            f"{topic} case studies",
            f"{topic} comparison alternatives"
        ]

    def get_context_string(self, max_tokens=4000):
        """Format research data as context for an LLM."""
        return self.client.get_search_context(
            query=self.research_data[0]["query"] if self.research_data else "",
            max_tokens=max_tokens
        )
```

Configuration

| Option | Description | Default |
|--------|-------------|---------|
| `api_key` | Tavily API key | Required |
| `search_depth` | Search thoroughness: `basic`, `advanced` | `"basic"` |
| `max_results` | Maximum results per query | `5` |
| `include_answer` | Generate AI answer from results | `false` |
| `include_raw_content` | Include full page content | `false` |
| `include_domains` | Limit search to specific domains | `[]` |
| `exclude_domains` | Exclude specific domains | `[]` |
| `max_tokens` | Token limit for context mode | `4000` |

Best Practices

  1. Use search_depth: "advanced" for research tasks where accuracy matters — advanced search queries more sources and provides more detailed content extraction than basic mode
  2. Use get_search_context for RAG pipelines instead of raw search results — it returns pre-formatted, deduplicated text optimized for LLM context windows with proper source attribution
  3. Filter with include_domains when you need authoritative sources — limiting searches to domains like arxiv.org, github.com, or .gov improves result quality for academic and technical research
  4. Cache search results when developing research agents to avoid redundant API calls during testing and iteration — Tavily charges per search, and development workflows often repeat similar queries
  5. Combine qna_search with detailed search by first using QnA for a quick answer, then running a detailed search to verify the answer and gather supporting evidence from multiple sources
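The caching advice above (practice 4) can be sketched with a small wrapper. This helper is not part of the Tavily SDK — `cache_key` and `cached_search` are illustrative names, and the in-memory dict stands in for whatever cache backend you prefer:

```python
import hashlib
import json

# Hypothetical helper: an in-memory cache keyed on the query plus its
# search parameters, so identical searches during development cost one
# API call instead of many.
_cache = {}

def cache_key(query, **params):
    """Build a stable key from the query and its search parameters."""
    payload = json.dumps({"query": query, **params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_search(client, query, **params):
    """Return a cached result if the same search was run before."""
    key = cache_key(query, **params)
    if key not in _cache:
        _cache[key] = client.search(query=query, **params)
    return _cache[key]
```

For longer-lived development sessions, the same key scheme works with a file- or Redis-backed store.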

Common Issues

Search results missing recent information: Tavily's search index may not have the very latest content (last few hours). For time-critical queries, add date qualifiers to your search query and cross-reference with a real-time news API for breaking information.
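One simple way to apply the date-qualifier advice is to append the current year to the query string. This is a plain query heuristic, not a documented Tavily operator:

```python
from datetime import date

def with_recency(query):
    """Append the current year to bias results toward recent content."""
    return f"{query} {date.today().year}"
```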

Raw content extraction returning noise: Some pages include navigation, ads, and unrelated sidebar content in the raw extraction. Post-process raw content by looking for the main content section, filtering out common boilerplate patterns, and truncating to a reasonable length.
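A minimal post-processing sketch for the noise problem above. The boilerplate patterns and the 40-character line threshold are assumptions to tune for your sources, not a Tavily feature:

```python
import re

# Lines starting with common boilerplate phrases (cookie banners,
# login/share prompts, ads) are dropped; so are very short lines,
# which are usually navigation labels rather than article text.
BOILERPLATE = re.compile(
    r"^(cookie|subscribe|sign up|log in|accept|share this|advertisement)",
    re.IGNORECASE,
)

def clean_raw_content(text, max_chars=4000):
    """Drop short or boilerplate-looking lines, then truncate."""
    lines = [ln.strip() for ln in text.splitlines()]
    kept = [ln for ln in lines if len(ln) > 40 and not BOILERPLATE.match(ln)]
    return "\n".join(kept)[:max_chars]
```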

API rate limits during batch research: Running many concurrent search queries hits rate limits. Implement sequential processing with short delays between queries, batch research tasks into groups, and use the get_search_context method for efficiency when you need summarized content rather than full results.
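The sequential-with-delays approach can be sketched as follows. The delay and retry values are assumptions to tune against your plan's actual limits:

```python
import time

def sequential_search(client, queries, delay=1.0, retries=3):
    """Execute queries one at a time, pausing between calls and
    retrying with exponential backoff on failure."""
    results = []
    for query in queries:
        for attempt in range(retries):
            try:
                results.append(client.search(query=query, max_results=5))
                break
            except Exception:
                # Back off exponentially: delay, 2*delay, 4*delay, ...
                time.sleep(delay * (2 ** attempt))
        time.sleep(delay)  # pace requests even on success
    return results
```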
