
Exa Workspace

Production-ready skill for semantic search, content extraction, and similar-page discovery. Includes structured workflows, validation checks, and reusable patterns for web development.

Skill · Cliptics · web development · v1.0.0 · MIT


A semantic search and knowledge retrieval skill that uses the Exa API to find, extract, and analyze web content with neural search, for research and data collection.

When to Use

Choose Exa Workspace when:

  • Performing semantic web searches that understand meaning beyond keyword matching
  • Extracting clean content from web pages for research and analysis
  • Building automated research pipelines with structured data extraction
  • Finding similar pages to a given URL or content pattern

Consider alternatives when:

  • Simple keyword search suffices — use a standard search API
  • Scraping specific known websites — use targeted web scraping tools
  • Searching your own databases — use internal search engines

Quick Start

```shell
# Install the Exa SDK
pip install exa-py
# or, for JavaScript/TypeScript:
npm install exa-js
```
```python
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")

# Semantic search — finds results by meaning, not just keywords
results = exa.search(
    "best practices for microservice architecture in production",
    num_results=10,
    use_autoprompt=True,
    type="auto",
)
for r in results.results:
    print(f"{r.title}: {r.url}")

# Search and get full content
results_with_content = exa.search_and_contents(
    "React server components tutorial",
    text=True,
    num_results=5,
)
for r in results_with_content.results:
    print(f"\n=== {r.title} ===")
    print(r.text[:500])

# Find similar pages to a URL
similar = exa.find_similar(
    url="https://example-blog.com/great-article",
    num_results=10,
)
```
```typescript
import Exa from 'exa-js';

const exa = new Exa(process.env.EXA_API_KEY);

async function researchTopic(topic: string) {
  // Search with auto-prompt for better results
  const results = await exa.searchAndContents(topic, {
    numResults: 10,
    useAutoprompt: true,
    text: { maxCharacters: 2000 }
  });

  // Extract and structure findings
  return results.results.map(r => ({
    title: r.title,
    url: r.url,
    summary: r.text?.substring(0, 500),
    publishedDate: r.publishedDate
  }));
}

// Find competing products/companies
async function competitorAnalysis(companyUrl: string) {
  const similar = await exa.findSimilar(companyUrl, {
    numResults: 20,
    excludeSourceDomain: true
  });
  return similar.results;
}
```

Core Concepts

Exa Search Types

| Search Type | Description | Best For |
|-------------|-------------|----------|
| `neural` | Semantic meaning-based search | Research, concept queries |
| `keyword` | Traditional keyword matching | Exact phrase lookup |
| `auto` | Exa chooses the best approach | General use |
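Which type to request can be decided up front from the shape of the query. A minimal, hypothetical heuristic (the function name and rules are illustrative, not part of the Exa SDK):

```python
def choose_search_type(query: str) -> str:
    """Pick an explicit Exa search type based on query shape (illustrative)."""
    # Quoted phrases and error-message-style strings match best literally.
    if '"' in query or "Error" in query:
        return "keyword"
    # Natural-language research questions benefit from semantic search.
    return "neural"

print(choose_search_type('"ECONNREFUSED 127.0.0.1"'))          # keyword
print(choose_search_type("how do teams structure monorepos"))  # neural
```

In practice, `type="auto"` covers most cases; an explicit choice like this mainly helps when autoprompt or semantic matching is known to hurt a given query class.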

Research Pipeline Architecture

```python
class ResearchPipeline:
    def __init__(self, exa_client):
        self.exa = exa_client
        self.findings = []

    def search_phase(self, queries, results_per_query=10):
        """Broad search across multiple queries"""
        for query in queries:
            results = self.exa.search_and_contents(
                query,
                num_results=results_per_query,
                use_autoprompt=True,
                text=True,
            )
            for r in results.results:
                self.findings.append({
                    'query': query,
                    'title': r.title,
                    'url': r.url,
                    'content': r.text,
                    'date': r.published_date,
                    'score': r.score,
                })

    def find_similar_phase(self, seed_urls):
        """Expand findings with similar content"""
        for url in seed_urls:
            similar = self.exa.find_similar(
                url,
                num_results=5,
                exclude_source_domain=True,
            )
            for r in similar.results:
                self.findings.append({
                    'query': f'similar_to:{url}',
                    'title': r.title,
                    'url': r.url,
                    'score': r.score,
                })

    def deduplicate(self):
        """Remove duplicate URLs"""
        seen = set()
        unique = []
        for f in self.findings:
            if f['url'] not in seen:
                seen.add(f['url'])
                unique.append(f)
        self.findings = unique

    def rank_by_relevance(self):
        """Sort findings by relevance score"""
        self.findings.sort(key=lambda x: x.get('score', 0), reverse=True)
        return self.findings[:20]
```
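The deduplicate and rank steps are plain list operations and can be exercised without an API key. A standalone sketch of the same logic on sample findings (URLs and scores are made up for illustration):

```python
def deduplicate(findings):
    """Keep only the first occurrence of each URL."""
    seen, unique = set(), []
    for f in findings:
        if f["url"] not in seen:
            seen.add(f["url"])
            unique.append(f)
    return unique

def rank_by_relevance(findings, top_n=20):
    """Highest-scoring findings first, truncated to top_n."""
    return sorted(findings, key=lambda x: x.get("score", 0), reverse=True)[:top_n]

findings = [
    {"url": "https://a.example", "score": 0.91},
    {"url": "https://b.example", "score": 0.72},
    {"url": "https://a.example", "score": 0.55},  # duplicate URL, dropped
]
top = rank_by_relevance(deduplicate(findings))
print([f["url"] for f in top])  # ['https://a.example', 'https://b.example']
```

Deduplicating before ranking matters: the same URL often surfaces from several queries with different scores, and keeping the first occurrence preserves the query context that found it.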

Configuration

| Option | Description | Default |
|--------|-------------|---------|
| `api_key` | Exa API key for authentication | Required |
| `search_type` | Search mode: `neural`, `keyword`, `auto` | `"auto"` |
| `num_results` | Number of results per query | `10` |
| `use_autoprompt` | Let Exa optimize your query | `true` |
| `include_text` | Extract page text content | `false` |
| `max_characters` | Maximum text per result | `2000` |
| `include_domains` | Limit search to specific domains | `[]` |
| `exclude_domains` | Exclude specific domains | `[]` |
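These options correspond to keyword arguments on the Python SDK's search methods. A hedged sketch of assembling a fully configured call (the domain lists are illustrative, not recommendations):

```python
# Keyword arguments for a configured search; domain lists are illustrative.
search_kwargs = dict(
    type="neural",             # search mode: neural | keyword | auto
    num_results=10,
    use_autoprompt=True,
    include_domains=["arxiv.org", "docs.python.org"],
    exclude_domains=["content-farm.example"],
)
# With a configured client from the Quick Start, this would be:
# results = exa.search_and_contents(query, text=True, **search_kwargs)
print(sorted(search_kwargs))
# ['exclude_domains', 'include_domains', 'num_results', 'type', 'use_autoprompt']
```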

Best Practices

  1. Use use_autoprompt: true for natural language queries so Exa can optimize your search prompt for its neural search model — this often produces significantly better results than raw queries
  2. Combine search types strategically — use neural search for conceptual research questions and keyword search for finding specific technical terms, error messages, or exact phrases
  3. Filter by date for time-sensitive topics using start_published_date and end_published_date to avoid outdated information mixing with current results
  4. Use domain filtering to focus research on authoritative sources by including trusted domains or excluding content farms and low-quality sources
  5. Extract content with text: true when building research summaries to avoid separate web scraping calls; Exa returns clean extracted text without navigation, ads, and boilerplate
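Practices 3 and 4 combine naturally in a single call. A sketch assuming `start_published_date` accepts an ISO 8601 date string, with illustrative domains (tune the window and domain list per project):

```python
from datetime import date, timedelta

# Restrict results to roughly the last year (practice 3) on trusted
# domains (practice 4). Dates are formatted as ISO 8601 strings.
start = (date.today() - timedelta(days=365)).isoformat()

recent_kwargs = dict(
    start_published_date=start,
    include_domains=["docs.exa.ai", "github.com"],  # illustrative
)
# results = exa.search_and_contents(query, text=True, **recent_kwargs)
print(recent_kwargs["start_published_date"] < date.today().isoformat())  # True
```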

Common Issues

Autoprompt changing query intent: The autoprompt feature occasionally restructures queries in ways that shift the search focus. For precise, technical queries where exact phrasing matters, disable autoprompt and use keyword search type instead.
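A sketch of that workaround, pinning keyword mode and disabling autoprompt (the example query is illustrative):

```python
# When exact phrasing matters, pin the search mode and skip autoprompt.
precise_kwargs = {
    "type": "keyword",         # literal matching, no semantic rewriting
    "use_autoprompt": False,   # keep the query exactly as written
    "num_results": 10,
}
# With a client from the Quick Start:
# results = exa.search('"ECONNREFUSED 127.0.0.1:5432"', **precise_kwargs)
print(precise_kwargs["use_autoprompt"])  # False
```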

Rate limiting with large research batches: Sending many concurrent search requests hits API rate limits. Implement sequential processing with delays between requests, batch queries into groups with pauses, and cache results locally to avoid redundant searches when refining your research.
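This pattern can be sketched as a small wrapper. Here `search_fn` stands in for a real `exa.search_and_contents` call, and the delay value is an assumption to tune against your plan's limits:

```python
import time

def run_queries(queries, search_fn, cache=None, delay_s=1.0):
    """Run queries one at a time, pausing only between uncached requests."""
    cache = {} if cache is None else cache
    results = {}
    for q in queries:
        if q in cache:                 # reuse earlier work when refining
            results[q] = cache[q]
            continue
        results[q] = cache[q] = search_fn(q)
        time.sleep(delay_s)            # stay under the rate limit
    return results

# Demo with a stand-in search function (no API calls made).
calls = []
def fake_search(q):
    calls.append(q)
    return f"results for {q}"

cache = {}
run_queries(["a", "b"], fake_search, cache, delay_s=0)
run_queries(["a", "b", "c"], fake_search, cache, delay_s=0)
print(len(calls))  # 3: "a" and "b" were served from cache the second time
```

Passing the same `cache` dict across runs is what makes iterative query refinement cheap; persisting it to disk (e.g. with `json` or `shelve`) extends that across sessions.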

Content extraction returning incomplete text: Some websites block content extraction or have complex layouts that the extractor cannot parse fully. Check the extracted text length, fall back to browser-based extraction for important pages, and verify that the extracted content covers the key sections you need.
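A minimal guard for that case; the 200-character threshold and the `required_terms` check are assumptions to tune per project:

```python
def needs_fallback(text, min_chars=200, required_terms=()):
    """Flag extractions that are too short or missing key sections."""
    if text is None or len(text) < min_chars:
        return True
    return any(term.lower() not in text.lower() for term in required_terms)

# e.g. check each result's extracted text before trusting it:
print(needs_fallback(""))                                         # True
print(needs_fallback("x" * 500, required_terms=("pricing",)))     # True
print(needs_fallback("Pricing: " + "x" * 500,
                     required_terms=("pricing",)))                # False
```

Pages flagged here are candidates for a browser-based extractor or a manual check before their content feeds into summaries.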
