CrewAI - Multi-Agent Orchestration Framework
Overview
CrewAI is a Python framework for building teams of specialized AI agents that collaborate autonomously to complete complex tasks. The core idea is simple: instead of one monolithic prompt doing everything, you define multiple agents -- each with a distinct role, goal, and backstory -- and let them work together through structured processes (sequential or hierarchical). Think of it as building a virtual team where a researcher gathers data, an analyst processes it, and a writer produces the final report.
CrewAI stands out because it is entirely standalone (no LangChain dependency), lightweight to install, and provides two complementary paradigms: Crews for autonomous multi-agent collaboration and Flows for event-driven orchestration with explicit state management. It ships with 50+ built-in tools, a YAML-based configuration system, and supports any LLM provider through LiteLLM.
When to Use
- Building multi-agent workflows where each agent has a clear specialty (researcher, coder, editor, analyst)
- Automating content pipelines: research --> write --> edit --> publish
- Creating autonomous data analysis teams that gather, process, and report on information
- Running sequential or hierarchical task chains where output from one agent feeds into the next
- Delegating tasks dynamically at runtime via a hierarchical manager agent
- Building event-driven workflows with conditional routing using Flows
- Seeking a simpler, more opinionated alternative to LangChain/LangGraph for multi-agent scenarios
- Production systems requiring built-in memory, caching, rate limiting, and observability
Quick Start
Installation
```bash
# Core framework
pip install crewai

# Include 50+ built-in tools (web search, scraping, file I/O, PDF parsing)
pip install 'crewai[tools]'

# Set your LLM API key
export OPENAI_API_KEY="sk-..."
# Or for Anthropic:
export ANTHROPIC_API_KEY="sk-ant-..."
```
CLI Project Scaffolding
```bash
# Generate a complete project structure
crewai create crew market_analysis
cd market_analysis

# Install project dependencies
crewai install

# Execute the crew
crewai run
```
Minimal Code Example (no project structure needed)
```python
from crewai import Agent, Task, Crew, Process

# Define specialized agents
researcher = Agent(
    role="Market Research Analyst",
    goal="Find comprehensive data about the target market",
    backstory=(
        "You are a senior market analyst at a top consulting firm. "
        "You have 15 years of experience analyzing technology markets "
        "and identifying emerging trends before they become mainstream."
    ),
    verbose=True,
    llm="gpt-4o"
)

strategist = Agent(
    role="Business Strategist",
    goal="Develop actionable market entry strategies based on research data",
    backstory=(
        "You are a former McKinsey partner who now advises startups. "
        "You excel at translating market data into concrete go-to-market plans."
    ),
    verbose=True,
    llm="gpt-4o"
)

# Define tasks with explicit dependencies
research_task = Task(
    description=(
        "Research the {industry} market in {region}. Identify the top 5 competitors, "
        "market size, growth rate, and key trends for 2025-2026."
    ),
    expected_output="A structured report with market size, 5 competitors, growth rate, and 3 key trends.",
    agent=researcher
)

strategy_task = Task(
    description="Based on the research, create a market entry strategy for a new {product_type} product.",
    expected_output="A 1-page strategy document with positioning, pricing, and go-to-market channels.",
    agent=strategist,
    context=[research_task],  # Receives research output as context
    output_file="strategy.md"
)

# Assemble and run the crew
crew = Crew(
    agents=[researcher, strategist],
    tasks=[research_task, strategy_task],
    process=Process.sequential,
    verbose=True
)

result = crew.kickoff(inputs={
    "industry": "AI developer tools",
    "region": "North America",
    "product_type": "code review automation"
})

print(result.raw)
print(f"Token usage: {result.token_usage}")
```
Core Concepts
Agents
An Agent is an autonomous worker defined by three personality attributes (role, goal, backstory) plus configuration for its LLM, available tools, and behavioral constraints.
```python
from crewai import Agent, LLM

# Use any LLM provider via LiteLLM
llm = LLM(model="claude-sonnet-4-5-20250929")  # Anthropic
# llm = LLM(model="gpt-4o")  # OpenAI
# llm = LLM(model="ollama/llama3.1", base_url="http://localhost:11434")  # Local

agent = Agent(
    role="Security Auditor",
    goal="Identify vulnerabilities in code and suggest fixes",
    backstory="You are a CISSP-certified security engineer who has audited Fortune 500 codebases.",
    llm=llm,
    tools=[],               # Tools this agent can use
    memory=True,            # Enable short/long-term memory
    verbose=True,           # Log reasoning steps
    allow_delegation=True,  # Can delegate subtasks to other agents
    max_iter=15,            # Max reasoning iterations before stopping
    max_rpm=10,             # Rate limit (requests per minute)
    max_retry_limit=2       # Retry failed tool calls
)
```
Tasks
A Task is a unit of work assigned to an agent. Tasks define what needs to be done, what the expected output looks like, and how they connect to other tasks.
```python
from crewai import Task

task = Task(
    description=(
        "Audit the authentication module for SQL injection, XSS, and "
        "CSRF vulnerabilities. Provide severity ratings for each finding."
    ),
    expected_output=(
        "A vulnerability report in markdown with: finding description, "
        "severity (critical/high/medium/low), and recommended fix."
    ),
    agent=security_auditor,
    context=[code_review_task],     # Input from a previous task
    output_file="audit_report.md",  # Automatically save to file
    async_execution=False,          # Run synchronously (default)
    human_input=False               # No human-in-the-loop approval
)
```
Crews
A Crew is a team of agents working together to complete a list of tasks through a defined process.
```python
from crewai import Crew, Process

crew = Crew(
    agents=[researcher, auditor, writer],
    tasks=[research_task, audit_task, report_task],
    process=Process.sequential,  # Tasks run one after another
    verbose=True,
    memory=True,                 # Shared memory across agents
    cache=True,                  # Cache tool results
    max_rpm=10,                  # Global rate limit
    output_log_file="crew.log"   # Log all activity
)

# Execute with variable substitution
result = crew.kickoff(inputs={"target": "api-server", "scope": "authentication"})

# Access detailed results
print(result.raw)           # Final text output
print(result.tasks_output)  # Individual task outputs
print(result.token_usage)   # Total token consumption
```
Process Types
Sequential Process
Tasks execute in order. Each task's output becomes available as context to subsequent tasks.
```python
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, write_task, edit_task],
    process=Process.sequential
    # Flow: research_task -> write_task -> edit_task
)
```
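Conceptually, sequential execution threads each task's output into the context available to the next task. The following plain-Python sketch illustrates that data flow only; it is not CrewAI's implementation, and the task functions are hypothetical stand-ins:

```python
def run_sequential(tasks):
    """Run 'tasks' in order. Each task is a function that receives the
    list of prior outputs (its context) and returns its own output."""
    context = []
    for task in tasks:
        output = task(context)
        context.append(output)  # becomes context for subsequent tasks
    return context[-1]

# Hypothetical stand-ins for researcher/writer/editor agents
research = lambda ctx: "market data gathered"
write = lambda ctx: f"draft based on: {ctx[-1]}"
edit = lambda ctx: f"polished: {ctx[-1]}"

print(run_sequential([research, write, edit]))
# polished: draft based on: market data gathered
```

The real framework passes richer task objects and agent state, but the ordering guarantee is the same: a task never runs before the tasks it depends on.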
Hierarchical Process
CrewAI automatically creates a manager agent that dynamically delegates tasks, reviews results, and re-assigns work if quality is insufficient.
```python
crew = Crew(
    agents=[researcher, writer, analyst],
    tasks=[research_task, write_task, analyze_task],
    process=Process.hierarchical,
    manager_llm="gpt-4o"  # LLM powering the auto-created manager
)
# Manager decides: who does what, in what order, and validates output quality
```
Tools
Built-in Tools (50+)
```python
from crewai_tools import (
    SerperDevTool,           # Google search via Serper API
    ScrapeWebsiteTool,       # Extract content from web pages
    FileReadTool,            # Read local files
    FileWriterTool,          # Write to local files
    PDFSearchTool,           # Search within PDF documents
    CodeDocsSearchTool,      # Search code documentation
    GithubSearchTool,        # Search GitHub repositories
    YoutubeVideoSearchTool,  # Search YouTube transcripts
    DirectoryReadTool,       # List directory contents
    WebsiteSearchTool,       # RAG over a website
)

# Assign tools to an agent
researcher = Agent(
    role="Researcher",
    goal="Find accurate, up-to-date information",
    backstory="Expert researcher with access to web search and documents.",
    tools=[
        SerperDevTool(),
        ScrapeWebsiteTool(),
        PDFSearchTool(pdf="quarterly_report.pdf")
    ]
)
```
Custom Tools
```python
from crewai.tools import BaseTool
import subprocess

class GitDiffTool(BaseTool):
    name: str = "GitDiff"
    description: str = "Get the git diff for a repository. Input: path to repository."

    def _run(self, repo_path: str) -> str:
        try:
            result = subprocess.run(
                ["git", "diff", "--stat"],
                cwd=repo_path,
                capture_output=True,
                text=True,
                timeout=30
            )
            return result.stdout if result.stdout else "No changes detected."
        except subprocess.TimeoutExpired:
            return "Error: git diff timed out after 30 seconds."
        except Exception as e:
            return f"Error running git diff: {str(e)}"

# Use the custom tool
developer = Agent(
    role="Code Reviewer",
    goal="Review code changes for quality and correctness",
    backstory="Meticulous engineer who reviews diffs line by line.",
    tools=[GitDiffTool()]
)
```
YAML Configuration (Production Pattern)
Project structure
```
my_crew/
├── src/my_crew/
│   ├── config/
│   │   ├── agents.yaml      # Agent definitions
│   │   └── tasks.yaml       # Task definitions
│   ├── tools/
│   │   └── custom_tool.py   # Custom tool implementations
│   ├── crew.py              # Crew assembly with decorators
│   └── main.py              # Entry point
├── pyproject.toml
└── .env                     # API keys
```
agents.yaml
```yaml
lead_researcher:
  role: "Senior {domain} Researcher"
  goal: "Produce thorough, accurate research on {topic}"
  backstory: >
    You have a PhD in {domain} and 20 years of industry experience.
    Your research has been cited in major publications and you are
    known for identifying trends 6-12 months before they go mainstream.

technical_writer:
  role: "Technical Writer"
  goal: "Transform research into clear, actionable documentation"
  backstory: >
    You are a former senior technical writer at Google who specializes
    in making complex technical topics accessible. You write in a direct,
    no-nonsense style with concrete examples.
```
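The `{domain}` and `{topic}` placeholders are filled from the `inputs` dict passed to `crew.kickoff()`. As a rough illustration of the substitution (this mimics the behavior with plain `str.format`, and is not CrewAI's internal implementation):

```python
# YAML field values with kickoff-style placeholders
role_template = "Senior {domain} Researcher"
goal_template = "Produce thorough, accurate research on {topic}"

# What you would pass as crew.kickoff(inputs=...)
inputs = {"domain": "FinTech", "topic": "real-time payment fraud detection"}

# CrewAI-style interpolation sketched with str.format
print(role_template.format(**inputs))  # Senior FinTech Researcher
print(goal_template.format(**inputs))
```

Because every `{placeholder}` must have a matching key in `inputs`, a missing key fails at kickoff time rather than producing a half-filled prompt.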
tasks.yaml
```yaml
research_task:
  description: >
    Conduct a deep dive into {topic}. Cover market size, key players,
    technology trends, and potential disruptions for {year}. Focus on
    quantitative data wherever possible.
  expected_output: >
    A research report with: executive summary (3 sentences), 5 key findings
    with supporting data, and 3 predictions for the next 12 months.
  agent: lead_researcher

documentation_task:
  description: >
    Using the research findings, create a comprehensive guide about {topic}.
    Include code examples where applicable.
  expected_output: >
    A markdown document (~2000 words) with sections for overview,
    key findings, practical implications, and next steps.
  agent: technical_writer
  output_file: "output/{topic}_guide.md"
```
crew.py
```python
from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task
from crewai_tools import SerperDevTool, ScrapeWebsiteTool

@CrewBase
class MarketAnalysisCrew:
    """Market analysis crew with YAML-configured agents and tasks."""

    @agent
    def lead_researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['lead_researcher'],
            tools=[SerperDevTool(), ScrapeWebsiteTool()],
            verbose=True
        )

    @agent
    def technical_writer(self) -> Agent:
        return Agent(
            config=self.agents_config['technical_writer'],
            verbose=True
        )

    @task
    def research_task(self) -> Task:
        return Task(config=self.tasks_config['research_task'])

    @task
    def documentation_task(self) -> Task:
        return Task(
            config=self.tasks_config['documentation_task'],
            output_file='output/guide.md'
        )

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=True,
            memory=True
        )
```
Flows -- Event-Driven Orchestration
Flows let you build complex, conditional workflows that chain multiple crews with explicit state management and routing logic.
```python
from crewai.flow.flow import Flow, listen, start, router
from pydantic import BaseModel

class PipelineState(BaseModel):
    raw_data: str = ""
    quality_score: float = 0.0
    final_report: str = ""

class DataPipeline(Flow[PipelineState]):

    @start()
    def ingest_data(self):
        # First step: gather raw data
        result = data_collection_crew.kickoff(inputs={"source": "quarterly_reports"})
        self.state.raw_data = result.raw
        return result.raw

    @listen(ingest_data)
    def analyze_quality(self, data):
        # Second step: analyze data quality
        result = quality_analysis_crew.kickoff(inputs={"data": data})
        self.state.quality_score = float(result.raw.split("Score: ")[1][:4])
        return self.state.quality_score

    @router(analyze_quality)
    def route_by_quality(self):
        # Route based on quality score
        if self.state.quality_score >= 0.8:
            return "high_quality"
        return "needs_cleanup"

    @listen("high_quality")
    def generate_report(self):
        result = report_crew.kickoff(inputs={"data": self.state.raw_data})
        self.state.final_report = result.raw
        return result.raw

    @listen("needs_cleanup")
    def clean_and_retry(self):
        cleaned = cleanup_crew.kickoff(inputs={"data": self.state.raw_data})
        self.state.raw_data = cleaned.raw
        # Re-trigger analysis
        return self.analyze_quality(cleaned.raw)

# Execute the flow
pipeline = DataPipeline()
result = pipeline.kickoff()
print(pipeline.state.final_report)
```
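One fragile spot in the example above is `result.raw.split("Score: ")[1][:4]`, which raises an IndexError whenever the LLM phrases its output differently. A more defensive parser, sketched here as a plain-Python suggestion independent of CrewAI, would be:

```python
import re

def extract_score(raw: str) -> float:
    """Pull a 'Score: 0.87'-style value out of free-form LLM text.
    Raises ValueError (rather than IndexError) when no score is present."""
    match = re.search(r"Score:\s*([01](?:\.\d+)?)", raw)
    if match is None:
        raise ValueError(f"no quality score found in: {raw[:80]!r}")
    return float(match.group(1))

print(extract_score("Analysis complete. Score: 0.87 (confidence: high)"))  # 0.87
```

Wrapping the parse in a helper like this also gives the flow a single place to add a fallback route when the analysis crew returns unparseable output.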
Memory System
CrewAI provides three types of memory, all opt-in:
| Memory Type | Storage | Purpose |
|---|---|---|
| Short-term | ChromaDB (in-memory) | Context within current execution |
| Long-term | SQLite (persistent) | Learnings across executions |
| Entity | ChromaDB (persistent) | Facts about people, companies, concepts |
```python
crew = Crew(
    agents=[researcher],
    tasks=[research_task],
    memory=True,  # Enable all memory types
    embedder={    # Custom embedding model
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"}
    }
)

# Override storage location
import os
os.environ["CREWAI_STORAGE_DIR"] = "/path/to/persistent/storage"
```
Configuration Reference
Agent Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `role` | str | required | Agent's job title and specialty |
| `goal` | str | required | What the agent aims to achieve |
| `backstory` | str | required | Background context shaping behavior |
| `llm` | str/LLM | "gpt-4o" | LLM model to use |
| `tools` | list | [] | Available tools |
| `memory` | bool | False | Enable memory |
| `verbose` | bool | False | Log reasoning |
| `allow_delegation` | bool | False | Can delegate to other agents |
| `max_iter` | int | 15 | Max reasoning iterations |
| `max_rpm` | int | None | Rate limit (requests/min) |
| `max_retry_limit` | int | 2 | Retry failed operations |
Crew Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `agents` | list | required | Team members |
| `tasks` | list | required | Work items |
| `process` | Process | sequential | Execution strategy |
| `memory` | bool | False | Enable shared memory |
| `cache` | bool | True | Cache tool results |
| `max_rpm` | int | None | Global rate limit |
| `verbose` | bool | False | Log execution details |
| `manager_llm` | str | None | LLM for hierarchical manager |
| `output_log_file` | str | None | Path to log file |
Best Practices
- Give each agent a distinct, non-overlapping role. Vague or overlapping roles cause agents to produce redundant work. A "Senior Python Developer" and a "Code Review Specialist" are better than two generic "developers."
- Write detailed backstories. The backstory is not flavor text -- it shapes the LLM's reasoning. Include years of experience, specific skills, and personality traits that affect output quality.
- Use YAML configuration for anything beyond prototyping. YAML separates prompt engineering from code logic, making it easier to iterate on agent behavior without code changes.
- Set `max_iter` defensively. The default of 15 is fine for most tasks, but set it lower (5-8) for simple tasks to prevent unnecessary token consumption and higher (20-25) for complex research.
- Limit tools to 3-5 per agent. Too many tools confuse the LLM's tool selection. Give each agent only the tools relevant to its role.
- Enable memory for multi-execution workflows. Long-term memory lets agents learn from previous runs, improving quality over time.
- Use `context` explicitly. Always specify which prior tasks feed into a new task via the `context` parameter. Do not rely on implicit passing.
- Rate-limit aggressively. Set `max_rpm` at both agent and crew levels. API rate limits are the most common production failure.
- Test with `verbose=True` during development. Watch the reasoning process to catch prompt issues early. Disable in production.
- Use Flows for anything with conditional logic. If your workflow needs branching, retries, or state tracking, Flows are cleaner than trying to encode that logic in task descriptions.
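To make the rate-limiting advice concrete, here is a minimal sketch of what a requests-per-minute throttle does. CrewAI applies this internally when you set `max_rpm`; this standalone class is illustrative only, not the framework's implementation:

```python
import time

class RpmLimiter:
    """Simple requests-per-minute throttle: sleeps so that consecutive
    calls are spaced at least 60/max_rpm seconds apart."""

    def __init__(self, max_rpm: int):
        self.min_interval = 60.0 / max_rpm
        self._last = 0.0

    def wait(self):
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

# With max_rpm=600 the minimum spacing is 0.1 s, so the demo stays fast;
# a real LLM workload would use something like max_rpm=10.
limiter = RpmLimiter(max_rpm=600)
start = time.monotonic()
for _ in range(3):
    limiter.wait()  # would wrap each LLM/tool call in practice
print(f"3 calls took {time.monotonic() - start:.2f}s")
```

Setting `max_rpm` on both the agent and the crew gives you a per-worker cap plus a global cap, which is the combination that survives provider-side 429s best.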
Troubleshooting
Agent stuck in reasoning loop:
```python
# Reduce max_iter to force early termination
agent = Agent(role="...", goal="...", backstory="...", max_iter=8)
# Also check: is the task description too vague? Ambiguous tasks cause loops.
```
Task not receiving context from previous task:
```python
# Explicitly declare the dependency
task_b = Task(
    description="...",
    context=[task_a],  # Must be a list of Task objects
    agent=writer
)
```
Rate limit errors (429):
```python
crew = Crew(
    agents=[...],
    tasks=[...],
    max_rpm=5  # Global limit across all agents
)
# Also set per-agent: Agent(..., max_rpm=3)
```
Memory errors or storage conflicts:
```bash
# Set explicit storage directory
export CREWAI_STORAGE_DIR="./crewai_data"

# Clear stale memory
rm -rf ./crewai_data/chroma ./crewai_data/sqlite
```
Wrong LLM being used:
```python
from crewai import LLM

# Be explicit about the model
llm = LLM(model="claude-sonnet-4-5-20250929")
agent = Agent(role="...", goal="...", backstory="...", llm=llm)
# Check: OPENAI_API_KEY overrides can cause fallback to GPT
```
Comparison with Alternatives
| Feature | CrewAI | LangChain | LangGraph | AutoGen |
|---|---|---|---|---|
| Best for | Multi-agent teams | General LLM apps | Stateful workflows | Conversational agents |
| Learning curve | Low | Medium | High | Medium |
| Agent paradigm | Role-based | Tool-based | Graph-based | Chat-based |
| Multi-agent | Native | Limited | Via sub-graphs | Native |
| Memory | Built-in (3 types) | Plugin-based | Custom state | Built-in |
| Tools | 50+ built-in | 500+ integrations | Inherits LangChain | Custom |
| Config format | YAML + Python | Python | Python | Python/JSON |
| Standalone | Yes | Yes | Requires LangChain | Yes |
Resources
- GitHub: https://github.com/crewAIInc/crewAI (25k+ stars)
- Documentation: https://docs.crewai.com
- Tools: https://github.com/crewAIInc/crewAI-tools
- Examples: https://github.com/crewAIInc/crewAI-examples
- Version: 1.2.0+
- License: MIT