
Literature Review Complete

Enterprise-grade skill for conducting comprehensive, systematic literature reviews. Includes structured workflows, validation checks, and reusable patterns for scientific research.

Skill · Cliptics · scientific · v1.0.0 · MIT


Conduct systematic, comprehensive literature reviews following rigorous academic methodology. This skill guides you through database searching, study selection, critical appraisal, thematic synthesis, and producing professional review documents with proper citations and evidence grading.

When to Use This Skill

Choose Literature Review Complete when you need to:

  • Conduct a systematic review for a research proposal or thesis chapter
  • Survey existing work in a field before starting a new research project
  • Identify research gaps and formulate novel research questions
  • Create an evidence synthesis for clinical guidelines or policy documents

Consider alternatives when:

  • You need a quick overview of a topic (use a narrative summary approach)
  • You need to analyze specific datasets rather than published literature (use data analysis tools)
  • You need citation management only (use Zotero or Mendeley)

Quick Start

```bash
# Install literature search and management tools
pip install scholarly pyalex habanero python-dotenv
```
```python
from pyalex import Works
import pandas as pd

# Search for relevant publications
results = Works().search("single cell RNA sequencing cancer").get()

# Extract key fields
papers = []
for work in results:
    papers.append({
        "title": work["title"],
        "year": work["publication_year"],
        "doi": work["doi"],
        "citations": work["cited_by_count"],
        "abstract": work.get("abstract", ""),
        "source": work["primary_location"]["source"]["display_name"]
                  if work.get("primary_location", {}).get("source") else "Unknown"
    })

df = pd.DataFrame(papers)
df = df.sort_values("citations", ascending=False)
print(f"Found {len(df)} papers")
print(df[["title", "year", "citations"]].head(10))
```

Core Concepts

Review Methodology Steps

| Phase | Activities | Output |
| --- | --- | --- |
| 1. Scoping | Define research questions, set inclusion/exclusion criteria | Protocol document |
| 2. Searching | Query databases (PubMed, Scopus, OpenAlex, Google Scholar) | Raw results list |
| 3. Screening | Title/abstract screening, full-text review | Included studies set |
| 4. Extraction | Extract data points, methods, findings | Extraction table |
| 5. Appraisal | Assess study quality and bias risk | Quality scores |
| 6. Synthesis | Identify themes, compare findings, find gaps | Synthesis narrative |
| 7. Writing | Structure review, format citations, create tables | Final document |
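The scoping phase can be captured in code before any searching begins, so the criteria are fixed up front. The sketch below is illustrative; the field names and helper are hypothetical, not part of the skill's API:

```python
# Hypothetical review protocol written down before searching (Phase 1).
protocol = {
    "research_question": "Does CRISPR-based gene therapy improve clinical outcomes?",
    "inclusion_criteria": [
        "peer-reviewed journal article",
        "published 2020 or later",
        "reports clinical outcomes",
    ],
    "exclusion_criteria": [
        "conference abstract only",
        "non-English publication",
    ],
    "databases": ["openalex", "crossref"],
    "synthesis_plan": "thematic synthesis",
}

def passes_screening(paper, protocol):
    """Toy screening check against the protocol's year criterion."""
    return paper.get("year") is not None and paper["year"] >= 2020

print(passes_screening({"year": 2022}, protocol))  # True
print(passes_screening({"year": 2018}, protocol))  # False
```

Committing the protocol to a file (or a registry such as PROSPERO) before Phase 2 is what makes the later screening decisions auditable.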

Multi-Database Search Strategy

```python
import pandas as pd
from pyalex import Works
from habanero import Crossref

def comprehensive_search(query, max_results=200):
    """Search across multiple academic databases."""
    all_results = []
    seen_dois = set()

    # OpenAlex search
    openalex_results = (
        Works()
        .search(query)
        .filter(publication_year=">2019")
        .sort(cited_by_count="desc")
        .get(per_page=100)
    )
    for work in openalex_results:
        doi = work.get("doi", "")
        if doi and doi not in seen_dois:
            seen_dois.add(doi)
            all_results.append({
                "title": work["title"],
                "doi": doi,
                "year": work["publication_year"],
                "citations": work["cited_by_count"],
                "source": "OpenAlex"
            })

    # Crossref search
    cr = Crossref()
    cr_results = cr.works(
        query=query,
        filter={"from-pub-date": "2020"},
        sort="relevance",
        limit=100
    )
    for item in cr_results["message"]["items"]:
        doi = item.get("DOI", "")
        if doi and doi not in seen_dois:
            seen_dois.add(doi)
            all_results.append({
                "title": item.get("title", [""])[0],
                "doi": doi,
                "year": item.get("published-print", {}).get("date-parts", [[None]])[0][0],
                "citations": item.get("is-referenced-by-count", 0),
                "source": "Crossref"
            })

    return pd.DataFrame(all_results).sort_values("citations", ascending=False)

papers = comprehensive_search("CRISPR gene therapy clinical trials")
print(f"Total unique papers: {len(papers)}")
```

PRISMA Flow Tracking

```python
class PRISMATracker:
    """Track study selection following PRISMA guidelines."""

    def __init__(self):
        self.stages = {
            "identified": 0,
            "duplicates_removed": 0,
            "screened": 0,
            "excluded_screening": 0,
            "full_text_assessed": 0,
            "excluded_full_text": 0,
            "included": 0,
            "exclusion_reasons": {}
        }

    def report(self):
        """Generate PRISMA flow summary."""
        s = self.stages
        print(f"Records identified: {s['identified']}")
        print(f"After duplicate removal: {s['identified'] - s['duplicates_removed']}")
        print(f"Screened: {s['screened']}")
        print(f"Excluded at screening: {s['excluded_screening']}")
        print(f"Full-text assessed: {s['full_text_assessed']}")
        print(f"Excluded at full-text: {s['excluded_full_text']}")
        print(f"Included in review: {s['included']}")
        if s['exclusion_reasons']:
            print("\nExclusion reasons:")
            for reason, count in s['exclusion_reasons'].items():
                print(f"  - {reason}: {count}")

tracker = PRISMATracker()
tracker.stages["identified"] = 342
tracker.stages["duplicates_removed"] = 87
tracker.stages["screened"] = 255
tracker.stages["excluded_screening"] = 180
tracker.stages["full_text_assessed"] = 75
tracker.stages["excluded_full_text"] = 23
tracker.stages["included"] = 52
tracker.stages["exclusion_reasons"] = {
    "Wrong study design": 8,
    "Wrong population": 7,
    "No relevant outcomes": 5,
    "Not peer-reviewed": 3
}
tracker.report()
```

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| databases | Academic databases to search | ["openalex", "crossref"] |
| date_range | Publication year filter | "2020-present" |
| language | Publication language filter | "en" |
| study_types | Included study designs | ["all"] |
| quality_threshold | Minimum quality score for inclusion | 0.6 |
| citation_style | Output citation format | "apa7" |
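One way these parameters might be represented is a defaults dictionary with per-review overrides. This is a sketch; the skill's actual configuration mechanism may differ:

```python
# Hypothetical configuration object mirroring the defaults in the table above.
DEFAULT_CONFIG = {
    "databases": ["openalex", "crossref"],
    "date_range": "2020-present",
    "language": "en",
    "study_types": ["all"],
    "quality_threshold": 0.6,
    "citation_style": "apa7",
}

def make_config(**overrides):
    """Return a copy of the default configuration with overrides applied."""
    config = dict(DEFAULT_CONFIG)
    config.update(overrides)
    return config

cfg = make_config(quality_threshold=0.8, language="en")
print(cfg["quality_threshold"])  # 0.8
print(cfg["citation_style"])     # apa7
```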

Best Practices

  1. Register your review protocol first — Write down your research questions, search strategy, inclusion/exclusion criteria, and synthesis plan before starting the search. This prevents bias from adjusting criteria to fit results after seeing the data.

  2. Use Boolean operators systematically — Construct search strings with AND/OR/NOT and controlled vocabulary (MeSH terms for PubMed). Document the exact search string used for each database so the review is reproducible.

  3. Screen with two independent reviewers — Have at least two people independently screen titles/abstracts and resolve disagreements by discussion. For automated approaches, use dual scoring thresholds and manually review borderline cases.

  4. Extract data into a structured table — Create a standardized extraction form with columns for study design, population, intervention, outcomes, and quality indicators. Consistent extraction enables reliable cross-study comparison.

  5. Report following PRISMA guidelines — Include the PRISMA flow diagram showing records at each stage, reasons for exclusion, and the final included set. This is required by most journals and demonstrates methodological rigor.
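Practice 2 can be sketched as a small helper that assembles a Boolean search string from concept groups and records it alongside the database and date, so each query is reproducible. The function and log fields below are hypothetical:

```python
# Sketch: synonyms within a concept group are ORed together; groups are ANDed.
def build_query(concept_groups):
    """Build a Boolean search string from lists of synonymous terms."""
    groups = ["(" + " OR ".join(f'"{t}"' for t in terms) + ")"
              for terms in concept_groups]
    return " AND ".join(groups)

query = build_query([
    ["CRISPR", "Cas9", "gene editing"],
    ["gene therapy"],
    ["clinical trial", "randomized controlled trial"],
])
print(query)
# ("CRISPR" OR "Cas9" OR "gene editing") AND ("gene therapy") AND ("clinical trial" OR "randomized controlled trial")

# Log the exact string used for each database (illustrative record shape).
search_log = {"database": "pubmed", "date": "2024-05-01", "query": query, "hits": None}
```

For PubMed specifically, the quoted phrases would typically be replaced or supplemented with MeSH terms, which this plain-string sketch does not attempt.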

Common Issues

Too many or too few search results — Overly broad queries return thousands of irrelevant papers; overly narrow ones miss important studies. Start with a focused query, review the first 50 results for relevance, then iteratively broaden or narrow terms. Aim for 200-500 initial results for a manageable systematic review.

Duplicate detection across databases — The same paper appears in multiple databases with slightly different metadata. Match on DOI first (most reliable), then fall back to fuzzy title matching with a similarity threshold of 0.9. Exact title matching misses papers with special characters or formatting differences.
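The DOI-first, fuzzy-title-fallback matching described above can be sketched with the standard library's difflib; the helper name is illustrative:

```python
from difflib import SequenceMatcher

def is_duplicate(a, b, title_threshold=0.9):
    """Match on DOI first (most reliable), then fall back to fuzzy titles."""
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()
    ratio = SequenceMatcher(
        None, a["title"].lower().strip(), b["title"].lower().strip()
    ).ratio()
    return ratio >= title_threshold

p1 = {"doi": "10.1000/xyz123", "title": "CRISPR Gene Therapy: A Review"}
p2 = {"doi": "", "title": "CRISPR gene therapy - a review"}
print(is_duplicate(p1, p2))  # True: titles differ only in punctuation/case
```

Lowercasing both titles before comparison is what lets punctuation and capitalization differences fall below the 0.9 threshold rather than defeating an exact match.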

Citation format inconsistencies — Different databases return citations in different formats. Normalize all references to a single format (BibTeX or CSL-JSON) immediately after retrieval. Use tools like citeproc or Zotero's API to convert between citation styles consistently.
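As a sketch of normalizing immediately after retrieval, the helper below maps a raw Crossref item into a minimal CSL-JSON-style record; the field selection is illustrative, not exhaustive:

```python
# Sketch: convert a Crossref works item to a minimal CSL-JSON-style dict.
def normalize_crossref(item):
    """Pick out the fields a citation processor needs, with safe defaults."""
    date_parts = item.get("published-print", {}).get("date-parts", [[None]])
    return {
        "id": item.get("DOI", ""),
        "type": "article-journal",
        "title": (item.get("title") or [""])[0],
        "author": [
            {"family": a.get("family", ""), "given": a.get("given", "")}
            for a in item.get("author", [])
        ],
        "issued": {"date-parts": date_parts},
        "container-title": (item.get("container-title") or [""])[0],
        "DOI": item.get("DOI", ""),
    }

raw = {
    "DOI": "10.1000/example",
    "title": ["An Example Paper"],
    "author": [{"family": "Doe", "given": "Jane"}],
    "published-print": {"date-parts": [[2021]]},
    "container-title": ["Journal of Examples"],
}
rec = normalize_crossref(raw)
print(rec["title"], rec["issued"]["date-parts"][0][0])  # An Example Paper 2021
```

A CSL processor or Zotero can then render these uniform records in any citation style, including the apa7 default above.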
