
Pro ClinVar Workspace

Production-ready skill for querying variant data in NCBI ClinVar. Includes structured workflows, validation checks, and reusable patterns for scientific computing.

Skill · Cliptics · scientific · v1.0.0 · MIT

Pro ClinVar Workspace

A scientific computing skill for querying ClinVar — NCBI's public archive of human genetic variants and their clinical significance. Pro ClinVar Workspace helps you search for variant-disease associations, retrieve pathogenicity classifications, and integrate clinical variant data into genomics research workflows.

When to Use This Skill

Choose Pro ClinVar Workspace when:

  • Looking up the clinical significance of human genetic variants
  • Searching for pathogenic variants associated with specific diseases
  • Retrieving variant classifications from multiple submitting labs
  • Building variant interpretation pipelines with ClinVar annotations

Consider alternatives when:

  • You need population allele frequencies (use gnomAD)
  • You need cancer-specific somatic variants (use COSMIC)
  • You need pharmacogenomic variants (use ClinPGx/PharmVar)
  • You need protein structure impact predictions (use AlphaFold + variant tools)

Quick Start

    claude "Search ClinVar for pathogenic BRCA1 variants"

With Biopython:

    from Bio import Entrez

    # NCBI requires a contact address; replace with your own
    Entrez.email = "your.email@example.com"

    # Search ClinVar for BRCA1 pathogenic variants
    handle = Entrez.esearch(
        db="clinvar",
        term="BRCA1[gene] AND pathogenic[clinical significance]",
        retmax=10
    )
    results = Entrez.read(handle)
    handle.close()
    print(f"Total pathogenic BRCA1 variants: {results['Count']}")

    # Fetch variant details (VCV XML) for each returned ID
    for var_id in results["IdList"]:
        handle = Entrez.efetch(db="clinvar", id=var_id, rettype="vcv",
                               is_variationid="true", from_esearch="true")
        record = handle.read()  # raw XML for the variant record
        handle.close()
        print(f"Variant ID: {var_id}")

Using the E-utilities directly with requests:

    # Using ClinVar API directly
    import requests

    # Search via NCBI E-utilities
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    params = {
        "db": "clinvar",
        "term": "BRCA1[gene] AND pathogenic[clinsig]",
        "retmode": "json",
        "retmax": 20
    }
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail early on HTTP errors
    data = response.json()
    ids = data["esearchresult"]["idlist"]
    # idlist is capped at retmax; "count" holds the total number of matches
    total = data["esearchresult"]["count"]
    print(f"Found {total} variants, retrieved {len(ids)} IDs")

Core Concepts

ClinVar Classification System

| Classification | Meaning | Action |
|---|---|---|
| Pathogenic | Causes disease | Report, clinical action |
| Likely Pathogenic | Probably causes disease | Report, consider clinical action |
| Uncertain Significance (VUS) | Insufficient evidence | Report, no clinical action |
| Likely Benign | Probably harmless | May omit from report |
| Benign | Harmless | Omit from report |
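
The table can be read as a lookup from classification to reporting behaviour. The following is a minimal sketch, assuming the classification arrives as a free-text string; `REPORTING_ACTION` and `reporting_action` are hypothetical names for illustration and are not part of ClinVar or of this skill.

```python
# Hypothetical helper mirroring the classification table above.
REPORTING_ACTION = {
    "pathogenic": "Report, clinical action",
    "likely pathogenic": "Report, consider clinical action",
    "uncertain significance": "Report, no clinical action",
    "likely benign": "May omit from report",
    "benign": "Omit from report",
}


def reporting_action(classification: str) -> str:
    """Return the suggested reporting action for a classification string."""
    key = classification.strip().lower()
    # Anything outside the five tiers (for example a conflicting
    # classification) is routed to manual review rather than guessed at.
    return REPORTING_ACTION.get(key, "Manual review")


print(reporting_action("Likely Pathogenic"))
```

Falling back to manual review for unrecognised labels is a design choice of this sketch; a stricter pipeline might raise an error instead.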

Variant Identifiers

    # ClinVar uses multiple identifier systems
    # (illustrative values; not guaranteed to describe a single variant)
    variant_ids = {
        "variation_id": 12345,                        # ClinVar Variation ID
        "rcv": "RCV000012345",                        # Reference ClinVar accession (variant-condition record)
        "scv": "SCV000012345",                        # Submitted record accession (one submitter's assertion)
        "rs_id": "rs28897696",                        # dbSNP rsID
        "hgvs_c": "NM_007294.4:c.5266dupC",           # Coding DNA HGVS
        "hgvs_p": "NP_009225.1:p.Gln1756ProfsTer74",  # Protein HGVS
    }
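
Because a query string may arrive in any of these forms, it can help to detect which identifier system it belongs to before building a search. This is a rough sketch using deliberately simple regular expressions; the patterns and the `identifier_type` helper are assumptions made for illustration, and real identifiers show more variety than they cover.

```python
import re

# Hypothetical, simplified patterns keyed by the identifier names used above.
ID_PATTERNS = [
    ("rcv", re.compile(r"^RCV\d+(\.\d+)?$")),
    ("scv", re.compile(r"^SCV\d+(\.\d+)?$")),
    ("rs_id", re.compile(r"^rs\d+$")),
    ("hgvs_c", re.compile(r"^NM_\d+\.\d+:c\..+$")),
    ("hgvs_p", re.compile(r"^NP_\d+\.\d+:p\..+$")),
    ("variation_id", re.compile(r"^\d+$")),
]


def identifier_type(value: str) -> str:
    """Return the name of the first pattern that matches, or 'unknown'."""
    text = value.strip()
    for name, pattern in ID_PATTERNS:
        if pattern.match(text):
            return name
    return "unknown"


for example in ["RCV000012345", "rs28897696", "NM_007294.4:c.5266dupC"]:
    print(example, "->", identifier_type(example))
```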

Batch Variant Lookup

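    from Bio import Entrez

    Entrez.email = "your.email@example.com"  # required by NCBI; use your own address

    def batch_clinvar_lookup(variant_list):
        """Look up multiple variants in ClinVar and summarize the first hit for each."""
        results = []
        for variant in variant_list:
            # Find ClinVar records whose name matches the variant description
            handle = Entrez.esearch(db="clinvar", term=f"{variant}[variant name]")
            search = Entrez.read(handle)
            handle.close()
            if not search["IdList"]:
                continue  # no match; this variant is left out of the results

            # Summarize only the first matching record
            handle = Entrez.esummary(db="clinvar", id=search["IdList"][0])
            summary = Entrez.read(handle)
            handle.close()
            doc = summary["DocumentSummarySet"]["DocumentSummary"][0]

            # Newer ClinVar summaries may report "germline_classification";
            # fall back to the older "clinical_significance" key
            classification = doc.get("germline_classification") or doc.get("clinical_significance", {})
            genes = doc.get("genes") or [{}]  # guard against an empty gene list
            traits = classification.get("trait_set") or doc.get("trait_set", [])
            results.append({
                "variant": variant,
                "clinical_significance": classification.get("description", "N/A"),
                "gene": genes[0].get("symbol", "N/A"),
                "conditions": [t.get("trait_name", "") for t in traits],
            })
        return results

    variants = ["NM_007294.4:c.5266dupC", "NM_000546.5:c.215C>G"]
    results = batch_clinvar_lookup(variants)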
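
The lookup above issues two requests per variant. When Variation IDs are already known, summaries can instead be requested in groups, which reduces the number of calls. The sketch below covers only the grouping step; `chunked` is a hypothetical helper, and the batch size of 200 is an arbitrary illustrative choice rather than a documented limit.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Placeholder IDs for demonstration only; these are not real ClinVar records.
variation_ids = [str(n) for n in range(1, 451)]
batches = list(chunked(variation_ids))
print([len(batch) for batch in batches])

# Each batch could then be passed to a single summary request, e.g.
# Entrez.esummary(db="clinvar", id=",".join(batch))
```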

Configuration

| Parameter | Description | Default |
|---|---|---|
| Entrez.email | Required email for NCBI API | Required |
| Entrez.api_key | NCBI API key for higher limits | None |
| significance_filter | Filter by clinical significance | None (all) |
| review_status_min | Minimum review star rating | None |
| assembly | Reference genome (GRCh37 or GRCh38) | GRCh38 |
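
One way to keep these settings together and catch bad values early is a small validated container. This is a sketch under assumptions: `ClinVarConfig` and `min_request_interval` are hypothetical names, not part of Biopython or this skill. The request intervals reflect NCBI's published E-utilities limits of 3 requests per second without an API key and 10 with one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClinVarConfig:
    """Hypothetical container for the parameters in the table above."""
    email: str
    api_key: Optional[str] = None
    significance_filter: Optional[str] = None
    review_status_min: Optional[int] = None
    assembly: str = "GRCh38"

    def __post_init__(self) -> None:
        if "@" not in self.email:
            raise ValueError("a contact email is required for NCBI requests")
        if self.assembly not in ("GRCh37", "GRCh38"):
            raise ValueError("assembly must be GRCh37 or GRCh38")
        if self.review_status_min is not None and not 0 <= self.review_status_min <= 4:
            raise ValueError("review_status_min must be between 0 and 4")

    def min_request_interval(self) -> float:
        """Seconds to wait between requests to stay within NCBI rate limits."""
        return 0.1 if self.api_key else 1 / 3


config = ClinVarConfig(email="your.email@example.com", review_status_min=2)
print(config.assembly, round(config.min_request_interval(), 3))
```

With Biopython, the validated values would then be assigned to `Entrez.email` and `Entrez.api_key` before any request is made.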

Best Practices

  1. Filter by review status for clinical use. ClinVar assigns star ratings (0-4) based on the level of review. For clinical reporting, use variants with ≥ 2 stars (criteria provided, multiple submitters, no conflicts). Single-submitter classifications (1 star) have not been corroborated by another lab.

  2. Check for conflicting interpretations. Different labs may classify the same variant differently. ClinVar flags these conflicts. Always review the individual submitter classifications (SCV records) when a variant has conflicting interpretations.

  3. Use HGVS nomenclature for precise variant identification. rsIDs can be ambiguous (one rsID may map to multiple variants). HGVS notation (e.g., NM_007294.4:c.5266dupC) is unambiguous and includes the reference transcript version.

  4. Cross-reference with population databases. A variant classified as pathogenic in ClinVar should be rare in the general population. Cross-check allele frequency in gnomAD — if a "pathogenic" variant has >1% frequency, the classification may warrant review.

  5. Subscribe to variant reclassification updates. ClinVar classifications change as new evidence emerges. For clinically important variants, set up alerts for reclassifications, especially for VUS that may be upgraded or downgraded.
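
Practices 1, 2, and 4 lend themselves to an automated triage pass. The following is a minimal sketch, not a validated clinical rule set. The record keys (`classification`, `review_status`, `allele_frequency`) and both function names are hypothetical, the allele frequency is assumed to have been looked up separately in gnomAD, and the star values follow ClinVar's review-status labels as commonly documented, so they should be checked against the current ClinVar documentation.

```python
from typing import Dict, List

# Review-status labels mapped to stars (assumed; verify against ClinVar docs).
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, conflicting classifications": 1,
    "criteria provided, conflicting interpretations": 1,  # older wording
    "criteria provided, single submitter": 1,
    "no assertion criteria provided": 0,
    "no classification provided": 0,
}

PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic", "pathogenic/likely pathogenic"}


def review_stars(review_status: str) -> int:
    """Translate a review-status label into stars; unknown labels count as 0."""
    return REVIEW_STARS.get(review_status.strip().lower(), 0)


def triage_flags(record: Dict, min_stars: int = 2, max_pathogenic_af: float = 0.01) -> List[str]:
    """Return reasons a record deserves a closer look before reporting."""
    flags = []
    classification = record.get("classification", "").strip().lower()
    if review_stars(record.get("review_status", "")) < min_stars:
        flags.append("below review threshold")
    if "conflicting" in classification:
        flags.append("conflicting classifications")
    af = record.get("allele_frequency")
    if classification in PATHOGENIC_LABELS and af is not None and af > max_pathogenic_af:
        flags.append("pathogenic but common")
    return flags


# Made-up record for demonstration only.
example = {
    "classification": "Pathogenic",
    "review_status": "criteria provided, single submitter",
    "allele_frequency": 0.03,
}
print(triage_flags(example))
```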

Common Issues

Search returns no results for a known variant. The variant may be described using different nomenclature. Try searching by rsID, HGVS notation, gene name + genomic position, or the protein change. ClinVar indexes variants in multiple ways.

Variant shows "conflicting interpretations." This means different submitting labs disagree on clinical significance. Review each submitter's evidence (SCV records), check their submission dates, and consider the most recent expert panel review if available.
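
Reviewing a conflict usually starts with a tally of what each submitter said. This sketch assumes the SCV records have already been parsed into dictionaries; the keys `scv`, `classification`, and `last_evaluated` and the helper name are hypothetical. It treats any disagreement as a conflict, which is stricter than ClinVar's own aggregation (for example, Pathogenic together with Likely pathogenic is not reported as a conflict there).

```python
from collections import Counter
from typing import Dict, List


def summarize_submissions(submissions: List[Dict[str, str]]) -> Dict:
    """Count classifications across submitters and find the latest evaluation."""
    if not submissions:
        raise ValueError("at least one submission is required")
    counts = Counter(s["classification"] for s in submissions)
    # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings
    latest = max(submissions, key=lambda s: s["last_evaluated"])
    return {
        "counts": dict(counts),
        "conflicting": len(counts) > 1,
        "most_recent": latest["scv"],
    }


# Made-up submissions for demonstration only.
submissions = [
    {"scv": "SCV000000001", "classification": "Pathogenic", "last_evaluated": "2019-05-01"},
    {"scv": "SCV000000002", "classification": "Uncertain significance", "last_evaluated": "2016-11-20"},
    {"scv": "SCV000000003", "classification": "Pathogenic", "last_evaluated": "2022-02-14"},
]
print(summarize_submissions(submissions))
```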

Coordinate mapping errors between genome builds. ClinVar reports positions on both GRCh37 and GRCh38. Ensure your query coordinates match the expected assembly. Use LiftOver tools to convert between assemblies if needed.
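
A simple guard against mixing builds is to select coordinates explicitly by assembly name and fail loudly when the requested build is absent. The sketch below assumes each location is a dictionary with `assembly_name`, `chr`, `start`, and `stop` keys, loosely modelled on the location entries in ClinVar summary records. That shape, the helper name, and the coordinates are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def location_for_assembly(locations: List[Dict[str, str]], assembly: str = "GRCh38") -> Tuple[str, int, int]:
    """Return (chromosome, start, stop) for the requested assembly."""
    for loc in locations:
        if loc.get("assembly_name") == assembly:
            return loc["chr"], int(loc["start"]), int(loc["stop"])
    available = sorted({loc.get("assembly_name", "?") for loc in locations})
    raise LookupError(f"no coordinates for {assembly}; available: {available}")


# Placeholder coordinates, not taken from any real record.
locations = [
    {"assembly_name": "GRCh38", "chr": "17", "start": "1000", "stop": "1001"},
    {"assembly_name": "GRCh37", "chr": "17", "start": "2000", "stop": "2001"},
]
print(location_for_assembly(locations, "GRCh37"))
```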
