COSMIC Database Toolkit
A scientific computing skill for querying COSMIC (Catalogue of Somatic Mutations in Cancer) — the world's largest database of somatic mutations in human cancer, maintained by the Wellcome Sanger Institute. COSMIC Database Toolkit helps you search for cancer-associated mutations, retrieve mutation frequency data, and analyze mutational signatures across cancer types.
When to Use This Skill
Choose COSMIC Database Toolkit when:
- Searching for somatic mutations associated with specific cancer types
- Looking up mutation frequencies for known cancer genes (TP53, KRAS, EGFR)
- Retrieving cancer gene census data for oncogenes and tumor suppressors
- Analyzing mutational patterns and signatures in specific tumor types
Consider alternatives when:
- You need germline variant data (use ClinVar or gnomAD)
- You need clinical trial data for cancer drugs (use ClinicalTrials.gov)
- You need gene expression in cancer (use TCGA via GDC)
- You need drug sensitivity data (use GDSC or DepMap)
Quick Start
claude "Find the most common TP53 mutations in lung cancer from COSMIC"
    import requests

    # COSMIC API requires authentication.
    # Register at cancer.sanger.ac.uk for API access.
    # Endpoint and field names are as given in this guide; confirm them
    # against the current COSMIC documentation before relying on them.
    headers = {"Authorization": "Bearer YOUR_API_TOKEN"}

    # Search for TP53 mutations in lung cancer
    url = "https://cancer.sanger.ac.uk/cosmic/api/v1/mutations"
    params = {
        "gene": "TP53",
        "tumour_site": "lung",
        "sort": "count:desc",
        "limit": 10,
    }

    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()  # fail early on auth or rate-limit errors
    mutations = response.json()

    for mut in mutations:
        print(f"Mutation: {mut['mutation_cds']}")
        print(f"  Protein: {mut['mutation_aa']}")
        print(f"  Count: {mut['count']} samples")
        print(f"  Type: {mut['mutation_description']}")
Core Concepts
COSMIC Data Categories
| Category | Description | Example |
|---|---|---|
| Cancer Gene Census | Curated list of cancer genes | TP53, KRAS, BRCA1 |
| Somatic Mutations | Point mutations, indels | TP53 R248W |
| Copy Number | Amplifications, deletions | ERBB2 amplification |
| Gene Fusions | Translocation-derived fusions | BCR-ABL1 |
| Mutational Signatures | Patterns of base changes | SBS1 (clock-like), SBS4 (smoking) |
| Drug Resistance | Mutations conferring resistance | EGFR T790M |
Cancer Gene Census
    import requests

    # The Cancer Gene Census: curated cancer genes
    def get_cancer_gene_census(headers):
        """Retrieve the COSMIC Cancer Gene Census."""
        url = "https://cancer.sanger.ac.uk/cosmic/api/v1/cancer-gene-census"
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        census = response.json()

        # "role" may be missing or null, so fall back to an empty string
        oncogenes = [g for g in census if "oncogene" in (g.get("role") or "").lower()]
        tsgs = [g for g in census if "tsg" in (g.get("role") or "").lower()]

        print(f"Total cancer genes: {len(census)}")
        print(f"Oncogenes: {len(oncogenes)}")
        print(f"Tumor suppressors: {len(tsgs)}")
        return census

    # Look up a specific gene
    def gene_mutation_profile(gene, headers):
        """Get the mutation profile for a cancer gene."""
        url = f"https://cancer.sanger.ac.uk/cosmic/api/v1/gene/{gene}"
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        return response.json()
Mutational Signatures
    # COSMIC Mutational Signatures (SBS, DBS, ID)
    signatures = {
        "SBS1": "Spontaneous deamination (age-related, clock-like)",
        "SBS2": "APOBEC activity",
        "SBS4": "Tobacco smoking",
        "SBS6": "Defective DNA mismatch repair (MSI)",
        "SBS7a": "UV light exposure (melanoma)",
        "SBS10a": "POLE proofreading deficiency",
        "SBS13": "APOBEC activity (alternative)",
        "SBS22": "Aristolochic acid exposure",
    }

    # Dominant signatures by cancer type
    cancer_signatures = {
        "melanoma": ["SBS7a", "SBS7b"],
        "lung_squamous": ["SBS4", "SBS2"],
        "colorectal_MSI": ["SBS6", "SBS15"],
        "breast": ["SBS1", "SBS2", "SBS13"],
    }
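The two mappings above can be joined to answer "what processes are thought to drive this cancer type?". The following is a minimal sketch, not part of any COSMIC client: `describe_signatures` is a hypothetical helper, and the small dictionaries are a trimmed copy of the ones above. Some signatures listed per cancer type (for example SBS7b) have no entry in the etiology mapping, so the lookup falls back to a placeholder instead of raising.

```python
# Trimmed copies of the mappings above, for illustration only.
SIGNATURE_ETIOLOGY = {
    "SBS1": "Spontaneous deamination (age-related, clock-like)",
    "SBS2": "APOBEC activity",
    "SBS4": "Tobacco smoking",
    "SBS7a": "UV light exposure (melanoma)",
    "SBS13": "APOBEC activity (alternative)",
}

CANCER_SIGNATURES = {
    "melanoma": ["SBS7a", "SBS7b"],
    "lung_squamous": ["SBS4", "SBS2"],
    "breast": ["SBS1", "SBS2", "SBS13"],
}


def describe_signatures(cancer_type, cancer_map=CANCER_SIGNATURES,
                        etiology=SIGNATURE_ETIOLOGY):
    """Return (signature, etiology) pairs for a cancer type.

    Unknown cancer types give an empty list; signatures with no
    etiology entry are labelled "not annotated here".
    """
    sigs = cancer_map.get(cancer_type, [])
    return [(s, etiology.get(s, "not annotated here")) for s in sigs]


for sig, cause in describe_signatures("lung_squamous"):
    print(f"{sig}: {cause}")
```

Passing the mappings as arguments keeps the helper usable with a fuller signature table if one is loaded from a COSMIC download.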
Configuration
| Parameter | Description | Default |
|---|---|---|
| api_token | COSMIC API authentication token | Required |
| genome_build | GRCh37 or GRCh38 | GRCh38 |
| result_limit | Max results per query | 100 |
| tumour_site | Filter by cancer type | None (all) |
| mutation_type | SNV, insertion, deletion, complex | None (all) |
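One way to carry these settings around in code is a small validated container. The sketch below is an assumption about how such a configuration could be modelled: `CosmicConfig` is a hypothetical class, not something shipped by COSMIC. It only mirrors the parameter names, allowed values, and defaults from the table above.

```python
from dataclasses import dataclass
from typing import Optional

VALID_BUILDS = {"GRCh37", "GRCh38"}
VALID_MUTATION_TYPES = {"SNV", "insertion", "deletion", "complex"}


@dataclass(frozen=True)
class CosmicConfig:
    """Hypothetical settings holder mirroring the configuration table."""
    api_token: str                       # required
    genome_build: str = "GRCh38"
    result_limit: int = 100
    tumour_site: Optional[str] = None    # None means all sites
    mutation_type: Optional[str] = None  # None means all types

    def __post_init__(self):
        # Validate once at construction so bad values fail early.
        if not self.api_token:
            raise ValueError("api_token is required")
        if self.genome_build not in VALID_BUILDS:
            raise ValueError(f"genome_build must be one of {sorted(VALID_BUILDS)}")
        if self.result_limit < 1:
            raise ValueError("result_limit must be at least 1")
        if (self.mutation_type is not None
                and self.mutation_type not in VALID_MUTATION_TYPES):
            raise ValueError(
                f"mutation_type must be one of {sorted(VALID_MUTATION_TYPES)}"
            )


config = CosmicConfig(api_token="YOUR_API_TOKEN", tumour_site="lung")
print(config.genome_build, config.result_limit)
```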
Best Practices
- Use the Cancer Gene Census as your starting gene list. The CGC is expertly curated, so start with known cancer genes rather than searching the full COSMIC database. It distinguishes oncogenes from tumor suppressors, which guides interpretation.
- Filter by primary tissue type. A mutation's significance varies by cancer type. KRAS G12D is common in pancreatic cancer but rare in melanoma. Always interpret mutation frequencies in the context of the specific cancer type under investigation.
- Check sample count, not just mutation presence. COSMIC reports how many samples carry each mutation. A mutation found in 1 sample out of 50,000 screened is very different from one found in 1,000 of them. Use frequency data to prioritize significant mutations.
- Combine COSMIC with functional databases. COSMIC catalogues mutations but doesn't always assess their functional impact. Cross-reference top mutations with functional data from DepMap (essentiality), OncoKB (clinical actionability), or ClinVar (pathogenicity).
- Account for detection bias. Highly studied genes (TP53, KRAS) have more data than rarely studied genes. High mutation counts may reflect high screening frequency rather than biological importance. Consider the number of samples screened for each gene.
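The points about sample counts and detection bias come down to dividing mutated samples by screened samples before comparing mutations. A minimal sketch follows; the record keys (`mutation`, `samples_mutated`, `samples_screened`) and the numbers are made up for illustration and are not COSMIC field names or real counts.

```python
def mutation_frequency(samples_mutated, samples_screened):
    """Fraction of screened samples that carry the mutation."""
    if samples_screened <= 0:
        raise ValueError("samples_screened must be positive")
    if not 0 <= samples_mutated <= samples_screened:
        raise ValueError("samples_mutated must be between 0 and samples_screened")
    return samples_mutated / samples_screened


def rank_by_frequency(records):
    """Sort records by frequency (highest first) rather than raw count."""
    return sorted(
        records,
        key=lambda r: mutation_frequency(r["samples_mutated"], r["samples_screened"]),
        reverse=True,
    )


# Illustrative numbers only.
records = [
    {"mutation": "A", "samples_mutated": 1000, "samples_screened": 50000},
    {"mutation": "B", "samples_mutated": 1, "samples_screened": 50000},
    {"mutation": "C", "samples_mutated": 30, "samples_screened": 200},
]

for rec in rank_by_frequency(records):
    freq = mutation_frequency(rec["samples_mutated"], rec["samples_screened"])
    print(f"{rec['mutation']}: {freq:.5f}")
```

In this toy data, mutation C ranks first despite having far fewer mutated samples than A, because far fewer samples were screened for it. That is the effect raw counts hide.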
Common Issues
API access denied or rate limited. COSMIC requires registration for API access. Free academic accounts have rate limits — add delays between requests. For large-scale data needs, download the COSMIC data files directly rather than using the API.
Mutation nomenclature doesn't match between databases. COSMIC uses its own mutation IDs (COSV/COSM) alongside HGVS nomenclature. When cross-referencing with ClinVar or gnomAD, map via genomic coordinates (chromosome, position, ref, alt) rather than mutation names.
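Coordinate-based matching can be sketched as building a normalized (chromosome, position, ref, alt) key for each record and joining on it. The record keys and identifiers below are placeholders, not real COSMIC or ClinVar fields. The sketch also assumes both sources use the same genome build and the same indel representation, which it does not check.

```python
from typing import NamedTuple


class VariantKey(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def make_key(chrom, pos, ref, alt):
    """Normalize a variant to a comparable key ("chr7" and "7" match)."""
    chrom = str(chrom).strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return VariantKey(chrom.upper(), int(pos), ref.upper(), alt.upper())


def match_by_coordinates(cosmic_records, other_records):
    """Pair records from two sources that share the same variant key."""
    index = {
        make_key(r["chrom"], r["pos"], r["ref"], r["alt"]): r
        for r in other_records
    }
    pairs = []
    for rec in cosmic_records:
        key = make_key(rec["chrom"], rec["pos"], rec["ref"], rec["alt"])
        if key in index:
            pairs.append((rec, index[key]))
    return pairs


# Placeholder records with made-up coordinates and IDs.
cosmic = [
    {"id": "COSMIC_PLACEHOLDER_1", "chrom": "chr7", "pos": "12345", "ref": "g", "alt": "a"},
    {"id": "COSMIC_PLACEHOLDER_2", "chrom": "12", "pos": 999, "ref": "C", "alt": "T"},
]
other = [
    {"id": "OTHER_PLACEHOLDER_1", "chrom": "7", "pos": 12345, "ref": "G", "alt": "A"},
]

for c, o in match_by_coordinates(cosmic, other):
    print(c["id"], "<->", o["id"])
```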
Different cancer type classifications between studies. COSMIC uses specific primary site and histology classifications that may not match other databases' terminology. Lung cancer in COSMIC splits into multiple subtypes — search by primary site first, then refine by histology.
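The "primary site first, then histology" approach can be sketched as a two-stage filter over sample records. The keys `primary_site` and `histology` and the sample values are illustrative assumptions; check the column names in the COSMIC data you actually download.

```python
def filter_samples(samples, primary_site, histology=None):
    """Keep samples from one primary site, optionally narrowed by histology.

    The site match is exact (case-insensitive); the histology match is a
    case-insensitive substring test so "squamous" matches longer labels.
    """
    site = primary_site.strip().lower()
    hits = [s for s in samples if (s.get("primary_site") or "").lower() == site]
    if histology is not None:
        needle = histology.strip().lower()
        hits = [s for s in hits if needle in (s.get("histology") or "").lower()]
    return hits


# Made-up sample records for illustration.
samples = [
    {"sample": "s1", "primary_site": "lung", "histology": "adenocarcinoma"},
    {"sample": "s2", "primary_site": "Lung", "histology": "squamous cell carcinoma"},
    {"sample": "s3", "primary_site": "skin", "histology": "malignant melanoma"},
]

lung = filter_samples(samples, "lung")
lung_squamous = filter_samples(samples, "lung", histology="squamous")
print(len(lung), [s["sample"] for s in lung_squamous])
```

Filtering by site before histology keeps the first pass broad, so unexpected histology labels show up in the intermediate result instead of being silently dropped.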