Opentargets Database Elite

Battle-tested skill for querying the Open Targets Platform. Includes structured workflows, validation checks, and reusable patterns for scientific work.

Skill · Cliptics · scientific · v1.0.0 · MIT

Open Targets Database Elite

Identify and prioritize therapeutic drug targets using the Open Targets Platform, which integrates genetics, genomics, transcriptomics, and chemical data. This skill covers target-disease association queries, evidence scoring, drug tractability assessment, and target prioritization for drug discovery research.

When to Use This Skill

Choose Open Targets Database Elite when you need to:

  • Find evidence linking genes/proteins to diseases for drug target identification
  • Score and rank potential drug targets by association strength and data type
  • Assess target tractability (druggability with small molecules, antibodies, or other modalities)
  • Explore known drugs and clinical candidates for specific targets or diseases

Consider alternatives when:

  • You need detailed protein structure data (use UniProt or PDB)
  • You need chemical compound screening data (use ChEMBL)
  • You need patient-level clinical data (use clinical data repositories)

Quick Start

pip install requests pandas matplotlib
import requests
import pandas as pd

BASE_URL = "https://api.platform.opentargets.org/api/v4"


def get_target_disease_associations(target_id, size=25):
    """Get diseases associated with a target gene."""
    query = """
    query targetAssociations($ensemblId: String!, $size: Int!) {
      target(ensemblId: $ensemblId) {
        id
        approvedSymbol
        associatedDiseases(page: {size: $size, index: 0}) {
          count
          rows {
            disease { id name }
            score
            datatypeScores { id score }
          }
        }
      }
    }
    """
    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {"ensemblId": target_id, "size": size}},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of failing later

    # "target" is null when the Ensembl ID is unknown to the platform
    data = (response.json().get("data") or {}).get("target")
    if data is None:
        raise ValueError(f"No target found for Ensembl ID {target_id!r}")

    rows = []
    for assoc in data["associatedDiseases"]["rows"]:
        row = {
            "target": data["approvedSymbol"],
            "disease": assoc["disease"]["name"],
            "disease_id": assoc["disease"]["id"],
            "overall_score": assoc["score"],
        }
        # One column per evidence data type
        for dt in assoc["datatypeScores"]:
            row[dt["id"]] = dt["score"]
        rows.append(row)
    return pd.DataFrame(rows)


# Get diseases associated with BRAF
associations = get_target_disease_associations("ENSG00000157764")
print(associations[["target", "disease", "overall_score"]].head(10))

Core Concepts

Data Sources and Evidence Types

| Evidence Type | Source | Description |
| --- | --- | --- |
| Genetic associations | GWAS Catalog, UK Biobank, FinnGen | Genome-wide association studies |
| Somatic mutations | COSMIC, IntOGen, Cancer Gene Census | Cancer driver mutations |
| Known drugs | ChEMBL, ClinicalTrials.gov | Drug-target-disease relationships |
| Pathways | Reactome | Pathway involvement |
| RNA expression | Expression Atlas | Differential expression in disease |
| Literature | Europe PMC | Text-mined co-occurrences |
| Animal models | PhenoDigm | Phenotype similarity |

Target Prioritization Pipeline

import requests
import pandas as pd

# BASE_URL is defined in the Quick Start snippet above.


def prioritize_targets(disease_id, min_score=0.1, size=100):
    """Prioritize drug targets for a specific disease."""
    query = """
    query diseaseTargets($efoId: String!, $size: Int!) {
      disease(efoId: $efoId) {
        id
        name
        associatedTargets(page: {size: $size, index: 0}) {
          count
          rows {
            target {
              id
              approvedSymbol
              biotype
              tractability { label modality value }
            }
            score
            datatypeScores { id score }
          }
        }
      }
    }
    """
    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {"efoId": disease_id, "size": size}},
        timeout=30,
    )
    response.raise_for_status()

    # "disease" is null when the disease ID is unknown to the platform
    data = (response.json().get("data") or {}).get("disease")
    if data is None:
        raise ValueError(f"No disease found for ID {disease_id!r}")

    targets = []
    for assoc in data["associatedTargets"]["rows"]:
        if assoc["score"] < min_score:
            continue
        target = assoc["target"]
        row = {
            "gene": target["approvedSymbol"],
            "ensembl_id": target["id"],
            "biotype": target["biotype"],
            "overall_score": assoc["score"],
        }
        # Add evidence type scores
        for dt in assoc["datatypeScores"]:
            row[f"evidence_{dt['id']}"] = dt["score"]

        # Tractability assessment (the field can be null, so fall back to [])
        tractability = target.get("tractability") or []
        row["sm_tractable"] = any(
            t["value"] for t in tractability
            if t["modality"] == "SM" and "clinical" in (t.get("label") or "").lower()
        )
        row["ab_tractable"] = any(
            t["value"] for t in tractability if t["modality"] == "AB"
        )
        targets.append(row)

    df = pd.DataFrame(targets)
    if df.empty:
        print(f"No targets above threshold for {data['name']}")
        return df
    df = df.sort_values("overall_score", ascending=False)

    # Prioritization summary
    print(f"Disease: {data['name']}")
    print(f"Total associated targets: {data['associatedTargets']['count']}")
    print(f"Targets above threshold: {len(df)}")
    print(f"Small-molecule tractable: {df['sm_tractable'].sum()}")
    print(f"Antibody tractable: {df['ab_tractable'].sum()}")
    return df


# Prioritize targets for Alzheimer's disease
# (confirm the disease ID with the platform search first; IDs can change between releases)
targets = prioritize_targets("EFO_0000249")
print(targets[["gene", "overall_score", "sm_tractable"]].head(15))

Drug Information for Targets

def get_drugs_for_target(target_id):
    """Get known drugs and clinical candidates for a target."""
    query = """
    query targetDrugs($ensemblId: String!) {
      target(ensemblId: $ensemblId) {
        approvedSymbol
        knownDrugs(size: 50) {
          count
          rows {
            drug { id name drugType maximumClinicalTrialPhase }
            disease { name }
            phase
            status
            urls { name url }
          }
        }
      }
    }
    """
    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {"ensemblId": target_id}},
        timeout=30,
    )
    response.raise_for_status()

    # "target" is null when the Ensembl ID is unknown to the platform
    data = (response.json().get("data") or {}).get("target")
    if data is None:
        raise ValueError(f"No target found for Ensembl ID {target_id!r}")

    drugs = []
    for row in data["knownDrugs"]["rows"]:
        drugs.append({
            "target": data["approvedSymbol"],
            "drug_name": row["drug"]["name"],
            "drug_type": row["drug"]["drugType"],
            "max_phase": row["drug"]["maximumClinicalTrialPhase"],
            # The disease object may be missing for some rows
            "indication": (row["disease"] or {}).get("name"),
            "trial_phase": row["phase"],
            "status": row["status"],
        })

    df = pd.DataFrame(drugs)
    print(f"Known drugs for {data['approvedSymbol']}: {len(df)}")
    return df


# Get drugs targeting EGFR
drugs = get_drugs_for_target("ENSG00000146648")
print(drugs[["drug_name", "drug_type", "max_phase", "indication"]].head(10))

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| api_url | Open Targets GraphQL endpoint | "https://api.platform.opentargets.org/api/v4/graphql" |
| page_size | Results per query page | 25 |
| min_score | Minimum association score threshold | 0.1 |
| evidence_types | Evidence types to include | All types |
| timeout | API request timeout (seconds) | 30 |
| cache | Cache API responses | true |
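
The parameters above are not tied to any particular library in this document, so one way to carry them through a script is a small settings object. The sketch below is illustrative only: the class name OpenTargetsConfig, the helper keep_association, and the validation rules are assumptions made for this example, not part of the Open Targets API. It only shows how the defaults from the table could be applied to association rows shaped like those returned by the queries in this skill.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpenTargetsConfig:
    """Hypothetical settings container mirroring the configuration table."""
    api_url: str = "https://api.platform.opentargets.org/api/v4/graphql"
    page_size: int = 25
    min_score: float = 0.1
    evidence_types: Optional[Tuple[str, ...]] = None  # None means all types
    timeout: int = 30  # seconds
    cache: bool = True

    def __post_init__(self):
        # Basic sanity checks (assumed rules, chosen for this sketch)
        if self.page_size <= 0:
            raise ValueError("page_size must be positive")
        if not 0.0 <= self.min_score <= 1.0:
            raise ValueError("min_score must be between 0 and 1")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")


def keep_association(assoc, config):
    """Return True if a row meets min_score and, when evidence_types is set,
    has a non-zero score for at least one of the requested types."""
    if assoc["score"] < config.min_score:
        return False
    if config.evidence_types is None:
        return True
    wanted = set(config.evidence_types)
    return any(
        d["id"] in wanted and d["score"] > 0 for d in assoc["datatypeScores"]
    )
```

A row is expected to carry a score and a datatypeScores list, as in the Quick Start query. The data type IDs passed in evidence_types should be taken from a real response rather than guessed.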

Best Practices

  1. Use Ensembl gene IDs for targets — Open Targets uses Ensembl IDs (e.g., ENSG00000157764 for BRAF) as primary identifiers. Convert gene symbols to Ensembl IDs using the search endpoint before querying, as the same symbol can map to different genes across species.

  2. Evaluate multiple evidence types — A high overall score driven by a single evidence type (e.g., literature mining alone) is weaker than moderate scores across genetics, expression, and known drugs. Check the datatypeScores breakdown to assess evidence breadth.

  3. Filter by tractability for actionable targets — A strongly associated target that cannot be modulated by any known drug modality has limited immediate value. Use tractability assessments to focus on targets that are druggable with small molecules, antibodies, or other approaches.

  4. Cross-reference with ChEMBL for compound data — Open Targets provides target-disease associations, but detailed compound activity data lives in ChEMBL. Use the target's ChEMBL ID to find bioactivity assays, compound structures, and potency data.

  5. Use EFO disease ontology for consistent disease queries — Diseases in Open Targets use the Experimental Factor Ontology (EFO). Search for the correct EFO ID through the platform search before building queries, as disease naming varies across sources.
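
The evidence-breadth check in practice 2 can be sketched as a small helper over the datatypeScores list. This is a rough heuristic, not an Open Targets metric: the function name evidence_breadth, the 0.05 cut-off, and the "share of summed per-type scores" measure are assumptions chosen for illustration, and the data type IDs in the usage note are examples only.

```python
def evidence_breadth(datatype_scores, min_datatype_score=0.05):
    """Summarise how many evidence types support an association.

    datatype_scores: list of {"id": str, "score": float} dicts, shaped like
    the datatypeScores field returned by the queries in this skill.
    """
    # Keep only data types whose score clears the (assumed) cut-off
    supported = {
        d["id"]: d["score"]
        for d in datatype_scores
        if d["score"] >= min_datatype_score
    }
    total = sum(supported.values())
    top_type = max(supported, key=supported.get) if supported else None
    return {
        "n_types": len(supported),
        "top_type": top_type,
        # Fraction of the summed per-type scores contributed by the top type
        "top_share": supported[top_type] / total if total else 0.0,
    }
```

For instance, evidence_breadth([{"id": "literature", "score": 0.8}]) reports one supporting type with a top_share of 1.0, which flags an association resting on a single evidence type. Several moderate scores across different types give a higher n_types and a lower top_share.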

Common Issues

GraphQL query returns null data — The most common cause is an incorrect Ensembl ID or EFO disease ID. Verify IDs using the Open Targets platform search (https://platform.opentargets.org) before building API queries. Also check that the query structure matches the current API version — the schema evolves between releases.
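
ID verification can also be scripted rather than done by hand in the browser. The sketch below assumes the GraphQL schema exposes a search field that takes queryString, entityNames, and page arguments and returns hits with id, name, and entity. That matches the schema as far as I know, but since the schema changes between releases it should be checked against the current API documentation. The function names and the injected post argument are hypothetical and exist only to keep the example testable without network access.

```python
GRAPHQL_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Assumed shape of the search query; verify against the live schema.
SEARCH_QUERY = """
query searchIds($q: String!, $entities: [String!]) {
  search(queryString: $q, entityNames: $entities, page: {index: 0, size: 5}) {
    hits { id name entity }
  }
}
"""


def unwrap(payload, root):
    """Return payload["data"][root], raising a clear error when it is null."""
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    value = (payload.get("data") or {}).get(root)
    if value is None:
        raise ValueError(f"'{root}' is null; check the ID and the query")
    return value


def search_ids(term, entity, post):
    """Return (id, name) candidates for a search term.

    entity: "target" or "disease".
    post:   an HTTP POST callable such as requests.post.
    """
    response = post(
        GRAPHQL_URL,
        json={
            "query": SEARCH_QUERY,
            "variables": {"q": term, "entities": [entity]},
        },
        timeout=30,
    )
    response.raise_for_status()
    hits = unwrap(response.json(), "search")["hits"]
    return [(h["id"], h["name"]) for h in hits if h["entity"] == entity]
```

With requests installed, search_ids("BRAF", "target", post=requests.post) would list candidate Ensembl IDs to choose from. The same unwrap helper can guard the target and disease queries against null data.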

Association scores seem low for well-known drug targets — Open Targets scores range from 0 to 1 and represent evidence strength, not biological importance. A score of 0.3 with genetic evidence can be more meaningful than a score of 0.8 from literature co-occurrence alone. Always interpret scores in the context of evidence types.

Rate limiting on bulk queries — The Open Targets API is generous but has rate limits for very large queries. When downloading associations for hundreds of targets, add time.sleep(0.5) between requests and use the batch association endpoint when available. For comprehensive dataset downloads, use the Open Targets data downloads (Parquet files) instead of the API.
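
The pause between requests can be wrapped in a small loop. This is a minimal sketch: fetch_throttled is a hypothetical helper, fetch_one stands for any per-ID function such as get_target_disease_associations from the Quick Start, and the 0.5 second default simply mirrors the time.sleep(0.5) suggestion above rather than a documented limit.

```python
import time


def fetch_throttled(ids, fetch_one, delay=0.5, sleep=time.sleep):
    """Call fetch_one(id) for each ID, pausing `delay` seconds between calls.

    Failures are collected instead of aborting the whole run, so one bad ID
    does not discard the results already downloaded.
    """
    results, errors = {}, {}
    for index, item_id in enumerate(ids):
        if index > 0:
            sleep(delay)  # no pause needed before the first request
        try:
            results[item_id] = fetch_one(item_id)
        except Exception as exc:  # record the failure and keep going
            errors[item_id] = exc
    return results, errors
```

A call such as results, errors = fetch_throttled(ensembl_ids, get_target_disease_associations) then returns one DataFrame per target plus a dictionary of the IDs that failed, which can be retried or checked against the platform search.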
