Open Targets Database Elite

Identify and prioritize therapeutic drug targets using the Open Targets Platform, which integrates genetics, genomics, transcriptomics, and chemical data. This skill covers target-disease association queries, evidence scoring, drug tractability assessment, and target prioritization for drug discovery research.

When to Use This Skill

Choose Open Targets Database Elite when you need to:

Find evidence linking genes/proteins to diseases for drug target identification
Score and rank potential drug targets by association strength and data type
Assess target tractability (druggability with small molecules, antibodies, or other modalities)
Explore known drugs and clinical candidates for specific targets or diseases

Consider alternatives when:

You need detailed protein structure data (use UniProt or PDB)
You need chemical compound screening data (use ChEMBL)
You need patient-level clinical data (use clinical data repositories)

Quick Start


pip install requests pandas


import requests
import pandas as pd

BASE_URL = "https://api.platform.opentargets.org/api/v4"

def get_target_disease_associations(target_id, size=25):
    """Get diseases associated with a target gene."""
    query = """
    query targetAssociations($ensemblId: String!, $size: Int!) {
      target(ensemblId: $ensemblId) {
        id
        approvedSymbol
        associatedDiseases(page: {size: $size, index: 0}) {
          count
          rows {
            disease { id name }
            score
            datatypeScores { id score }
          }
        }
      }
    }
    """
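    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {
            "ensemblId": target_id, "size": size
        }},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of failing on parsing
    data = (response.json().get("data") or {}).get("target")
    if data is None:
        # The API returns null for an unknown or malformed Ensembl ID
        raise ValueError(f"No target found for {target_id}")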

    rows = []
    for assoc in data["associatedDiseases"]["rows"]:
        row = {
            "target": data["approvedSymbol"],
            "disease": assoc["disease"]["name"],
            "disease_id": assoc["disease"]["id"],
            "overall_score": assoc["score"],
        }
        for dt in assoc["datatypeScores"]:
            row[dt["id"]] = dt["score"]
        rows.append(row)

    return pd.DataFrame(rows)

# Get diseases associated with BRAF
associations = get_target_disease_associations("ENSG00000157764")
print(associations[["target", "disease", "overall_score"]].head(10))

Core Concepts

Data Sources and Evidence Types

Evidence Type	Source	Description
Genetic associations	GWAS Catalog, UK Biobank, FinnGen	Genome-wide association studies
Somatic mutations	COSMIC, IntOGen, Cancer Gene Census	Cancer driver mutations
Known drugs	ChEMBL, ClinicalTrials.gov	Drug-target-disease relationships
Pathways	Reactome	Pathway involvement
RNA expression	Expression Atlas	Differential expression in disease
Literature	Europe PMC	Text-mined co-occurrences
Animal models	PhenoDigm	Phenotype similarity
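The evidence types in this table come back from the API as per-datatype scores, so the breadth of support for an association can be summarised directly from the `datatypeScores` list. The following is a minimal sketch, not part of the platform: the function name, the `min_score` cut-off, and the datatype ids in the sample are illustrative assumptions, and the ids actually returned by the API should be checked against a live response.

```python
def evidence_breadth(datatype_scores, min_score=0.05):
    """Summarise how many evidence types support an association.

    datatype_scores: list of {"id": str, "score": float} dicts, the shape
    requested via `datatypeScores { id score }` in this document's queries.
    min_score: hypothetical cut-off below which a datatype is ignored.
    """
    supported = {d["id"]: d["score"] for d in datatype_scores
                 if d["score"] >= min_score}
    # Datatype contributing the highest score, if any passed the cut-off
    top = max(supported, key=supported.get) if supported else None
    return {
        "n_types": len(supported),
        "top_type": top,
        "types": sorted(supported),
    }

# Made-up scores for illustration only
example = [
    {"id": "literature", "score": 0.62},
    {"id": "genetic_association", "score": 0.41},
    {"id": "rna_expression", "score": 0.02},
]
summary = evidence_breadth(example)
print(summary)
```

An association supported by a single type (`n_types == 1`) is a candidate for closer inspection before it is ranked alongside associations with broader support.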

Target Prioritization Pipeline


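import requests
import pandas as pd

# Same API root as in Quick Start, repeated so this block runs on its own
BASE_URL = "https://api.platform.opentargets.org/api/v4"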

def prioritize_targets(disease_id, min_score=0.1, size=100):
    """Prioritize drug targets for a specific disease."""
    query = """
    query diseaseTargets($efoId: String!, $size: Int!) {
      disease(efoId: $efoId) {
        id
        name
        associatedTargets(page: {size: $size, index: 0}) {
          count
          rows {
            target {
              id
              approvedSymbol
              biotype
              tractability {
                label
                modality
                value
              }
            }
            score
            datatypeScores { id score }
          }
        }
      }
    }
    """
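    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {
            "efoId": disease_id, "size": size
        }},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of failing on parsing
    data = (response.json().get("data") or {}).get("disease")
    if data is None:
        # The API returns null for an unknown disease ID
        raise ValueError(f"No disease found for {disease_id}")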

    targets = []
    for assoc in data["associatedTargets"]["rows"]:
        if assoc["score"] < min_score:
            continue

        target = assoc["target"]
        row = {
            "gene": target["approvedSymbol"],
            "ensembl_id": target["id"],
            "biotype": target["biotype"],
            "overall_score": assoc["score"],
        }

        # Add evidence type scores
        for dt in assoc["datatypeScores"]:
            row[f"evidence_{dt['id']}"] = dt["score"]

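        # Tractability assessment (the field can be null, so fall back to an empty list)
        tractability = target.get("tractability") or []
        # Small molecule: require clinical precedence (approved or clinical-stage labels)
        row["sm_tractable"] = any(
            t["value"] for t in tractability
            if t["modality"] == "SM"
            and any(k in (t.get("label") or "").lower() for k in ("approved", "clinical"))
        )
        # Antibody: any positive flag counts, including predicted tractability
        row["ab_tractable"] = any(
            t["value"] for t in tractability
            if t["modality"] == "AB"
        )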

        targets.append(row)

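    if not targets:
        # Sorting an empty, column-less DataFrame would raise a KeyError
        print(f"No targets for {data['name']} with score >= {min_score}")
        return pd.DataFrame()

    df = pd.DataFrame(targets).sort_values("overall_score", ascending=False)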

    # Prioritization summary
    print(f"Disease: {data['name']}")
    print(f"Total associated targets: {data['associatedTargets']['count']}")
    print(f"Targets above threshold: {len(df)}")
    print(f"Small-molecule tractable: {df['sm_tractable'].sum()}")
    print(f"Antibody tractable: {df['ab_tractable'].sum()}")

    return df

# Prioritize targets for Alzheimer's disease
targets = prioritize_targets("EFO_0000249")
print(targets[["gene", "overall_score", "sm_tractable"]].head(15))
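The pipeline above reads only the first page of results (`index: 0`), while `count` is often larger than `size`. A minimal pagination sketch follows. It assumes the query is extended with a page-index variable, and `fetch_page` is a hypothetical caller-supplied function that runs the query for one page and returns the `associatedTargets` (or `associatedDiseases`) object. An offline stand-in replaces the API here so the logic can be checked without network access.

```python
def iter_association_rows(fetch_page, page_size=100, max_pages=None):
    """Yield association rows page by page until all rows are read.

    fetch_page(index, size) is a hypothetical function that runs the GraphQL
    query with page: {index, size} and returns a dict with "count" and "rows".
    """
    index = 0
    seen = 0
    while max_pages is None or index < max_pages:
        page = fetch_page(index, page_size)
        rows = page["rows"]
        if not rows:
            break  # defensive stop if the API returns an empty page
        for row in rows:
            yield row
        seen += len(rows)
        if seen >= page["count"]:
            break  # every row reported by the API has been read
        index += 1

# Offline stand-in for the API: seven made-up rows in descending score order
_FAKE_ROWS = [{"score": s / 10} for s in range(7, 0, -1)]

def fake_fetch(index, size):
    start = index * size
    return {"count": len(_FAKE_ROWS), "rows": _FAKE_ROWS[start:start + size]}

all_rows = list(iter_association_rows(fake_fetch, page_size=3))
first_page_only = list(iter_association_rows(fake_fetch, page_size=3, max_pages=1))
print(len(all_rows), len(first_page_only))
```

Passing the page fetcher in as an argument keeps the loop independent of the HTTP client, and `max_pages` gives a simple cap for exploratory runs.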

Drug Information for Targets


def get_drugs_for_target(target_id):
    """Get known drugs and clinical candidates for a target."""
    query = """
    query targetDrugs($ensemblId: String!) {
      target(ensemblId: $ensemblId) {
        approvedSymbol
        knownDrugs(size: 50) {
          count
          rows {
            drug { id name drugType maximumClinicalTrialPhase }
            disease { name }
            phase
            status
            urls { name url }
          }
        }
      }
    }
    """
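    response = requests.post(
        f"{BASE_URL}/graphql",
        json={"query": query, "variables": {"ensemblId": target_id}},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of failing on parsing
    data = (response.json().get("data") or {}).get("target")
    if data is None:
        # The API returns null for an unknown or malformed Ensembl ID
        raise ValueError(f"No target found for {target_id}")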

    drugs = []
    for row in data["knownDrugs"]["rows"]:
        drugs.append({
            "target": data["approvedSymbol"],
            "drug_name": row["drug"]["name"],
            "drug_type": row["drug"]["drugType"],
            "max_phase": row["drug"]["maximumClinicalTrialPhase"],
            "indication": (row["disease"] or {}).get("name"),  # tolerate a missing disease
            "trial_phase": row["phase"],
            "status": row["status"]
        })

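    df = pd.DataFrame(drugs)
    # Rows are drug-indication records (at most 50 requested); count is the API total
    print(f"Known-drug records for {data['approvedSymbol']}: "
          f"{len(df)} retrieved of {data['knownDrugs']['count']}")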
    return df

# Get drugs targeting EGFR
drugs = get_drugs_for_target("ENSG00000146648")
print(drugs[["drug_name", "drug_type", "max_phase", "indication"]].head(10))
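Because each row pairs a drug with an indication, the same drug can appear several times in the result. The sketch below collapses such records to one entry per drug. It works on plain dicts with the keys built in `get_drugs_for_target`; the function name is hypothetical and the sample records are made up for illustration.

```python
def collapse_drug_records(records):
    """Collapse drug-indication records into one entry per drug.

    records: list of dicts with "drug_name", "trial_phase" and "indication"
    keys, matching the dicts built in get_drugs_for_target.
    """
    by_drug = {}
    for rec in records:
        entry = by_drug.setdefault(
            rec["drug_name"], {"highest_phase": None, "indications": set()}
        )
        phase = rec.get("trial_phase")
        # Keep the highest phase seen across all records for this drug
        if phase is not None and (
            entry["highest_phase"] is None or phase > entry["highest_phase"]
        ):
            entry["highest_phase"] = phase
        if rec.get("indication"):
            entry["indications"].add(rec["indication"])
    return {
        name: {"highest_phase": e["highest_phase"],
               "indications": sorted(e["indications"])}
        for name, e in by_drug.items()
    }

# Made-up records for illustration only; not real Open Targets output
sample = [
    {"drug_name": "DRUG_A", "trial_phase": 2, "indication": "disease X"},
    {"drug_name": "DRUG_A", "trial_phase": 4, "indication": "disease Y"},
    {"drug_name": "DRUG_B", "trial_phase": 1, "indication": None},
]
collapsed = collapse_drug_records(sample)
print(collapsed)
```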

Configuration

Parameter	Description	Default
`api_url`	Open Targets GraphQL endpoint	`"https://api.platform.opentargets.org/api/v4/graphql"`
`page_size`	Results per query page	`25`
`min_score`	Minimum association score threshold	`0.1`
`evidence_types`	Evidence types to include	All types
`timeout`	API request timeout (seconds)	`30`
`cache`	Cache API responses	`true`
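The functions in this document hard-code these values, so one way to keep them in one place is a small settings object. This is a sketch under assumptions: the class name `OpenTargetsConfig`, its validation rules, and the `keep` helper are hypothetical and not part of any Open Targets client; only the parameter names and defaults come from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class OpenTargetsConfig:
    """Hypothetical container for the parameters listed in the table."""
    api_url: str = "https://api.platform.opentargets.org/api/v4/graphql"
    page_size: int = 25
    min_score: float = 0.1
    evidence_types: Optional[Tuple[str, ...]] = None  # None means all types
    timeout: int = 30
    cache: bool = True

    def __post_init__(self):
        # Reject values that would produce invalid queries or filters
        if self.page_size <= 0:
            raise ValueError("page_size must be positive")
        if not 0.0 <= self.min_score <= 1.0:
            raise ValueError("min_score must be between 0 and 1")

    def keep(self, datatype_id: str) -> bool:
        """Return True if this evidence type should be included."""
        return self.evidence_types is None or datatype_id in self.evidence_types

config = OpenTargetsConfig(min_score=0.3, evidence_types=("genetic_association",))
print(config.keep("literature"), config.keep("genetic_association"))
```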

Best Practices

Use Ensembl gene IDs for targets: Open Targets uses Ensembl gene IDs (e.g., ENSG00000157764 for BRAF) as primary target identifiers. Resolve gene symbols to Ensembl IDs through the platform search before querying, because symbols and their synonyms can be ambiguous while Ensembl IDs are stable.
Evaluate multiple evidence types: A high overall score driven by a single evidence type (e.g., literature mining alone) is weaker than moderate scores across genetics, expression, and known drugs. Check the datatypeScores breakdown to assess evidence breadth.
Filter by tractability for actionable targets: A strongly associated target that cannot be modulated by any known drug modality has limited immediate value. Use tractability assessments to focus on targets that are druggable with small molecules, antibodies, or other approaches.
Cross-reference with ChEMBL for compound data: Open Targets provides target-disease associations, but detailed compound activity data lives in ChEMBL. Use the target's ChEMBL ID to find bioactivity assays, compound structures, and potency data.
Use ontology IDs for consistent disease queries: Diseases in Open Targets are identified by Experimental Factor Ontology (EFO) terms, which include IDs imported from other ontologies (for example, MONDO_ or Orphanet_ prefixes). Look up the current ID through the platform search before building queries, as disease naming varies across sources.
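Resolving a symbol to an Ensembl ID can also be scripted against the same GraphQL endpoint. The sketch below assumes the public schema exposes a `search` query with `queryString`, `entityNames`, and `hits { id entity name }`, and that `name` holds the approved symbol for target hits; these field names are assumptions and should be verified in the API's schema browser. The helper names are hypothetical, and the canned response is illustrative only.

```python
import json
import urllib.request

# Assumed query shape; verify field names against the current schema
SEARCH_QUERY = """
query searchTarget($q: String!) {
  search(queryString: $q, entityNames: ["target"], page: {index: 0, size: 5}) {
    hits { id entity name }
  }
}
"""

def build_search_payload(symbol):
    """Build the JSON body for the search request."""
    return {"query": SEARCH_QUERY, "variables": {"q": symbol}}

def pick_ensembl_id(response_json, symbol):
    """Return the id of the target hit whose name equals `symbol`, or None."""
    search = (response_json.get("data") or {}).get("search") or {}
    for hit in search.get("hits") or []:
        name = hit.get("name") or ""
        if hit.get("entity") == "target" and name.upper() == symbol.upper():
            return hit["id"]
    return None

def resolve_symbol(symbol,
                   url="https://api.platform.opentargets.org/api/v4/graphql",
                   timeout=30):
    """Query the API (network access required) and return an Ensembl ID or None."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_search_payload(symbol)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return pick_ensembl_id(json.load(response), symbol)

# Canned response in the assumed shape, so the parsing can be checked offline
canned = {"data": {"search": {"hits": [
    {"id": "ENSG00000157764", "entity": "target", "name": "BRAF"},
]}}}
resolved = pick_ensembl_id(canned, "braf")
print(resolved)
```

Requiring an exact, case-insensitive symbol match avoids silently accepting a near-miss hit; returning `None` leaves the decision about ambiguous symbols to the caller.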

Common Issues

GraphQL query returns null data: The most common cause is an incorrect Ensembl ID or disease ID. Verify IDs using the Open Targets platform search (https://platform.opentargets.org) before building API queries. Also check that the query structure matches the current API version, since the schema evolves between releases.
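A small guard can turn a null result into a readable error instead of a `TypeError` further down. This sketch relies only on the standard GraphQL response layout (a top-level `data` object and an optional `errors` list with `message` fields); the function name `extract_entity` is hypothetical.

```python
def extract_entity(response_json, entity_key):
    """Return response_json["data"][entity_key] or raise a descriptive error.

    entity_key is the top-level field of the query, e.g. "target" or "disease".
    """
    errors = response_json.get("errors")
    if errors:
        # Schema mismatches and malformed queries are reported here
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise RuntimeError(f"GraphQL error: {messages}")
    entity = (response_json.get("data") or {}).get(entity_key)
    if entity is None:
        # A well-formed query with an unknown ID yields null rather than an error
        raise LookupError(
            f"No {entity_key} returned; check the ID passed in the query variables"
        )
    return entity

ok_response = {"data": {"target": {"approvedSymbol": "BRAF"}}}
symbol = extract_entity(ok_response, "target")["approvedSymbol"]
print(symbol)
```

Separating the two failure modes makes it clear whether to fix the ID (`LookupError`) or the query text (`RuntimeError`).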

Association scores seem low for well-known drug targets: Open Targets scores range from 0 to 1 and represent evidence strength, not biological importance. A score of 0.3 with genetic evidence can be more meaningful than a score of 0.8 from literature co-occurrence alone. Always interpret scores in the context of evidence types.

Rate limiting on bulk queries: The Open Targets API tolerates moderate use, but very large numbers of requests can be throttled. When downloading associations for hundreds of targets, pause between requests (for example, time.sleep(0.5)) and combine several targets into one query where the schema allows it. For comprehensive dataset downloads, use the Open Targets data downloads (Parquet files) instead of the API.
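The pause-between-requests advice can be wrapped in a small loop with retries. This is a sketch, not an official client feature: `fetch_many` and its parameters are hypothetical, `fetch_one` stands for any single-ID function such as those defined earlier, and the backoff schedule is an arbitrary choice. The demo injects a recording function in place of `time.sleep` so it runs instantly and offline.

```python
import time

def fetch_many(ids, fetch_one, delay=0.5, max_retries=3, sleep=time.sleep):
    """Call fetch_one for each ID with a pause between IDs and simple retries.

    Returns (results, failures): results maps ID to the fetched value,
    failures maps ID to the last exception raised for it.
    """
    results, failures = {}, {}
    for position, item_id in enumerate(ids):
        if position:
            sleep(delay)  # pause between consecutive IDs
        for attempt in range(1, max_retries + 1):
            try:
                results[item_id] = fetch_one(item_id)
                break
            except Exception as exc:
                if attempt == max_retries:
                    failures[item_id] = exc
                else:
                    sleep(delay * 2 ** attempt)  # back off before retrying
    return results, failures

# Offline demo: "B" fails once and then succeeds; waits are recorded, not slept
calls = {"B": 0}

def flaky_fetch(item_id):
    if item_id == "B":
        calls["B"] += 1
        if calls["B"] == 1:
            raise ConnectionError("temporary failure")
    return f"data for {item_id}"

waits = []
results, failures = fetch_many(["A", "B"], flaky_fetch, sleep=waits.append)
print(results, failures, waits)
```

Collecting failures instead of raising on the first one lets a long run finish and the failed IDs be retried afterwards.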
