Ultimate AlphaFold Database

All-in-one skill for accessing AlphaFold predicted protein structures. Includes structured workflows, validation checks, and reusable patterns for scientific computing.

Skill | Cliptics | scientific | v1.0.0 | MIT

A scientific computing skill for querying and working with the AlphaFold Protein Structure Database, the public repository of over 200 million AI-predicted 3D protein structures maintained by Google DeepMind and EMBL-EBI. It helps you retrieve predicted structures, assess prediction confidence, and integrate structural data into research workflows.

When to Use This Skill

Choose Ultimate AlphaFold Database when:

  • Retrieving predicted protein structures by UniProt accession
  • Assessing prediction confidence (pLDDT scores) for structural regions
  • Downloading structure files (PDB, mmCIF) for downstream analysis
  • Integrating AlphaFold predictions into structural biology pipelines

Consider alternatives when:

  • You need experimental (not predicted) structures (use PDB/RCSB)
  • You want to run AlphaFold predictions yourself (use ColabFold or AlphaFold software)
  • You need protein-protein complex predictions (use AlphaFold-Multimer)
  • You're doing sequence analysis without structural context (use UniProt directly)

Quick Start

    claude "Retrieve the AlphaFold predicted structure for human insulin (P01308)"

    import requests

    # Fetch structure metadata for one UniProt accession
    uniprot_id = "P01308"  # Human insulin
    url = f"https://alphafold.ebi.ac.uk/api/prediction/{uniprot_id}"
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Stop here on an HTTP error
    data = response.json()[0]  # The response is a list; take the first entry

    print(f"Protein: {data['uniprotDescription']}")
    print(f"Organism: {data['organismScientificName']}")
    print(f"PDB URL: {data['pdbUrl']}")
    print(f"CIF URL: {data['cifUrl']}")
    print(f"PAE image URL: {data['paeImageUrl']}")

    # Download the PDB file
    pdb_response = requests.get(data['pdbUrl'], timeout=30)
    pdb_response.raise_for_status()
    with open(f"AF-{uniprot_id}-F1-model_v4.pdb", 'w') as f:
        f.write(pdb_response.text)

Core Concepts

AlphaFold API Endpoints

| Endpoint | Purpose | Example |
|---|---|---|
| /api/prediction/{id} | Get structure by UniProt ID | /api/prediction/P01308 |
| /api/prediction/{id}?qualifier=F1 | Specific fragment | Fragment 1 of multi-fragment |
| PDB download | Structure coordinates | alphafold.ebi.ac.uk/files/AF-{id}-F1-model_v4.pdb |
| mmCIF download | Structure in mmCIF format | alphafold.ebi.ac.uk/files/AF-{id}-F1-model_v4.cif |
| PAE download | Predicted Aligned Error | alphafold.ebi.ac.uk/files/AF-{id}-F1-predicted_aligned_error_v4.json |
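
The URL patterns in the table can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: `alphafold_urls` is a hypothetical helper name, and it assumes the `https` scheme and the `v4` file naming shown above.

```python
BASE_URL = "https://alphafold.ebi.ac.uk"  # host taken from the table above


def alphafold_urls(uniprot_id: str, fragment: int = 1, version: int = 4) -> dict:
    """Build the API and file URLs for one entry (hypothetical helper)."""
    entry = f"AF-{uniprot_id}-F{fragment}"
    files = f"{BASE_URL}/files"
    return {
        "api": f"{BASE_URL}/api/prediction/{uniprot_id}",
        "pdb": f"{files}/{entry}-model_v{version}.pdb",
        "cif": f"{files}/{entry}-model_v{version}.cif",
        "pae": f"{files}/{entry}-predicted_aligned_error_v{version}.json",
    }


urls = alphafold_urls("P01308")
print(urls["pdb"])  # https://alphafold.ebi.ac.uk/files/AF-P01308-F1-model_v4.pdb
```

Keeping the version as a parameter means a later database release only requires changing one argument, provided the naming scheme stays the same.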

Confidence Metrics

    from Bio.PDB import PDBParser
    import numpy as np

    uniprot_id = "P01308"  # Same accession as in Quick Start

    # Parse the AlphaFold PDB file; the B-factor column contains pLDDT scores
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure("protein", f"AF-{uniprot_id}-F1-model_v4.pdb")

    # Collect one score per residue by reading alpha carbons only
    plddt_scores = []
    for atom in structure.get_atoms():
        if atom.name == "CA":
            plddt_scores.append(atom.bfactor)

    plddt = np.array(plddt_scores)
    print(f"Mean pLDDT: {plddt.mean():.1f}")
    print(f"High confidence (>90): {(plddt > 90).sum()} residues")
    print(f"Confident (70-90): {((plddt >= 70) & (plddt <= 90)).sum()} residues")
    print(f"Low confidence (50-70): {((plddt >= 50) & (plddt < 70)).sum()} residues")
    print(f"Very low (<50): {(plddt < 50).sum()} residues")

Confidence Interpretation

| pLDDT Score | Confidence | Interpretation |
|---|---|---|
| > 90 | Very high | Backbone and side chains reliable |
| 70 – 90 | Confident | Backbone reliable, some side chain uncertainty |
| 50 – 70 | Low | Treat with caution, likely disordered |
| < 50 | Very low | Likely intrinsically disordered region |
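
The bands in this table translate directly into code. The sketch below is illustrative only: `plddt_band` and `confident_segments` are hypothetical names, and residue numbering is assumed to start at 1 and run without gaps.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score to the confidence band from the table above."""
    if score > 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def confident_segments(scores, threshold=70.0):
    """Return 1-based inclusive (start, end) runs where pLDDT >= threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = position  # a new confident run begins here
        elif start is not None:
            segments.append((start, position - 1))  # the run ended one residue ago
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # run extends to the last residue
    return segments


scores = [40, 75, 92, 88, 60, 71]
print([plddt_band(s) for s in scores])
print(confident_segments(scores))  # [(2, 4), (6, 6)]
```

Working with contiguous segments instead of single residues makes it easier to decide which parts of a model to keep for downstream analysis.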

Configuration

| Parameter | Description | Default |
|---|---|---|
| api_base_url | AlphaFold API base URL | https://alphafold.ebi.ac.uk |
| model_version | AlphaFold model version | 4 |
| output_format | PDB or mmCIF | pdb |
| include_pae | Download Predicted Aligned Error | false |
| confidence_threshold | Minimum pLDDT for analysis | 70 |
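
One way to carry these settings through a script is a small validated container. This is a sketch under assumptions: `AlphaFoldConfig` is a hypothetical class, not something the skill ships, and the accepted `output_format` values (`"pdb"`, `"mmcif"`) are a guess at how the table's "PDB or mmCIF" would be spelled.

```python
from dataclasses import dataclass

# File extension for each output format (assumed spelling of the format names).
_EXTENSIONS = {"pdb": "pdb", "mmcif": "cif"}


@dataclass(frozen=True)
class AlphaFoldConfig:
    """Hypothetical settings container mirroring the configuration table."""

    api_base_url: str = "https://alphafold.ebi.ac.uk"
    model_version: int = 4
    output_format: str = "pdb"
    include_pae: bool = False
    confidence_threshold: float = 70.0

    def __post_init__(self):
        # Reject values that would produce invalid file names or thresholds.
        if self.output_format not in _EXTENSIONS:
            raise ValueError(f"output_format must be one of {sorted(_EXTENSIONS)}")
        if not 0 <= self.confidence_threshold <= 100:
            raise ValueError("confidence_threshold must be between 0 and 100")

    def model_filename(self, uniprot_id: str, fragment: int = 1) -> str:
        """File name following the AF-{id}-F{n}-model_v{version} pattern."""
        ext = _EXTENSIONS[self.output_format]
        return f"AF-{uniprot_id}-F{fragment}-model_v{self.model_version}.{ext}"


config = AlphaFoldConfig(output_format="mmcif")
print(config.model_filename("P01308"))  # AF-P01308-F1-model_v4.cif
```

Validating in `__post_init__` surfaces a mistyped format or an out-of-range threshold before any download is attempted.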

Best Practices

  1. Check pLDDT scores before using a structure. Not all regions of an AlphaFold prediction are equally reliable. Regions with pLDDT < 70 (often loops and termini) should not be used for detailed structural analysis or drug design without experimental validation.

  2. Use the PAE (Predicted Aligned Error) for multi-domain proteins. pLDDT tells you about local confidence, but PAE tells you about the relative positioning of domains. High PAE between two domains means their relative orientation is uncertain, so don't trust the inter-domain geometry.

  3. Cross-reference with experimental structures. If an experimental structure exists in the PDB for your protein, compare it with the AlphaFold prediction. Discrepancies highlight regions where the prediction may be less reliable or where the protein adopts different conformations.

  4. Use bulk downloads for large-scale analysis. For proteome-scale studies, use the bulk download files from the AlphaFold FTP server rather than making individual API calls. EMBL-EBI provides pre-packaged tar files for entire organisms.

  5. Cite AlphaFold appropriately. When using AlphaFold predictions in publications, cite both the AlphaFold method paper and the AlphaFold database paper. Include the model version (for example, v4) and the date of access, as predictions may be updated.
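
For practice 2, the inter-domain check can be sketched as a mean over the off-diagonal blocks of the PAE matrix. The domain boundaries are something you supply yourself, and `mean_interdomain_pae` and `load_pae` are hypothetical helpers. The JSON layout read by `load_pae` (a one-element list whose entry has a `predicted_aligned_error` matrix) is an assumption about the downloaded file, so check it against your own file before relying on it.

```python
import json

import numpy as np


def load_pae(path):
    """Read a PAE matrix from a downloaded JSON file (layout is an assumption)."""
    with open(path) as handle:
        payload = json.load(handle)
    # Assumed layout: [{"predicted_aligned_error": [[...], ...], ...}]
    return np.asarray(payload[0]["predicted_aligned_error"], dtype=float)


def mean_interdomain_pae(pae, domain_a, domain_b):
    """Mean PAE between two domains given as 1-based inclusive (start, end)."""
    pae = np.asarray(pae, dtype=float)
    a = slice(domain_a[0] - 1, domain_a[1])
    b = slice(domain_b[0] - 1, domain_b[1])
    # PAE is not symmetric in general, so average both directions.
    return float((pae[a, b].mean() + pae[b, a].mean()) / 2)


# Synthetic 6-residue example: two 3-residue domains, confident within
# each domain (PAE 2) and uncertain between them (PAE 20).
toy = np.full((6, 6), 20.0)
toy[:3, :3] = 2.0
toy[3:, 3:] = 2.0
print(mean_interdomain_pae(toy, (1, 3), (4, 6)))  # 20.0
print(mean_interdomain_pae(toy, (1, 3), (1, 3)))  # 2.0
```

What counts as "high" is a judgment call that depends on the analysis. Comparing the inter-domain value with the within-domain value, as in the toy example, is one simple way to put it in context.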

Common Issues

No structure found for a UniProt ID. AlphaFold DB doesn't cover every protein: it is built from UniProt sequences, and some entries are not included. Check that your ID is a valid UniProt accession (not a PDB, RefSeq, or GenBank identifier). For proteins not in the database, run AlphaFold locally or use ColabFold.
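
A quick format check can catch wrong identifier types before any request is sent. The sketch below uses the accession pattern published in the UniProt documentation. `looks_like_uniprot_accession` is a hypothetical helper name, and a match only means the string is well formed, not that the entry exists in UniProt or AlphaFold DB.

```python
import re

# Accession number format as documented by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(identifier: str) -> bool:
    """Return True if the string has the shape of a UniProt accession."""
    candidate = identifier.strip().upper()
    return UNIPROT_ACCESSION.fullmatch(candidate) is not None


for candidate in ["P01308", "A0A024R161", "1ABC", "NP_000198"]:
    print(candidate, looks_like_uniprot_accession(candidate))
# P01308 True
# A0A024R161 True
# 1ABC False
# NP_000198 False
```

Isoform identifiers such as `P01308-2` are rejected by this check, because the pattern covers primary accessions only.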

Structure looks unrealistic in visualization. Regions with pLDDT < 50 often appear as extended, unstructured loops. This doesn't mean the protein is actually extended: these regions are likely intrinsically disordered, and AlphaFold is correctly indicating uncertainty. Don't interpret low-confidence regions as real structural features.

Large protein produces multiple fragment files. AlphaFold splits proteins longer than ~2700 residues into overlapping fragments (F1, F2, etc.). You'll need to stitch fragments together or analyze them separately. The overlap regions can help you align fragments, but inter-fragment domain positioning should be treated with caution.
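
To see which fragment covers a given residue, it helps to list the residue range of each fragment. The sketch below assumes 1400-residue windows offset by 200 residues, clipped at the end of the sequence. Those numbers are an assumption about the fragmenting scheme and should be verified against the current AlphaFold DB documentation. `fragment_ranges` is a hypothetical helper.

```python
def fragment_ranges(length: int, window: int = 1400, step: int = 200):
    """List 1-based inclusive (start, end) ranges for fragments F1, F2, ...

    The window and step defaults are assumptions, not confirmed values.
    """
    ranges = []
    start = 1
    while True:
        end = min(start + window - 1, length)  # clip the final window
        ranges.append((start, end))
        if end >= length:
            break
        start += step
    return ranges


for number, (start, end) in enumerate(fragment_ranges(3000), start=1):
    print(f"F{number}: residues {start}-{end}")
```

With these assumed defaults, consecutive fragments share 1200 residues, which is the overlap you would superimpose on when aligning neighbouring fragments.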
