Advanced ESM Platform
A scientific computing skill for protein analysis using ESM (Evolutionary Scale Modeling) — Meta AI's family of protein language models that generate embeddings, predict structure, and annotate function from amino acid sequences alone without requiring multiple sequence alignments.
When to Use This Skill
Choose Advanced ESM Platform when:
- Generating protein embeddings for downstream ML tasks
- Predicting protein structure from sequence (ESMFold)
- Computing variant effect predictions for protein engineering
- Performing zero-shot protein function annotation
Consider alternatives when:
- You need the highest accuracy structure prediction (use AlphaFold2)
- You need protein-protein complex prediction (use AlphaFold-Multimer)
- You need protein design/engineering (use ProteinMPNN or RFdiffusion)
- You need sequence alignment or homology search (use BLAST/HHblits)
Quick Start
```bash
claude "Generate ESM embeddings for a protein sequence and predict structure"
```

```python
import torch
import esm

# Load ESM-2 model
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

# Prepare sequence
data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGD")]
batch_labels, batch_strs, batch_tokens = batch_converter(data)

# Generate embeddings
with torch.no_grad():
    results = model(batch_tokens, repr_layers=[33], return_contacts=True)

# Per-residue embeddings (for downstream tasks)
embeddings = results["representations"][33]
print(f"Embedding shape: {embeddings.shape}")  # (batch, seq_len, 1280)

# Contact prediction
contacts = results["contacts"]
print(f"Contact map shape: {contacts.shape}")
```
Core Concepts
ESM Model Family
| Model | Parameters | Use Case |
|---|---|---|
| ESM-2 (8M) | 8M | Fast embedding, limited tasks |
| ESM-2 (150M) | 150M | Good balance of speed and quality |
| ESM-2 (650M) | 650M | High-quality embeddings |
| ESM-2 (3B) | 3B | Best embeddings (GPU required) |
| ESMFold | — | Structure prediction from sequence |
| ESM-1v | 650M | Variant effect prediction |
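The table above can be tied to concrete loaders. As a sketch, the ESM-2 variants map to fair-esm pretrained loader names and final-layer indices as follows; the `ESM2_VARIANTS` mapping and `pick_esm2` helper are conveniences added here, not part of the esm package, so verify the names against your installed `esm.pretrained` module:

```python
# fair-esm names encode layer count and parameter count: t<layers>_<params>.
# The final layer index (for repr_layers) differs per model size.
ESM2_VARIANTS = {
    "8M":   ("esm2_t6_8M_UR50D",    6),
    "150M": ("esm2_t30_150M_UR50D", 30),
    "650M": ("esm2_t33_650M_UR50D", 33),
    "3B":   ("esm2_t36_3B_UR50D",   36),
}

def pick_esm2(size):
    """Return (pretrained loader name, final repr layer) for a size key."""
    name, last_layer = ESM2_VARIANTS[size]
    return name, last_layer
```

The returned name can then be looked up on `esm.pretrained` (e.g. `getattr(esm.pretrained, name)()`), and the layer index passed as `repr_layers=[last_layer]`.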
ESMFold Structure Prediction
```python
import torch
import esm

# Load ESMFold
model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# Predict structure
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGD"
with torch.no_grad():
    output = model.infer_pdb(sequence)

# Save PDB file
with open("prediction.pdb", "w") as f:
    f.write(output)

# Check confidence (pLDDT)
with torch.no_grad():
    output = model.infer(sequence)
plddt = output["plddt"].mean().item()
print(f"Mean pLDDT: {plddt:.1f}")
```
Variant Effect Prediction
```python
import torch
import esm

# Load ESM-1v for variant scoring
model, alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
batch_converter = alphabet.get_batch_converter()
model.eval()

# Score mutations using masked marginal probability
def score_variant(sequence, position, wt_aa, mut_aa):
    """Score a single amino acid substitution (0-indexed position)."""
    data = [("protein", sequence)]
    _, _, tokens = batch_converter(data)

    # Mask the position of interest
    tokens[0, position + 1] = alphabet.mask_idx  # +1 for BOS token

    with torch.no_grad():
        logits = model(tokens)["logits"]

    # Log-likelihood ratio
    wt_score = logits[0, position + 1, alphabet.get_idx(wt_aa)].item()
    mut_score = logits[0, position + 1, alphabet.get_idx(mut_aa)].item()
    return mut_score - wt_score  # Positive = favorable mutation

score = score_variant("MKTAYIAKQRQ...", 5, "I", "V")
print(f"I5V score: {score:.3f}")
```
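`score_variant` scores one substitution; a full deep mutational scan scores all 19·L single mutants of an L-residue protein. A minimal enumeration helper (`enumerate_variants` is a hypothetical convenience, not part of esm) that yields positions in the same 0-indexed convention:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def enumerate_variants(sequence):
    """Yield (position, wt_aa, mut_aa) for every single substitution.

    Positions are 0-indexed, matching score_variant's convention.
    """
    for pos, wt in enumerate(sequence):
        for mut in AMINO_ACIDS:
            if mut != wt:
                yield pos, wt, mut

# A full scan of an L-residue protein has 19 * L variants:
variants = list(enumerate_variants("MKT"))
print(len(variants))  # 3 positions x 19 substitutions = 57
```

Each yielded tuple can be passed straight to `score_variant(sequence, pos, wt, mut)`; for large proteins, batch the masked sequences rather than calling the model once per variant.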
Configuration
| Parameter | Description | Default |
|---|---|---|
| `model_name` | ESM model variant | `esm2_t33_650M_UR50D` |
| `repr_layers` | Embedding layers to extract | `[33]` (last layer) |
| `return_contacts` | Compute contact predictions | `False` |
| `device` | CPU or CUDA GPU | Auto-detect |
| `batch_size` | Sequences per batch | 1 |
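These parameters can be collected into a plain dictionary. The `CONFIG` dict below is an illustrative fragment mirroring the table, not an esm API; "auto-detect" for the device is expressed with `torch.cuda.is_available()`:

```python
import torch

# Hypothetical defaults mirroring the configuration table above.
CONFIG = {
    "model_name": "esm2_t33_650M_UR50D",
    "repr_layers": [33],  # last layer of the 650M model
    "return_contacts": False,
    "device": "cuda" if torch.cuda.is_available() else "cpu",
    "batch_size": 1,
}
```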
Best Practices
- Choose model size based on your GPU memory. ESM-2 650M requires ~4GB of GPU memory and 3B requires ~12GB. If you lack a GPU, use the 150M model on CPU; it's slower but still produces useful embeddings.
- Use the last layer for general-purpose embeddings. Layer 33 (final) embeddings capture the most abstract protein features, while earlier layers capture more local sequence patterns. For contact prediction, use the model's built-in contact head.
- Average per-residue embeddings for protein-level features. For classification tasks (function prediction, localization), mean-pool the per-residue embeddings to get a fixed-size protein representation. This works better than using only the CLS token.
- Use ESMFold for speed, AlphaFold2 for accuracy. ESMFold predicts structure in seconds without MSA computation, making it ideal for screening and rapid prototyping. For publication-quality structures, validate top candidates with AlphaFold2.
- Batch sequences of similar length. Padding short sequences to match long ones in a batch wastes computation. Group sequences by length and process each group separately for optimal throughput.
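The length-grouping advice in the last bullet can be sketched as a sort-then-sweep bucketing pass. `bucket_by_length` and its `tolerance` knob are illustrative helpers, not esm parameters:

```python
def bucket_by_length(named_seqs, tolerance=32):
    """Group (name, sequence) pairs so lengths within a bucket differ by
    at most `tolerance` residues, limiting padding waste.
    """
    ordered = sorted(named_seqs, key=lambda item: len(item[1]))
    buckets, current = [], []
    for item in ordered:
        # Start a new bucket once the spread exceeds the tolerance
        if current and len(item[1]) - len(current[0][1]) > tolerance:
            buckets.append(current)
            current = []
        current.append(item)
    if current:
        buckets.append(current)
    return buckets
```

Each bucket can then be fed to `batch_converter` as one batch, so padding only stretches sequences by at most `tolerance` residues.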
Common Issues
CUDA out of memory on long sequences. ESM models' memory usage scales quadratically with sequence length. For proteins >1000 residues, use the 150M model instead of 650M, or split into domains. ESMFold has a practical limit around 400 residues on consumer GPUs.
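Splitting a long protein can be approximated with overlapping windows when domain boundaries are unknown. `split_into_windows` is a hypothetical helper for embedding extraction only; windows break long-range contacts, so stitch embeddings, not contact maps, downstream:

```python
def split_into_windows(sequence, max_len=1000, overlap=100):
    """Split a long sequence into overlapping windows that fit in memory.

    Returns a list of (start_offset, subsequence) pairs covering the
    full sequence; adjacent windows share `overlap` residues.
    """
    if len(sequence) <= max_len:
        return [(0, sequence)]
    step = max_len - overlap
    windows = []
    for start in range(0, len(sequence) - overlap, step):
        windows.append((start, sequence[start:start + max_len]))
    return windows
```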
Embedding quality seems poor for short peptides. ESM was trained on full-length protein sequences. Very short peptides (<20 residues) may not produce meaningful embeddings. For short peptides, consider using specialized peptide models.
ESMFold pLDDT is low for the entire protein. Low overall pLDDT may indicate an intrinsically disordered protein or a multi-domain protein where ESMFold struggles with inter-domain positioning. Check per-residue pLDDT to identify which regions are confident and which are uncertain.
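Checking per-residue pLDDT can be reduced to scanning for sustained high-confidence runs. A sketch, assuming you have already averaged ESMFold's pLDDT output down to one score per residue; `confident_regions` and the 70.0 cutoff are illustrative, not an esm API:

```python
def confident_regions(plddt_per_residue, threshold=70.0, min_len=10):
    """Return (start, end) half-open spans where per-residue pLDDT stays
    at or above `threshold` for at least `min_len` residues.
    """
    regions, start = [], None
    for i, score in enumerate(plddt_per_residue):
        if score >= threshold and start is None:
            start = i  # entering a confident run
        elif score < threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(plddt_per_residue) - start >= min_len:
        regions.append((start, len(plddt_per_residue)))
    return regions
```

Long confident spans separated by low-confidence stretches are the typical signature of well-folded domains with uncertain linkers or inter-domain placement.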