ETE Toolkit Engine
A scientific computing skill for phylogenetic tree analysis using ETE (Environment for Tree Exploration), a Python toolkit for manipulating, analyzing, and visualizing phylogenetic and hierarchical trees with publication-quality rendering. Includes structured workflows, validation checks, and reusable patterns.
When to Use This Skill
Choose ETE Toolkit Engine when:
- Parsing, manipulating, and querying phylogenetic trees (Newick, NHX, PhyloXML)
- Visualizing trees with custom node styles, colors, and annotations
- Performing phylogenetic comparisons (Robinson-Foulds distance, topology tests)
- Building automated phylogenetic analysis pipelines
Consider alternatives when:
- You need tree inference from sequences (use IQ-TREE, RAxML, or MrBayes)
- You need sequence alignment (use MAFFT, MUSCLE, or ClustalW)
- You need interactive web-based visualization (use iTOL or Nextstrain)
- You need simple tree plotting without analysis (use Bio.Phylo from BioPython)
Quick Start
claude "Load a Newick tree, annotate nodes, and render a publication figure"

    from ete3 import Tree, TreeStyle, NodeStyle, TextFace, CircleFace

    # Load a phylogenetic tree
    tree = Tree("((A:0.1,B:0.2)90:0.3,(C:0.15,D:0.25)85:0.4)100;", format=0)
    print(tree.get_ascii(show_internal=True))
    print(f"Leaves: {len(tree)}")
    print(f"Internal nodes: {len(list(tree.traverse())) - len(tree)}")

    # Traverse and annotate
    for node in tree.traverse():
        if node.is_leaf():
            # Add species label
            face = TextFace(node.name, fsize=12, fgcolor="black")
            node.add_face(face, column=0, position="branch-right")
        else:
            # Color internal nodes by support
            if node.support >= 90:
                style = NodeStyle()
                style["fgcolor"] = "green"
                style["size"] = 8
                node.set_style(style)

    # Render
    ts = TreeStyle()
    ts.show_leaf_name = False  # We added custom faces
    ts.show_branch_support = True
    ts.branch_vertical_margin = 15
    tree.render("phylogeny.pdf", tree_style=ts, w=800)
Core Concepts
Tree Operations
| Operation | Method | Description |
|---|---|---|
| Load tree | Tree(newick_string) | Parse Newick/NHX format |
| Traverse | tree.traverse("postorder") | Visit all nodes |
| Get leaves | tree.get_leaves() | Terminal nodes only |
| Find node | tree.search_nodes(name="A") | Search by attributes |
| Get ancestor | tree.get_common_ancestor("A", "B") | MRCA of taxa |
| Prune | tree.prune(["A", "B", "C"]) | Keep only listed leaves |
| Root | tree.set_outgroup("A") | Reroot the tree |
| Distance | tree.get_distance("A", "B") | Branch length distance |
Tree Comparison
    from ete3 import Tree

    t1 = Tree("((A,B),(C,D));")
    t2 = Tree("((A,C),(B,D));")

    # Robinson-Foulds distance
    rf, max_rf, _, _, _, _, _ = t1.robinson_foulds(t2, unrooted_trees=True)
    print(f"RF distance: {rf}")
    print(f"Normalized RF: {rf/max_rf:.2f}")

    # Topology comparison
    result = t1.compare(t2, unrooted=True)
    print(f"Source edges: {result['source_edges_in_ref']}")
    print(f"Ref edges: {result['ref_edges_in_source']}")
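To make the RF number concrete: for unrooted trees it counts the non-trivial bipartitions (splits of the leaf set induced by an internal branch) that appear in one tree but not the other. The sketch below is a pure-Python illustration on nested tuples, not ETE's implementation; the helper names `leaf_set`, `nontrivial_splits`, and `rf_distance` are hypothetical.

```python
def leaf_set(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def nontrivial_splits(tree):
    """Collect the non-trivial bipartitions of an unrooted tree.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaf_set(node)
        side = all_leaves - clade if reference in clade else clade
        # Skip trivial splits: empty side or a single leaf versus the rest.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Unrooted Robinson-Foulds distance: size of the symmetric difference."""
    return len(nontrivial_splits(tree_a) ^ nontrivial_splits(tree_b))


# Same topologies as the ETE example: AB|CD versus AC|BD.
t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
rf = rf_distance(t1, t2)
```

Each four-leaf tree has exactly one non-trivial split and the two splits differ, so the distance equals the maximum possible value for this leaf set.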
Phylogenetic Workflows
    from ete3 import PhyloTree, EvolTree

    # Load alignment and tree together
    ptree = PhyloTree("tree.nw")
    ptree.link_to_alignment("alignment.fasta")

    # Derive species codes from gene names, then split the gene tree into
    # speciation-only subtrees (TreeKO approach). This is not a full
    # reconciliation; for that, use ptree.reconcile(species_tree).
    ptree.set_species_naming_function(lambda x: x.split("_")[0])
    ntrees, ndups, sptrees = ptree.get_speciation_trees()

    # Evolutionary analysis with CodeML (dN/dS); requires the codeml binary
    etree = EvolTree("tree.nw")
    etree.link_to_alignment("codon_alignment.fasta")
    etree.run_model("M0")  # One-ratio model
    etree.run_model("M1")  # Nearly neutral
    etree.run_model("M2")  # Positive selection

    # Compare models: likelihood ratio test of M2 (alternative) against M1 (null)
    pvalue = etree.get_most_likely("M2", "M1")
    print(f"Selection test p-value: {pvalue}")
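The model comparison above is a likelihood ratio test: twice the difference in log-likelihood between the alternative and the null model is compared against a chi-square distribution whose degrees of freedom equal the difference in free parameters. For M2 against M1 that difference is 2, and the chi-square survival function with 2 degrees of freedom has the closed form exp(-x/2). The sketch below works this through by hand; the function name `lrt_pvalue_df2` is hypothetical and the log-likelihood values are invented for illustration, not results from any dataset.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test with 2 degrees of freedom.

    Uses the closed-form chi-square survival function for df=2,
    P(X > x) = exp(-x / 2).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic <= 0:
        # The alternative fits no better than the null.
        return 1.0
    return math.exp(-statistic / 2.0)


# Invented log-likelihoods, for illustration only.
lnl_m1 = -1234.50  # null model (nearly neutral)
lnl_m2 = -1229.25  # alternative model (positive selection)

p_value = lrt_pvalue_df2(lnl_m1, lnl_m2)
print(f"LRT statistic: {2 * (lnl_m2 - lnl_m1):.2f}, p-value: {p_value:.4f}")
```

For comparisons with a different number of extra parameters the closed form does not apply, and a general chi-square survival function (for example from SciPy) would be needed.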
Configuration
| Parameter | Description | Default |
|---|---|---|
| tree_format | Newick format variant (0-9) | 0 |
| quoted_node_names | Handle quoted names | False |
| render_engine | Qt or SVG rendering | Qt |
| output_format | PDF, PNG, SVG | PDF |
| branch_length_mode | Show branch lengths or not | True |
Best Practices
- Specify the correct Newick format. ETE3 supports numbered Newick format variants (0-9, plus 100 for topology only) with different conventions for internal node names, support values, and branch lengths. Using the wrong format misparses node labels, for example reading internal node names as support values. Format 0 (flexible, with support values) is the default; use format 1 when internal labels are node names.
- Use `traverse()` with an explicit strategy. `"postorder"` (children before parents) is best for bottom-up calculations like computing clade sizes. `"preorder"` (parents before children) is better for top-down annotation. `"levelorder"`, the default, visits nodes by depth.
- Prune trees before comparison. When comparing trees with different leaf sets, prune both to their shared taxa first. Robinson-Foulds distance is undefined for trees with different leaf sets.
- Use TreeStyle for publication figures. Customize branch colors, node sizes, and label positions through TreeStyle and NodeStyle rather than post-processing in image editors. ETE produces vector output (PDF/SVG) that scales without loss of quality.
- Cache large trees in pickle format. Parsing very large Newick strings is slow. After the first parse, save the ETE tree object with Python's `pickle` module for faster subsequent loading.
Common Issues
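Tree rendering fails with Qt errors. ETE3 uses Qt for all rendering, including PDF and SVG output, so choosing SVG does not remove the need for a Qt platform. On headless servers, install xvfb and run with xvfb-run python script.py, or set the environment variable QT_QPA_PLATFORM=offscreen before importing ete3, which typically works with PyQt5. ETE4 improves headless rendering.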
Node names with special characters cause parsing errors. Newick format uses parentheses, commas, colons, and semicolons as delimiters. Node names containing these characters must be quoted. Use quoted_node_names=True when loading such trees.
Robinson-Foulds distance seems wrong. Check that both trees are rooted/unrooted consistently. Set unrooted_trees=True for unrooted comparison. Also verify that leaf names match exactly between trees — different naming conventions (spaces, underscores) cause mismatches.
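A quick way to diagnose naming mismatches is to compare the two leaf-name lists directly (in ETE3 these come from `tree.get_leaf_names()`). The sketch below is plain Python with no ETE dependency; `normalize` and `compare_leaf_names` are hypothetical helpers, the species lists are illustrative, and the normalization rule (trim, lowercase, treat spaces and underscores alike) is an assumption that may not suit every dataset.

```python
def normalize(name):
    """Hypothetical normalization: trim, lowercase, unify separators."""
    return "_".join(name.strip().lower().replace("_", " ").split())


def compare_leaf_names(names_a, names_b):
    """Report names unique to each list, and pairs that differ only in form."""
    only_a = set(names_a) - set(names_b)
    only_b = set(names_b) - set(names_a)
    by_norm_b = {normalize(n): n for n in only_b}
    near = sorted(
        (n, by_norm_b[normalize(n)]) for n in only_a if normalize(n) in by_norm_b
    )
    return sorted(only_a), sorted(only_b), near


# Illustrative leaf names with a space/underscore and a case mismatch.
names_1 = ["Homo_sapiens", "Pan_troglodytes", "Mus_musculus"]
names_2 = ["Homo sapiens", "Pan_troglodytes", "mus_musculus "]

only_1, only_2, near_matches = compare_leaf_names(names_1, names_2)
for left, right in near_matches:
    print(f"Likely the same taxon: {left!r} vs {right!r}")
```

Names listed as near matches are candidates for renaming before the comparison; names unique to one tree with no near match point to genuinely different leaf sets.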