Master LatchBio Suite
Build and deploy bioinformatics workflows on the Latch platform using the Latch SDK. This skill covers workflow definition with Flyte tasks, container configuration, data management through Latch Data, and deployment of reproducible computational biology pipelines accessible through an auto-generated web interface.
When to Use This Skill
Choose Master LatchBio Suite when you need to:
- Deploy bioinformatics pipelines with auto-generated web UIs for non-programmers
- Run containerized workflows on managed cloud compute without infrastructure setup
- Share reproducible analysis workflows across a research team or organization
- Process large genomics datasets (FASTQ, BAM, VCF) with scalable cloud resources
Consider alternatives when:
- You need a general-purpose workflow orchestrator (use Nextflow or Snakemake)
- You need on-premises pipeline execution (use CWL or WDL runners)
- You need real-time data streaming rather than batch processing (use Apache Kafka)
Quick Start
```bash
# Install the Latch SDK
pip install latch

# Initialize a new workflow
latch init my-alignment-workflow
cd my-alignment-workflow
```
```python
# wf/__init__.py
from enum import Enum

from latch import small_task, workflow
from latch.types import LatchFile


class Aligner(Enum):
    BWA = "bwa"
    BOWTIE2 = "bowtie2"


@small_task
def align_reads(
    reads: LatchFile,
    reference: LatchFile,
    aligner: Aligner = Aligner.BWA,
    threads: int = 4,
) -> LatchFile:
    """Align sequencing reads to a reference genome."""
    import subprocess

    output = "/root/aligned.bam"
    if aligner == Aligner.BWA:
        with open(output, "w") as out:
            subprocess.run(
                ["bwa", "mem", "-t", str(threads), reference.local_path, reads.local_path],
                stdout=out,
                check=True,
            )
    return LatchFile(output, "latch:///results/aligned.bam")


@workflow
def alignment_workflow(
    reads: LatchFile,
    reference: LatchFile,
    aligner: Aligner = Aligner.BWA,
) -> LatchFile:
    """Align reads to reference genome.

    This workflow takes FASTQ reads and a reference genome,
    performs alignment, and returns a BAM file.
    """
    return align_reads(reads=reads, reference=reference, aligner=aligner)
```
```bash
# Register and deploy
latch register --remote my-alignment-workflow
```
Core Concepts
SDK Components
| Component | Purpose | Example |
|---|---|---|
| `@small_task` | Task on 2 CPU / 4 GB RAM | Quality control, file parsing |
| `@medium_task` | Task on 8 CPU / 32 GB RAM | Read alignment, variant calling |
| `@large_task` | Task on 31 CPU / 120 GB RAM | Genome assembly, large-scale analysis |
| `@workflow` | Orchestrates tasks into a DAG | Full analysis pipeline |
| `LatchFile` | Single file reference | FASTQ, BAM, VCF |
| `LatchDir` | Directory reference | Multi-file outputs |
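To make the task-size tiers above concrete, here is a small helper that maps an input size to one of the three decorators. The `suggest_task_size` function and its thresholds are purely illustrative (they are not part of the Latch SDK); the RAM figures come from the table above.

```python
def suggest_task_size(input_bytes: int) -> str:
    """Suggest a task decorator tier for a given input size.

    Thresholds are illustrative, not official Latch guidance:
    small inputs should fit comfortably in 4 GB RAM, mid-sized
    work in 32 GB, and anything larger goes to a large task.
    """
    gib = 1024 ** 3
    if input_bytes < 1 * gib:
        return "@small_task"
    if input_bytes < 10 * gib:
        return "@medium_task"
    return "@large_task"
```

For example, a 500 MiB FASTQ maps to `@small_task`, while a 50 GiB assembly input maps to `@large_task`.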
Multi-Step Pipeline
```python
from latch import medium_task, small_task, workflow
from latch.types import LatchDir, LatchFile


@small_task
def quality_control(reads: LatchFile) -> LatchDir:
    """Run FastQC on input reads."""
    import os
    import subprocess

    os.makedirs("/root/qc", exist_ok=True)
    subprocess.run(["fastqc", reads.local_path, "-o", "/root/qc/"], check=True)
    return LatchDir("/root/qc", "latch:///results/qc")


@medium_task
def align_and_sort(reads: LatchFile, reference: LatchFile) -> LatchFile:
    """Align reads and sort the output BAM."""
    import subprocess

    subprocess.run(
        f"bwa mem -t 8 {reference.local_path} {reads.local_path} "
        f"| samtools sort -@ 4 -o /root/sorted.bam",
        shell=True,
        check=True,
    )
    subprocess.run(["samtools", "index", "/root/sorted.bam"], check=True)
    return LatchFile("/root/sorted.bam", "latch:///results/sorted.bam")


@small_task
def call_variants(bam: LatchFile, reference: LatchFile) -> LatchFile:
    """Call variants using bcftools."""
    import subprocess

    subprocess.run(
        f"bcftools mpileup -f {reference.local_path} {bam.local_path} "
        f"| bcftools call -mv -o /root/variants.vcf",
        shell=True,
        check=True,
    )
    return LatchFile("/root/variants.vcf", "latch:///results/variants.vcf")


@workflow
def variant_calling_pipeline(reads: LatchFile, reference: LatchFile) -> LatchFile:
    """Complete variant calling pipeline: QC → Align → Call."""
    quality_control(reads=reads)
    bam = align_and_sort(reads=reads, reference=reference)
    return call_variants(bam=bam, reference=reference)
```
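Pipelines run with `shell=True` are easy to break with unquoted paths. One way to keep them testable is to build the command string in a plain helper and unit-test it outside the task. The `align_sort_command` function below is a hypothetical sketch, not part of the Latch SDK; it mirrors the flags used in `align_and_sort` above.

```python
import shlex


def align_sort_command(reference: str, reads: str, out_bam: str, threads: int = 8) -> str:
    """Build the bwa mem | samtools sort pipeline string.

    Quoting each path with shlex.quote guards against spaces and
    shell metacharacters in filenames; the flags mirror the
    align_and_sort task above.
    """
    ref, rd, out = (shlex.quote(p) for p in (reference, reads, out_bam))
    return (
        f"bwa mem -t {threads} {ref} {rd} "
        f"| samtools sort -@ 4 -o {out}"
    )
```

The task body then reduces to `subprocess.run(align_sort_command(...), shell=True, check=True)`, and the string itself can be asserted against in a local test.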
Configuration
| Parameter | Description | Default |
|---|---|---|
| `task_size` | Compute resources (small/medium/large/gpu) | `small` |
| `dockerfile` | Custom Dockerfile for dependencies | Auto-generated |
| `latch_data_path` | Output path in Latch Data | `"latch:///"` |
| `timeout` | Maximum task execution time (seconds) | `7200` (2 hours) |
| `retries` | Number of retry attempts on failure | `0` |
| `cache_version` | Version string for task caching | `"v1"` |
Best Practices
- Choose the right task size — Start with `@small_task` and only upgrade to `@medium_task` or `@large_task` when your process actually needs more resources. Oversized tasks waste compute credits and queue behind other large jobs.
- Pin all software versions in the Dockerfile — Specify exact versions for every tool (`samtools==1.17`, `bwa==0.7.17`) in your Dockerfile. Unpinned versions cause silent result differences when tools auto-update between deployments.
- Use type annotations for the web UI — Latch auto-generates the web interface from your function signatures. Use `Enum` for dropdown menus, `int` with defaults for number inputs, and docstrings for parameter descriptions.
- Store intermediate files in `/root` — Write temporary and intermediate files to `/root/` within tasks, not to `/tmp/`. Latch tasks use `/root` as the writable workspace, and `/tmp` may have size limits on some instance types.
- Test locally before registering — Run `latch local-execute` to test your workflow with small data before deploying to the cloud. This catches import errors, missing dependencies, and logic bugs without consuming cloud compute time.
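The UI generation itself happens on the platform, but the reason `Enum` annotations map cleanly to dropdowns can be shown with plain Python: each member contributes one labeled option that submits its underlying value. The `dropdown_choices` helper here is hypothetical, not part of the Latch SDK.

```python
from enum import Enum


class Aligner(Enum):
    BWA = "bwa"
    BOWTIE2 = "bowtie2"


def dropdown_choices(enum_cls) -> list:
    """Sketch of Enum-to-dropdown mapping: one (label, value) pair
    per member, in declaration order."""
    return [(member.name, member.value) for member in enum_cls]
```

A workflow parameter annotated with `Aligner` would thus render as a two-option menu, and the task receives the selected enum member rather than a raw string.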
Common Issues
Registration fails with Docker build errors — The most common cause is missing system dependencies in the Dockerfile. If your Python package requires C libraries (e.g., htslib for `pysam`), add `RUN apt-get install -y libhts-dev` to your Dockerfile before the `pip install` step.

LatchFile not found at runtime — Latch files are downloaded lazily when you access `.local_path`. If you construct the file path as a string instead of using the `.local_path` property, the file won't be downloaded. Always access data through the `LatchFile` object, never by constructing paths manually.

Workflow runs succeed but outputs are empty — This happens when the `LatchFile` return path doesn't match where the tool actually wrote its output. Print the working directory and list files in your task to debug: `os.listdir("/root/")`. Ensure the local path in `LatchFile(local_path, remote_path)` matches the actual output location.
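To catch the empty-output failure mode before a task returns, you can assert that the tool actually wrote a non-empty file at the expected path. This `checked_output` helper is an SDK-free sketch of that idea, not a Latch API:

```python
import os


def checked_output(path: str) -> str:
    """Return path only if a non-empty file exists there; otherwise
    raise with a directory listing to aid debugging."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        parent = os.path.dirname(path) or "."
        listing = os.listdir(parent) if os.path.isdir(parent) else []
        raise FileNotFoundError(f"expected output {path}; {parent} contains {listing}")
    return path
```

Wrapping the return as `LatchFile(checked_output("/root/variants.vcf"), ...)` turns a silent empty upload into a loud, debuggable failure.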