Ultimate Scientific Framework
A skill for research ideation, analysis partnership, and artifact generation. Includes structured workflows, validation checks, and reusable patterns for scientific computing.
Build reproducible scientific computing workflows using Python's ecosystem of research tools. This skill covers experiment design, data collection pipelines, statistical analysis, reproducibility practices, and publication-ready output generation for computational research.
When to Use This Skill
Choose Ultimate Scientific Framework when you need to:
- Structure computational experiments with proper version control and reproducibility
- Build end-to-end data analysis pipelines from raw data to publication figures
- Apply rigorous statistical methods with appropriate controls and power analysis
- Generate reproducible research artifacts (notebooks, figures, supplementary data)
Consider alternatives when:
- You need specific domain tools (use domain-specific skills like Scanpy, PyDESeq2, etc.)
- You need high-performance computing workflows (use Snakemake or Nextflow)
- You need collaborative experiment management (use platforms like DVC or MLflow)
Quick Start
```bash
pip install numpy scipy pandas matplotlib seaborn statsmodels jupyter
```

```python
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

# Reproducible experiment setup
np.random.seed(42)

# Load and validate data
data = pd.read_csv("experiment_results.csv")
print(f"Samples: {len(data)}")
print(f"Variables: {list(data.columns)}")
print(f"Missing values:\n{data.isnull().sum()}")

# Statistical test with effect size
group_a = data[data["condition"] == "control"]["measurement"]
group_b = data[data["condition"] == "treatment"]["measurement"]
t_stat, p_value = stats.ttest_ind(group_a, group_b)
cohens_d = (group_b.mean() - group_a.mean()) / np.sqrt(
    (group_a.std()**2 + group_b.std()**2) / 2
)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print(f"Cohen's d = {cohens_d:.3f}")
print(f"Effect: {'Small' if abs(cohens_d) < 0.5 else 'Medium' if abs(cohens_d) < 0.8 else 'Large'}")
```
Core Concepts
Research Workflow Components
| Phase | Tools | Output |
|---|---|---|
| Data collection | pandas, requests, APIs | Raw datasets |
| Cleaning | pandas, numpy | Validated data |
| Exploration | matplotlib, seaborn | Exploratory plots |
| Analysis | scipy, statsmodels, scikit-learn | Statistical results |
| Visualization | matplotlib, seaborn, plotly | Publication figures |
| Reporting | jupyter, LaTeX, markdown | Papers, reports |
| Reproducibility | git, conda/pip, DVC | Version-controlled artifacts |
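The phases above can be chained in a single script. The sketch below is a minimal illustration using synthetic data in place of a real collection step (column names `condition` and `measurement` are assumptions matching the Quick Start example):

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)

# Collection: synthetic raw data standing in for a real CSV load
raw = pd.DataFrame({
    "condition": ["control"] * 30 + ["treatment"] * 30,
    "measurement": np.concatenate([
        rng.normal(10.0, 2.0, 30),
        rng.normal(11.5, 2.0, 30),
    ]),
})
raw.loc[::15, "measurement"] = np.nan  # inject a few missing values

# Cleaning: drop incomplete rows and record the exclusion count
cleaned = raw.dropna(subset=["measurement"])
excluded = len(raw) - len(cleaned)

# Exploration: per-condition summary statistics
summary = cleaned.groupby("condition")["measurement"].describe()

# Analysis: two-sample t-test on the cleaned data
a = cleaned.loc[cleaned["condition"] == "control", "measurement"]
b = cleaned.loc[cleaned["condition"] == "treatment", "measurement"]
t_stat, p_value = stats.ttest_ind(a, b)
print(f"Excluded {excluded} rows; t = {t_stat:.3f}, p = {p_value:.4f}")
```

Each phase's output feeds the next, so a single source file (plus the environment lock discussed under Common Issues) is enough to regenerate every downstream artifact.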
Experiment Design with Power Analysis
```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower

def power_analysis(effect_size, alpha=0.05, power=0.8):
    """Calculate required sample size for a two-sample t-test."""
    analysis = TTestIndPower()
    n = analysis.solve_power(
        effect_size=effect_size, alpha=alpha, power=power, alternative="two-sided"
    )
    return int(np.ceil(n))

# Sample sizes for different effect sizes
for d in [0.2, 0.5, 0.8]:
    n = power_analysis(d)
    print(f"Effect size d={d}: need n={n} per group")

# Power curve visualization
effect_sizes = np.linspace(0.1, 1.5, 50)
sample_sizes = [power_analysis(d) for d in effect_sizes]

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(effect_sizes, sample_sizes, "b-", linewidth=2)
ax.set_xlabel("Effect Size (Cohen's d)")
ax.set_ylabel("Required Sample Size (per group)")
ax.set_title("Power Analysis: n vs Effect Size (α=0.05, power=0.8)")
ax.grid(True, alpha=0.3)
fig.savefig("power_analysis.png", dpi=200, bbox_inches="tight")
```
Reproducible Analysis Pipeline
```python
import hashlib
import json
from datetime import datetime
from pathlib import Path

class ExperimentTracker:
    """Track experiment parameters, data, and results for reproducibility."""

    def __init__(self, experiment_name, output_dir="experiments"):
        self.name = experiment_name
        self.output_dir = Path(output_dir) / experiment_name
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.log = {
            "name": experiment_name,
            "started": datetime.now().isoformat(),
            "parameters": {},
            "data_checksums": {},
            "results": {},
        }

    def log_parameters(self, **params):
        self.log["parameters"].update(params)

    def log_data(self, name, data_path):
        with open(data_path, "rb") as f:
            checksum = hashlib.md5(f.read()).hexdigest()
        self.log["data_checksums"][name] = {"path": str(data_path), "md5": checksum}

    def log_result(self, name, value):
        self.log["results"][name] = value

    def save(self):
        self.log["completed"] = datetime.now().isoformat()
        log_path = self.output_dir / "experiment_log.json"
        with open(log_path, "w") as f:
            json.dump(self.log, f, indent=2, default=str)
        print(f"Experiment log saved to {log_path}")

# Usage
tracker = ExperimentTracker("drug_response_analysis")
tracker.log_parameters(
    test="two-sample t-test", alpha=0.05, correction="bonferroni", random_seed=42
)
tracker.log_data("raw_data", "experiment_results.csv")
tracker.log_result("p_value", 0.003)
tracker.log_result("effect_size", 0.72)
tracker.save()
```
Configuration
| Parameter | Description | Default |
|---|---|---|
| random_seed | Global random seed for reproducibility | 42 |
| alpha | Statistical significance threshold | 0.05 |
| power | Desired statistical power | 0.8 |
| correction | Multiple testing correction method | "bonferroni" |
| figure_dpi | Publication figure resolution | 300 |
| output_format | Report format (pdf, html, latex) | "pdf" |
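The correction parameter corresponds to the method names accepted by statsmodels' multipletests. A minimal sketch with made-up p-values from five hypothetical comparisons:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from five independent comparisons
p_values = np.array([0.001, 0.012, 0.034, 0.21, 0.49])

# Bonferroni correction: each p-value is multiplied by the number of tests
reject, p_corrected, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")
for p, pc, r in zip(p_values, p_corrected, reject):
    print(f"raw p = {p:.3f} -> corrected p = {pc:.3f}, significant: {r}")
```

Swapping method="fdr_bh" gives the less conservative Benjamini-Hochberg false discovery rate control, which is often preferred when many comparisons are run.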
Best Practices
- Set random seeds at the start of every analysis — Use np.random.seed(42) and equivalent calls for all libraries at the very beginning of your script. This ensures results are reproducible across runs and machines. Document the seed in your methods section.
- Report effect sizes alongside p-values — A tiny p-value with a negligible effect size is not scientifically meaningful. Always calculate and report Cohen's d, correlation coefficients, or other appropriate effect size measures. Focus on practical significance, not just statistical significance.
- Version control everything — Track code, parameters, and data checksums with git and experiment logs. When a reviewer asks "how did you get this number?", you should be able to trace any result back to the exact code, data, and parameters that produced it.
- Validate data before analysis — Check for missing values, outliers, data type issues, and distributional assumptions before running statistical tests. Document any data exclusions with clear criteria and report the number of excluded data points.
- Use appropriate statistical tests for your data — Don't default to t-tests. Check normality (Shapiro-Wilk), equal variances (Levene's), and independence assumptions. Use non-parametric alternatives (Mann-Whitney U, Kruskal-Wallis) when assumptions are violated.
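The test-selection practice above can be made mechanical. A sketch of assumption-driven test choice, using synthetic samples (one deliberately skewed) as a stand-in for real groups:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical samples: group_b is exponential, so clearly non-normal
group_a = rng.normal(10.0, 2.0, 40)
group_b = rng.exponential(2.0, 40) + 9.0

# Check normality of each group (Shapiro-Wilk)
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

# Check equal variances (Levene's test)
equal_var = stats.levene(group_a, group_b).pvalue > 0.05

if normal_a and normal_b:
    # Welch's t-test if variances differ, standard t-test otherwise
    stat, p = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
    test_name = "t-test" if equal_var else "Welch's t-test"
else:
    # Non-parametric fallback when normality is violated
    stat, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    test_name = "Mann-Whitney U"

print(f"{test_name}: statistic = {stat:.3f}, p = {p:.4f}")
```

Report which assumption checks were run and which test was ultimately used; the choice is itself a documented analysis decision.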
Common Issues
Results change between runs — Missing or incomplete random seed setting. Set seeds for numpy, Python's random module, and any ML libraries. Some operations (GPU computations, multi-threading) may introduce non-determinism — use deterministic mode flags when available.
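A seeding preamble covering the common sources of randomness might look like the following (the ML-library lines are assumptions, shown only as comments since those packages may not be installed):

```python
import os
import random
import numpy as np

SEED = 42

# Hash randomization: note this only takes full effect if set
# before the interpreter starts (e.g. in the shell environment)
os.environ["PYTHONHASHSEED"] = str(SEED)

random.seed(SEED)                  # Python stdlib
np.random.seed(SEED)               # legacy NumPy global state
rng = np.random.default_rng(SEED)  # preferred: explicit Generator passed around

# If using ML libraries (assumptions - only if installed):
# torch.manual_seed(SEED); torch.use_deterministic_algorithms(True)
# tf.random.set_seed(SEED)

print(random.random(), rng.standard_normal())
```

Passing an explicit Generator (rng) to functions, rather than relying on global state, also makes it obvious in code review which computations are stochastic.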
P-values are significant but effects are tiny — With large sample sizes, even trivial differences become statistically significant. Report confidence intervals and effect sizes to convey practical importance. A 0.1% difference with p < 0.001 in a sample of 1M is statistically but not practically significant.
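This failure mode is easy to reproduce. A sketch with two large synthetic samples whose true difference is tiny, reporting the effect size and a 95% confidence interval on the mean difference (normal-approximation CI, an assumption that is reasonable at this sample size):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Large samples with a tiny true difference: p will be small, d negligible
a = rng.normal(100.0, 10.0, 50000)
b = rng.normal(100.3, 10.0, 50000)

t_stat, p_value = stats.ttest_ind(a, b)

# Effect size (Cohen's d) and a 95% CI on the mean difference
diff = b.mean() - a.mean()
pooled_sd = np.sqrt((a.std(ddof=1)**2 + b.std(ddof=1)**2) / 2)
cohens_d = diff / pooled_sd
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
ci = (diff - 1.96 * se, diff + 1.96 * se)

print(f"p = {p_value:.2e}, d = {cohens_d:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

The CI expresses the difference in the measurement's own units, which is usually what readers need to judge practical relevance.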
Analysis code doesn't reproduce on a different machine — Environment differences (package versions, OS, floating-point behavior) cause subtle result changes. Use conda env export > environment.yml or pip freeze > requirements.txt to lock exact versions. Test reproducibility on a clean environment before publication.