
Statistical Analysis Dynamic

Enterprise-grade skill for statistical analysis: a hypothesis-testing toolkit with structured workflows, validation checks, and reusable patterns for scientific work.



Conduct rigorous statistical analysis with hypothesis testing, regression modeling, ANOVA, non-parametric tests, and effect size estimation using Python. This skill covers test selection, assumption checking, multiple comparison correction, power analysis, and result reporting in APA format.

When to Use This Skill

Choose Statistical Analysis Dynamic when you need to:

  • Select and execute the appropriate statistical test for your experimental design
  • Check test assumptions (normality, homoscedasticity, independence) and choose alternatives
  • Perform multiple comparison corrections (Bonferroni, FDR, Tukey HSD)
  • Report results with effect sizes, confidence intervals, and proper statistical formatting

Consider alternatives when:

  • You need Bayesian inference and posterior distributions (use PyMC)
  • You need time-series specific methods (use statsmodels TSA module)
  • You need machine learning predictive models (use scikit-learn)

Quick Start

```shell
pip install scipy statsmodels pingouin numpy pandas
```

```python
import numpy as np
import pandas as pd
from scipy import stats
import pingouin as pg

# Generate sample data
np.random.seed(42)
control = np.random.normal(loc=100, scale=15, size=30)
treatment = np.random.normal(loc=110, scale=15, size=30)

# 1. Check normality
_, p_control = stats.shapiro(control)
_, p_treatment = stats.shapiro(treatment)
print(f"Normality p-values: control={p_control:.3f}, treatment={p_treatment:.3f}")

# 2. Check equal variances
_, p_levene = stats.levene(control, treatment)
print(f"Levene's test p-value: {p_levene:.3f}")

# 3. Independent samples t-test
t_stat, p_value = stats.ttest_ind(control, treatment)
# Cohen's d with sample (ddof=1) standard deviations
cohens_d = (treatment.mean() - control.mean()) / np.sqrt(
    (control.std(ddof=1)**2 + treatment.std(ddof=1)**2) / 2
)
print(f"\nt({len(control) + len(treatment) - 2}) = {t_stat:.3f}, p = {p_value:.3f}")
print(f"Cohen's d = {cohens_d:.3f}")
print(f"Control: M = {control.mean():.1f}, SD = {control.std(ddof=1):.1f}")
print(f"Treatment: M = {treatment.mean():.1f}, SD = {treatment.std(ddof=1):.1f}")

# Using pingouin for comprehensive output
result = pg.ttest(treatment, control, paired=False)
print(f"\n{result.to_string()}")
```

Core Concepts

Test Selection Guide

| Design | Parametric Test | Non-parametric Alternative |
| --- | --- | --- |
| 2 independent groups | Independent t-test | Mann-Whitney U |
| 2 paired groups | Paired t-test | Wilcoxon signed-rank |
| 3+ independent groups | One-way ANOVA | Kruskal-Wallis H |
| 3+ paired groups | Repeated measures ANOVA | Friedman test |
| 2 categorical variables | Chi-square test | Fisher's exact test |
| Correlation | Pearson r | Spearman rho / Kendall tau |
| Prediction (continuous) | Linear regression | — |
| Prediction (binary) | Logistic regression | — |
| 2+ factors | Factorial ANOVA | Aligned rank transform |
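The first row of the guide can be automated: run the assumption checks, then fall back to the non-parametric alternative when they fail. A minimal sketch using only scipy (the function name `choose_two_group_test` and the 0.05 cutoff are illustrative assumptions, not part of this skill's API):

```python
from scipy import stats

def choose_two_group_test(a, b, alpha=0.05):
    """Pick a two-sample test based on normality and variance checks."""
    # Shapiro-Wilk normality check for each group
    normal = (stats.shapiro(a).pvalue > alpha) and (stats.shapiro(b).pvalue > alpha)
    if not normal:
        # Distribution-free fallback per the table above
        _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
        return "Mann-Whitney U", p
    # Levene's test decides between Student's and Welch's t-test
    equal_var = stats.levene(a, b).pvalue > alpha
    _, p = stats.ttest_ind(a, b, equal_var=equal_var)
    return "Independent t-test", p
```

For example, heavily skewed data (exponential scores) fails Shapiro-Wilk and is routed to Mann-Whitney U automatically.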

ANOVA with Post-hoc Tests

```python
import numpy as np
import pandas as pd
from scipy import stats
import pingouin as pg

# Create data for 4 treatment groups
np.random.seed(42)
data = pd.DataFrame({
    'score': np.concatenate([
        np.random.normal(50, 10, 25),
        np.random.normal(55, 10, 25),
        np.random.normal(60, 10, 25),
        np.random.normal(52, 10, 25),
    ]),
    'group': np.repeat(['Placebo', 'Low Dose', 'High Dose', 'Combination'], 25)
})

# Check ANOVA assumptions
# Normality per group
for group in data['group'].unique():
    subset = data[data['group'] == group]['score']
    _, p = stats.shapiro(subset)
    print(f"  {group}: Shapiro p = {p:.3f}")

# Homogeneity of variances
_, p_levene = stats.levene(*[data[data['group'] == g]['score']
                             for g in data['group'].unique()])
print(f"Levene's p = {p_levene:.3f}")

# One-way ANOVA
aov = pg.anova(data=data, dv='score', between='group', detailed=True)
print("\nANOVA Results:")
print(aov.to_string())

# Post-hoc: Tukey HSD
posthoc = pg.pairwise_tukey(data=data, dv='score', between='group')
print("\nTukey HSD Post-hoc:")
print(posthoc[['A', 'B', 'diff', 'p-tukey', 'hedges']].to_string())
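If the Shapiro or Levene checks fail, the table's non-parametric alternative is the Kruskal-Wallis H test. A small scipy-only sketch (the skewed example data here are illustrative, not tied to the ANOVA example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Skewed scores for three groups, where one-way ANOVA's
# normality assumption would be doubtful
groups = [rng.exponential(scale=s, size=25) for s in (1.0, 1.5, 2.5)]

# Kruskal-Wallis H compares mean ranks rather than means
h_stat, p = stats.kruskal(*groups)
print(f"H = {h_stat:.3f}, p = {p:.4f}")
```

If the omnibus test is significant, pairwise Mann-Whitney U tests with a multiple-comparison correction (e.g., Holm) serve as the post-hoc step.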

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| alpha | Significance level | 0.05 |
| alternative | Test directionality (two-sided, greater, less) | "two-sided" |
| correction | Multiple comparison method (bonferroni, fdr_bh, holm) | "fdr_bh" |
| effect_size | Effect size measure (cohen_d, eta_squared, r) | Test-dependent |
| confidence_level | CI level for estimates | 0.95 |
| normality_test | Normality check method | "shapiro" |
| variance_test | Homoscedasticity test | "levene" |
| power_target | Target statistical power | 0.80 |
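The default fdr_bh correction is the Benjamini-Hochberg step-up procedure. As a sketch of what it computes (in practice, `statsmodels.stats.multitest.multipletests(pvals, method='fdr_bh')` does the same; the function name `fdr_bh` below is just for illustration):

```python
import numpy as np

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg adjusted p-values and reject decisions."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Raw BH adjustment: p_(i) * m / i for the i-th smallest p-value
    adj = ranked * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downward
    adj = np.minimum.accumulate(adj[::-1])[::-1]
    adj = np.minimum(adj, 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adj
    return adjusted <= q, adjusted
```

Rejecting hypotheses with adjusted p ≤ q controls the expected proportion of false discoveries at q, which is less conservative than Bonferroni's family-wise control.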

Best Practices

  1. Always check assumptions before running parametric tests — Run Shapiro-Wilk for normality and Levene's test for homoscedasticity. If assumptions are violated (p < 0.05), use non-parametric alternatives or robust methods. Violating assumptions inflates Type I error rates and produces unreliable p-values.

  2. Report effect sizes alongside p-values — P-values depend on sample size and don't indicate practical importance. Always report Cohen's d for t-tests (0.2=small, 0.5=medium, 0.8=large), eta-squared for ANOVA, and odds ratios for logistic regression. A statistically significant result with a tiny effect size may not be meaningful.

  3. Apply multiple comparison correction when testing more than one hypothesis — Without correction, testing 20 comparisons at α=0.05 yields one false positive on average. Use Benjamini-Hochberg FDR for exploratory analyses (controls false discovery rate) and Bonferroni for confirmatory analyses (controls family-wise error rate).

  4. Conduct power analysis before collecting data — Use statsmodels.stats.power or pingouin.power_ttest to determine required sample sizes. Underpowered studies waste resources and produce unreliable results. Target at least 80% power for the minimum meaningful effect size in your domain.

  5. Use pingouin for clean, comprehensive statistical output — Pingouin provides publication-ready output with effect sizes, confidence intervals, and Bayes factors in a single function call. It's more concise than scipy.stats and includes assumption checks. Use it for routine statistical analysis and scipy for custom procedures.
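The power analysis in practice 4 can also be sketched with scipy alone by solving for n under the noncentral t distribution (a minimal sketch for a two-sided independent t-test; `pingouin.power_ttest` and `statsmodels.stats.power.TTestIndPower` give the same numbers, and the function name `n_per_group` is illustrative):

```python
import numpy as np
from scipy import stats

def n_per_group(d, power=0.80, alpha=0.05):
    """Smallest per-group n for a two-sided independent-samples t-test."""
    for n in range(2, 10_000):
        df = 2 * n - 2
        ncp = d * np.sqrt(n / 2)              # noncentrality parameter
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        # Achieved power under the noncentral t alternative
        achieved = (1 - stats.nct.cdf(t_crit, df, ncp)
                    + stats.nct.cdf(-t_crit, df, ncp))
        if achieved >= power:
            return n
    raise ValueError("effect size too small for this search range")

print(n_per_group(0.5))   # medium effect, 80% power
```

Detecting a medium effect (d = 0.5) at 80% power requires roughly 64 participants per group, which is why underpowered two-group studies with n ≈ 20 are so common a problem.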

Common Issues

P-value is exactly 0.000 or displays as 0 — Very small p-values underflow floating-point precision. Report as "p < 0.001" rather than "p = 0.000". For exact values, use stats.ttest_ind(a, b) which returns a float, then format: f"p = {p:.2e}" for scientific notation (e.g., p = 3.45e-12).
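This reporting rule is easy to centralize in a small helper (the name `format_p` and the APA-style leading-zero stripping are illustrative):

```python
def format_p(p):
    """Format a p-value for APA-style reporting."""
    if p < 0.001:
        return "p < .001"
    # APA drops the leading zero for statistics bounded by 1
    return f"p = {p:.3f}".replace("0.", ".")

print(format_p(3.45e-12))   # p < .001
print(format_p(0.0423))     # p = .042
```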

ANOVA is significant but no post-hoc pairs are significant — ANOVA's omnibus test has more power than pairwise comparisons because it tests the global null. Post-hoc corrections (Tukey, Bonferroni) further reduce power. This is normal with borderline significance. Report the ANOVA result and note that specific pairwise differences couldn't be isolated.

Non-parametric tests disagree with parametric tests — Parametric and non-parametric tests answer slightly different questions (means vs. distributions/ranks). Disagreement often means the effect is weak or assumption-dependent. Report both results transparently and let the reader evaluate which is more appropriate for the data.
