# Advanced Content Experimentation Kit
Boost productivity with intelligent content A/B testing and experimentation workflows. Built for Claude Code with best practices and real-world patterns.
Structured content A/B testing and experimentation framework for testing headlines, copy variations, page layouts, CTAs, and content strategies with statistical rigor and actionable results.
## When to Use This Skill
**Choose Content Experimentation when:**
- Testing headline variations for click-through rate optimization
- Comparing page layouts or content structures for engagement
- Running multivariate tests on landing pages
- Evaluating content strategies with measurable outcomes
- Building a data-driven content optimization pipeline
**Consider alternatives when:**
- Need code-level feature flags — use LaunchDarkly or Unleash
- Need visual A/B testing — use Optimizely or VWO
- Need email testing — use your email platform's built-in A/B tools
## Quick Start
```bash
# Activate content experimentation
claude skill activate advanced-content-experimentation-kit

# Design an experiment
claude "Design an A/B test for our pricing page headline and CTA button"

# Analyze results
claude "Analyze the results of experiment EXP-042 and recommend next steps"
```
## Example: Content Experiment Design
```typescript
interface ContentExperiment {
  id: string;
  name: string;
  hypothesis: string;
  metric: string;
  variants: Variant[];
  trafficSplit: number[];
  duration: { minDays: number; maxDays: number };
  sampleSize: { perVariant: number; confidence: number };
  status: 'draft' | 'running' | 'completed' | 'stopped';
}

interface Variant {
  id: string;
  name: string;
  content: Record<string, string>;
  isControl: boolean;
}

// Example experiment
const pricingExperiment: ContentExperiment = {
  id: 'EXP-042',
  name: 'Pricing Page Headline Test',
  hypothesis: 'Benefit-focused headline will increase conversion by 15%',
  metric: 'pricing_page_to_signup_conversion',
  variants: [
    {
      id: 'control',
      name: 'Current headline',
      content: { headline: 'Simple, transparent pricing' },
      isControl: true,
    },
    {
      id: 'variant_a',
      name: 'Benefit-focused',
      content: { headline: 'Start building for free, scale when ready' },
      isControl: false,
    },
    {
      id: 'variant_b',
      name: 'Social proof',
      content: { headline: 'Join 10,000+ teams who ship faster' },
      isControl: false,
    },
  ],
  trafficSplit: [34, 33, 33],
  duration: { minDays: 14, maxDays: 28 },
  sampleSize: { perVariant: 2000, confidence: 0.95 },
  status: 'running',
};
```
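A `trafficSplit` like `[34, 33, 33]` only works if users are bucketed consistently, so the same visitor always sees the same variant. One common approach is deterministic hash-based assignment. The sketch below illustrates the idea; the function names (`fnv1a`, `assignVariant`) are hypothetical, not part of this skill's API.

```typescript
// Hypothetical sketch: deterministic variant assignment via a string hash.
// Hashing `experimentId:userId` makes assignment stable per user per
// experiment, and independent across experiments.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept unsigned
  }
  return hash >>> 0;
}

// Returns the index into the variants array (0 = control above).
function assignVariant(userId: string, experimentId: string, trafficSplit: number[]): number {
  const bucket = fnv1a(`${experimentId}:${userId}`) % 100; // 0..99
  let cumulative = 0;
  for (let i = 0; i < trafficSplit.length; i++) {
    cumulative += trafficSplit[i];
    if (bucket < cumulative) return i;
  }
  return trafficSplit.length - 1; // guard against rounding in the split
}
```

Because assignment is a pure function of the IDs, no per-user state needs to be stored to keep the experience consistent across sessions.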
## Core Concepts
### Experiment Design
| Component | Description | Example |
|---|---|---|
| Hypothesis | Testable prediction with expected outcome | "Benefit-focused copy increases signups by 15%" |
| Primary Metric | Single metric that determines success | Conversion rate, CTR, engagement time |
| Guardrail Metrics | Metrics that shouldn't degrade | Bounce rate, page load time |
| Sample Size | Users needed per variant for significance | 2,000 per variant (95% confidence) |
| Duration | Minimum run time for valid results | 14 days (full business cycle) |
| Segmentation | User groups to analyze separately | New vs returning, mobile vs desktop |
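Guardrail metrics from the table above can be checked mechanically at analysis time. The sketch below flags any guardrail that degrades beyond a tolerance relative to control; the `MetricReading` shape and function name are illustrative assumptions, not part of any specific analytics API.

```typescript
// Hypothetical guardrail check: a metric "violates" when the variant
// worsens it by more than `tolerance` (relative), e.g. bounce rate or
// page load time creeping up while the primary metric improves.
interface MetricReading {
  name: string;
  control: number; // control group value (higher = worse for guardrails)
  variant: number; // variant group value
}

function guardrailViolations(readings: MetricReading[], tolerance = 0.05): string[] {
  return readings
    .filter(r => (r.variant - r.control) / r.control > tolerance)
    .map(r => r.name);
}
```

A winning variant with guardrail violations (say, conversions up but bounce rate up 15%) usually warrants a follow-up experiment rather than a rollout.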
### Statistical Concepts
| Concept | Description | Threshold |
|---|---|---|
| Statistical Significance | Probability result isn't due to chance | p < 0.05 (95% confidence) |
| Minimum Detectable Effect | Smallest change worth detecting | 5-10% relative improvement |
| Power | Probability of detecting a real effect | 80% minimum |
| False Positive Rate | Chance of seeing effect that isn't real | 5% (α = 0.05) |
| Confidence Interval | Range of likely true effect sizes | 95% CI |
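The concepts above combine into a sample size estimate. A common normal-approximation formula for a two-proportion test is sketched below, assuming two-sided α = 0.05 (z ≈ 1.96) and 80% power (z ≈ 0.8416). This is an illustrative calculation, not the skill's internal implementation; production analysis should use a vetted statistics library.

```typescript
// Sketch: per-variant sample size for detecting a relative lift (MDE)
// over a baseline conversion rate, via the standard two-proportion
// normal-approximation formula.
function requiredSampleSize(
  baselineRate: number, // e.g. 0.05 for a 5% conversion rate
  relativeMde: number,  // e.g. 0.15 for a 15% relative improvement
  zAlpha = 1.96,        // two-sided 95% confidence
  zBeta = 0.8416,       // 80% power
): number {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeMde);
  const pBar = (p1 + p2) / 2;
  const numerator = Math.pow(
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)),
    2,
  );
  return Math.ceil(numerator / Math.pow(p2 - p1, 2));
}
```

Note how quickly the requirement grows as the MDE shrinks: halving the detectable effect roughly quadruples the sample needed per variant.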
## Configuration
| Parameter | Description | Default |
|---|---|---|
| `confidence_level` | Statistical confidence threshold | 0.95 |
| `min_sample_size` | Minimum sample per variant | 1000 |
| `max_variants` | Maximum variants per experiment | 4 |
| `min_duration_days` | Minimum experiment runtime | 7 |
| `sequential_testing` | Use sequential analysis for early stopping | true |
| `bayesian` | Use Bayesian analysis instead of frequentist | false |
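For reference, the defaults in the table can be expressed as a typed settings object. This is a sketch of one plausible shape; the skill's actual configuration format is not specified here.

```typescript
// Illustrative typed mirror of the configuration table's defaults.
interface ExperimentSettings {
  confidence_level: number;
  min_sample_size: number;
  max_variants: number;
  min_duration_days: number;
  sequential_testing: boolean;
  bayesian: boolean;
}

const defaultSettings: ExperimentSettings = {
  confidence_level: 0.95,
  min_sample_size: 1000,
  max_variants: 4,
  min_duration_days: 7,
  sequential_testing: true,  // allows valid early stopping
  bayesian: false,           // frequentist analysis by default
};
```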
## Best Practices
- **Test one variable at a time unless running multivariate tests.** Changing both the headline and CTA simultaneously makes it impossible to attribute results. Isolate variables for clear causal understanding, or use multivariate testing with sufficient traffic.
- **Calculate required sample size before starting.** Don't start experiments and check results daily hoping for significance. Use a sample size calculator with your baseline conversion rate and minimum detectable effect to determine how long to run the test.
- **Run experiments for full business cycles.** Traffic and behavior vary by day of week. Run experiments for at least 1-2 full weeks to capture weekday and weekend patterns. Stopping mid-week can produce biased results.
- **Don't peek at results and stop early on significance.** Checking daily and stopping when p < 0.05 inflates false positive rates dramatically. Use sequential testing methods or commit to a fixed sample size. Pre-register your analysis plan.
- **Document and share all experiment results, including negative ones.** Failed experiments are as valuable as successful ones. They prevent other teams from testing the same ideas and build organizational knowledge about what your audience responds to.
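Once the committed sample size is reached, the fixed-sample analysis these practices describe comes down to a significance test. A minimal two-proportion z-test is sketched below as an assumption of how such an analysis could look; a full analysis would also report confidence intervals and guardrail metrics.

```typescript
// Sketch: two-proportion z-test comparing a variant against control.
// |z| > 1.96 corresponds to p < 0.05 (two-sided). Run this once, at
// the pre-registered sample size -- not daily.
function twoProportionZ(
  convControl: number, nControl: number,
  convVariant: number, nVariant: number,
): number {
  const pControl = convControl / nControl;
  const pVariant = convVariant / nVariant;
  const pPooled = (convControl + convVariant) / (nControl + nVariant);
  const se = Math.sqrt(pPooled * (1 - pPooled) * (1 / nControl + 1 / nVariant));
  return (pVariant - pControl) / se;
}
```

If `sequential_testing` is enabled instead, interim looks are legitimate, but the decision thresholds must come from a sequential procedure (e.g. alpha-spending), not from repeatedly applying the fixed-sample 1.96 cutoff.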
## Common Issues

**Experiment shows statistical significance but a tiny effect size.** A 0.1% improvement can be statistically significant with large sample sizes but isn't practically meaningful. Define a minimum effect size that justifies the implementation effort before starting the experiment.

**Results are significant for one segment but not overall.** Segment-level analysis increases false positive risk. Pre-register the segments you'll analyze. If you discover unexpected segment differences, treat them as hypotheses for future experiments rather than conclusions.

**Winning variant performs worse after full rollout.** The experiment may have had a novelty effect, seasonal bias, or the winning variant was only better for the traffic subset during the test. Monitor post-rollout metrics for 2-4 weeks and be ready to revert.
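The first pitfall, statistical significance without practical significance, can be guarded against with an explicit ship decision rule. The sketch below combines the effect's confidence interval with a team-chosen practical threshold; the function name and threshold are hypothetical.

```typescript
// Sketch: classify an experiment outcome using the confidence interval
// of the lift (variant minus control) and a minimum practical lift the
// team committed to before the experiment started.
function shipDecision(
  ciLower: number,          // lower bound of the lift's 95% CI
  ciUpper: number,          // upper bound of the lift's 95% CI
  minPracticalLift: number, // smallest lift worth implementing
): 'harmful' | 'inconclusive' | 'significant-but-trivial' | 'ship' {
  if (ciUpper < 0) return 'harmful';            // significantly worse
  if (ciLower <= 0) return 'inconclusive';      // CI includes zero
  if (ciLower < minPracticalLift) return 'significant-but-trivial';
  return 'ship';                                // real and worth the effort
}
```

Requiring the CI's *lower* bound to clear the practical threshold is deliberately conservative: it ships only when even the pessimistic estimate justifies the implementation cost.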