---
name: ab-test-plan
description: Designs a rigorous A/B test plan with hypothesis, sample size, duration, and success criteria. Use when planning split tests, experiment design, or validating product changes.
metadata:
  category: analytics-data
  author: skillar
  version: "1.0"
---

# A/B Test Plan Designer

*Because an A/B test without a hypothesis is just random changes.*

> **Usage:** Copy this skill into Claude → replace [BRACKETS] with your details → get polished output.

## What You Get
A complete A/B test plan document including a falsifiable hypothesis, variant descriptions, sample size calculations, test duration estimate, success criteria, and a rollout checklist ready for your engineering and product teams.

## Instructions

You are an experimentation scientist who has designed thousands of A/B tests for high-growth startups and Fortune 500 companies. You combine statistical rigor with practical business sense, ensuring every test is both scientifically valid and operationally feasible.

Create a comprehensive A/B test plan for the following:

- **Product/Feature:** [PRODUCT OR FEATURE BEING TESTED]
- **Change being tested:** [DESCRIBE THE VARIANT — what is different from the control]
- **Primary business goal:** [WHAT BUSINESS OUTCOME ARE YOU TRYING TO IMPROVE]
- **Current baseline metric:** [CURRENT VALUE OF THE PRIMARY METRIC, e.g., 3.2% conversion rate]
- **Minimum detectable effect:** [SMALLEST IMPROVEMENT WORTH DETECTING, e.g., 10% relative lift]
- **Monthly traffic/users:** [APPROXIMATE MONTHLY VOLUME AVAILABLE FOR THE TEST]
- **Platform/tool:** [TESTING TOOL IF ANY, e.g., Optimizely, LaunchDarkly, in-house]

## 1. HYPOTHESIS FORMULATION
- Write a clear, falsifiable hypothesis in "If we [change], then [metric] will [direction] because [rationale]" format
- Identify the primary metric (one only) and 2-3 secondary metrics to monitor
- Define guardrail metrics that must not degrade during the test
- State the null hypothesis explicitly so the team knows what "no result" looks like

## 2. EXPERIMENT DESIGN
- Specify control vs. variant(s) with precise descriptions of each experience
- Recommend traffic allocation split and justify the ratio
- Identify the randomization unit (user, session, device) and explain why
- Flag any audience segments that should be excluded or analyzed separately
- Note any interaction effects with other running experiments
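
For reference, one common way to implement a stable randomization unit and traffic split is deterministic hashing. The sketch below is a minimal illustration, not the behavior of any particular testing tool: `assign_variant`, its parameters, and the experiment key are hypothetical names, and it assumes a two-arm test keyed on a string ID (user, session, or device).

```python
import hashlib


def assign_variant(unit_id: str, experiment: str, variant_share: float = 0.5) -> str:
    """Deterministically map a randomization unit to 'control' or 'variant'.

    Hypothetical helper: salting the hash with the experiment name keeps
    assignments independent across concurrently running experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).hexdigest()
    # First 32 bits of the hash, scaled to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "variant" if bucket < variant_share else "control"


if __name__ == "__main__":
    # The same unit always lands in the same arm for a given experiment.
    print(assign_variant("user-123", "checkout-button-test"))
```

Hashing the unit ID rather than drawing a random number on each request means a returning user sees the same experience every time, which is the main reason to prefer user-level over session-level randomization when the metric spans multiple visits.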

## 3. SAMPLE SIZE AND DURATION
- Calculate the required sample size per variant using the baseline rate and minimum detectable effect
- State the significance level (default α = 0.05, i.e., 95% confidence) and power (default 80%)
- Estimate test duration based on available traffic, accounting for weekly seasonality
- Recommend whether to use fixed-horizon or sequential testing and explain the trade-off
- Include a sensitivity table showing duration at different MDE levels
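
As a sanity check on the numbers in this section, the calculation can be sketched with the standard normal-approximation formula for comparing two proportions. This is a rough sketch under stated assumptions: a two-sided test, equal allocation across two arms, a conversion-rate metric, and a 30-day month. The function names and the 200,000 monthly traffic figure are hypothetical; the 3.2% baseline and 10% relative lift come from the input examples above.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline: float, relative_mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided two-proportion test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_variant: int, monthly_traffic: float,
                  n_variants: int = 2, days_per_month: int = 30) -> int:
    """Days to reach the sample, rounded up to whole weeks for weekly seasonality."""
    daily_traffic = monthly_traffic / days_per_month
    raw_days = n_per_variant * n_variants / daily_traffic
    return math.ceil(raw_days / 7) * 7


if __name__ == "__main__":
    # Sensitivity table: duration at different relative MDE levels.
    for mde in (0.05, 0.10, 0.15, 0.20):
        n = sample_size_per_variant(0.032, mde)
        print(f"MDE {mde:.0%}: {n:,} per variant, {duration_days(n, 200_000)} days")
```

Required sample size scales roughly with the inverse square of the MDE, so halving the effect you want to detect roughly quadruples the duration. Sequential testing methods need their own calculation and are not covered by this sketch.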

## 4. IMPLEMENTATION CHECKLIST
- List technical requirements for instrumentation and event tracking
- Define QA steps to verify correct bucketing and metric logging
- Specify a burn-in period and criteria to confirm data quality before analysis
- Outline the escalation plan if a variant causes a severe negative impact

## 5. ANALYSIS PLAN
- Pre-register the analysis method (frequentist, Bayesian, or sequential)
- Define how to handle multiple comparisons if testing more than one variant
- Specify in advance the segment cuts to examine after the test (device, geography, new vs. returning) and treat their results as exploratory
- Describe how to check for novelty effects and Simpson's paradox
- State the decision rule: what result leads to ship, iterate, or kill
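
To make the frequentist option concrete, here is a minimal sketch of a pooled two-proportion z-test with a Bonferroni adjustment for multiple variants. It assumes a fixed-horizon test on a conversion-rate metric; the function names and the example counts are hypothetical, and Bonferroni is only one of several possible corrections.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold when testing several variants."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Hypothetical counts: control vs. one of two variants.
    p_value = two_proportion_p_value(320, 10_000, 380, 10_000)
    threshold = bonferroni_alpha(0.05, n_comparisons=2)
    print(f"p = {p_value:.4f}, significant: {p_value < threshold}")
```

A decision rule built on this would also check the direction of the lift and the guardrail metrics before recommending ship, iterate, or kill.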

## 6. REPORTING AND ROLLOUT
- Provide a results summary template with key numbers and confidence intervals
- Recommend a phased rollout plan (e.g., 25% → 50% → 100%) with monitoring at each stage
- Include a one-paragraph executive summary template for stakeholders
- Define the documentation standard so the test becomes institutional knowledge

End the plan with a "Pre-Launch Sanity Check" — a 5-item checklist the team must verify before flipping the experiment live.

Be specific to my situation. No generic filler.
