Sample size calculator for surveys (Cochran formula with finite-population correction) and A/B tests (power analysis for proportions). Get the minimum n needed.

RT-CNV-093 · Converters & Units

Sample Size Calculator

How to use the Sample Size Calculator

Pick mode: Survey or A/B test

Survey mode estimates an unknown population proportion (% who like brand X, % planning to vote Y). It uses Cochran's formula with an optional finite-population correction. A/B test mode detects a difference between two conversion rates (5% baseline → 6% target). It uses power analysis for the two-proportion z-test.

For survey: set confidence, MOE, expected proportion

95% confidence is standard (z = 1.96, the ±2 SE rule). ±5% MOE is typical consumer-survey precision (385 respondents at p = 0.5). Use p = 0.5 if you have no prior estimate; it gives the worst-case (largest) sample. For finite populations under 50,000, enter the population size to apply the adjustment.

For A/B test: set baseline + target + power

Baseline = current conversion rate. Target = the rate you want to be able to detect (e.g., 6% if you want to detect a 1pp improvement on a 5% baseline). 80% power is the conventional minimum. α = 0.05 is the conventional significance level. Choose two-tailed unless you have committed to a direction in advance.

Read minimum n needed

The result is the MINIMUM. Aim for 10-20% above to account for non-response (surveys) or unexpected dropoff (A/B tests). For multi-arm tests (A/B/C/D), use per-arm sample size from the calculator times the number of arms.
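
The buffer and multi-arm arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function name `planned_recruitment` and the 15% default buffer are assumptions, not part of the calculator.

```python
import math

def planned_recruitment(min_n, buffer=0.15, arms=1):
    """Inflate the calculator's minimum n by a safety buffer, then multiply by the number of arms.

    min_n  -- minimum sample size reported by the calculator (per arm for A/B tests)
    buffer -- fraction added for non-response or dropoff (0.10 to 0.20 per the guidance above)
    arms   -- 1 for a survey, otherwise the number of test arms
    """
    per_arm = math.ceil(min_n * (1 + buffer))  # round up: you cannot recruit a fraction of a person
    return per_arm * arms

# Survey example: the 385-respondent minimum with a 15% buffer.
survey_target = planned_recruitment(385)
```

For a survey, `planned_recruitment(385)` suggests recruiting 443 people. For a four-arm test, pass the per-arm minimum and `arms=4`.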

Sample size — getting it right before you run the test

Sample size is the single most-neglected pre-test decision in business research. With too few subjects you'll miss real effects (Type II error / low power); with too many you waste time, money, or both. Calculating the sample size BEFORE running a survey or A/B test prevents both failure modes. The math is well established: Cochran's formula for surveys (1953) and z-test power analysis for proportions are the standard tools. Yet most product teams skip this step and run "until they see something." This calculator gives you the number upfront, so you know how long an A/B test must run or how many people you need to survey for the precision you want.

Survey sample size: Cochran's formula

n = z² × p(1−p) / e², where z = confidence multiplier (1.96 for 95%), p = expected proportion (use 0.5 for the maximum sample when unknown), and e = margin of error (half-width of the confidence interval). At 95% confidence, ±5% MOE, and p = 0.5: n = 1.96² × 0.5 × 0.5 / 0.05² = 384.16, rounded up to 385. Finite-population correction: when the sample exceeds about 5% of the population N, use n_adjusted = n / (1 + (n−1)/N). For small populations this materially reduces the required sample (for N = 2,000, 385 drops to 323); near 50,000 the reduction is only a few respondents. Common targets: ±5% MOE at 95% confidence (n = 385) for consumer surveys; ±3% MOE (n = 1,068) for political polling; ±1% MOE (n = 9,604) for census-grade precision.

At 95% confidence and ±5% MOE, you need 385 respondents to estimate any proportion. This holds for surveys of 1 million or 100 million people: the formula doesn't depend on population size when n << N.
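
As a minimal sketch of the survey formula above (not the calculator's actual source, which is not shown on this page), Cochran's formula with the optional finite-population correction can be written in Python. The function name `cochran_sample_size` and its parameter names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def cochran_sample_size(confidence=0.95, margin=0.05, p=0.5, population=None):
    """Minimum survey sample size for estimating a proportion.

    confidence -- two-sided confidence level (0.95 gives z of about 1.96)
    margin     -- margin of error e, as a fraction (0.05 means ±5%)
    p          -- expected proportion; 0.5 is the worst case
    population -- optional population size N for the finite-population correction
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * p * (1 - p) / margin ** 2           # n = z² × p(1−p) / e²
    if population is not None:
        n = n / (1 + (n - 1) / population)           # n_adjusted = n / (1 + (n−1)/N)
    return math.ceil(n)                              # always round up to a whole respondent

default_n = cochran_sample_size()                    # 95% confidence, ±5% MOE, p = 0.5
small_population_n = cochran_sample_size(population=2000)
```

With the defaults this returns 385, and `population=2000` brings it down to 323. Because the function rounds up only after applying the correction, very large populations can land one respondent either side of the uncorrected figure.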

A/B test sample size — detecting lifts requires huge samples

A/B test sample size depends critically on the size of effect you want to detect. Sample size grows roughly with 1/lift². At α = 0.05 (two-tailed) and 80% power, detecting a 5% relative lift from a 10% baseline (10% → 10.5%) takes roughly 58K users per variant. Detecting a 1% relative lift (10% → 10.1%) takes roughly 1.4M per variant. The implication: small lifts require enormous samples. Most product changes produce small lifts (1-5%), and small platforms simply don't have the daily traffic to detect them. Practical wisdom: only test changes large enough to actually matter (typically a 10%+ relative lift). For smaller-scale platforms (under 5K DAU), plan for weeks of test duration or test bigger-bet changes.
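
The scaling described above can be checked with a short Python sketch of the standard normal-approximation formula for comparing two proportions, n = (z_α + z_β)² × [p₁(1−p₁) + p₂(1−p₂)] / (p₂ − p₁)². The function name `ab_sample_size` is hypothetical, and this unpooled-variance form is an assumption; the calculator itself may use a pooled variance or a continuity correction, which would shift the numbers slightly.

```python
import math
from statistics import NormalDist

def ab_sample_size(baseline, target, alpha=0.05, power=0.80, two_tailed=True):
    """Per-variant sample size to detect baseline -> target with a two-proportion z-test."""
    nd = NormalDist()
    # Critical value for the significance level (split across both tails if two-tailed).
    z_alpha = nd.inv_cdf(1 - alpha / 2) if two_tailed else nd.inv_cdf(1 - alpha)
    # Quantile matching the desired power (about 0.84 for 80%).
    z_beta = nd.inv_cdf(power)
    variance = baseline * (1 - baseline) + target * (1 - target)
    n = (z_alpha + z_beta) ** 2 * variance / (target - baseline) ** 2
    return math.ceil(n)

n_10pct_lift = ab_sample_size(0.10, 0.11)    # 10% relative lift
n_5pct_lift = ab_sample_size(0.10, 0.105)    # 5% relative lift: half the lift
ratio = n_5pct_lift / n_10pct_lift           # close to 4, as the 1/lift² rule predicts
```

For the 5% → 6% example from the how-to steps, `ab_sample_size(0.05, 0.06)` comes out a little over 8,000 users per variant.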

Common ASEAN-specific sample-size pitfalls

Three errors are common in ASEAN business contexts: (1) Cross-country surveys treating the regional sample as one: a 1,000-respondent survey covering SG/MY/PH/ID/TH/VN has effective sub-samples of ~167 per country, which gives only about ±7.6% MOE per country, not the ±3% reported for the pooled sample. For per-country claims, plan ~400 per country (n = 2,400 total). (2) Under-sized A/B tests in early-stage products: teams test on products with 1-2K MAU and declare winners, when at that scale you can typically detect only very large lifts (30%+). Realistic expectation: most such A/B tests fail to reach significance because the product simply doesn't have enough traffic. (3) Online survey panels with poor demographic coverage: large platforms (Cint, Toluna, Lucid in ASEAN) have specific demographic skews. A "nationally representative" sample requires expensive quota sampling, not just a panel sample. Check panel quality before accepting estimates.
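
The per-country arithmetic in pitfall (1) is Cochran's formula solved for the margin of error, e = z × √(p(1−p)/n). A small Python sketch, with the helper name `margin_of_error` chosen purely for illustration, shows the gap between the pooled and per-country figures:

```python
import math
from statistics import NormalDist

def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the confidence interval for a proportion estimated from n respondents."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

total_respondents = 1000
countries = 6
per_country = total_respondents // countries    # 166 respondents each

pooled_moe = margin_of_error(total_respondents) # MOE for the regional headline number
country_moe = margin_of_error(per_country)      # MOE for any single-country claim
target_moe = margin_of_error(400)               # MOE at the suggested ~400 per country

print(f"pooled: ±{pooled_moe:.1%}, per country: ±{country_moe:.1%}, at n=400: ±{target_moe:.1%}")
```

The pooled figure is near ±3%, each country alone is near ±7.6%, and 400 per country brings that back to just under ±5%.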

10 Things to Know About Sample Size

01

±5% MOE at 95% confidence: you only need 385 respondents, regardless of whether the population is 1M or 100M.

02

A/B test sample size ∝ 1/lift². Detecting half the lift requires 4× the sample.

03

Most product A/B tests are underpowered. Realistic effects (1-5% lift) need 100K+ users per variant.

04

80% power is conventional minimum. Below that, you frequently miss real effects.

05

Use p = 0.5 in survey sample-size formula for worst-case (maximum sample needed) when no prior estimate exists.

06

Finite-population correction matters when sampling > 5% of population. Reduces required sample.

07

Cross-country ASEAN surveys often under-sample per country: n = 1,000 across 6 countries gives only about ±7.6% MOE per country.

08

One-tailed tests need ~20% smaller sample than two-tailed, but require pre-registered direction.

09

Multi-arm A/B/C/D tests need full per-arm sample for each arm. Total = per-arm × n_arms.

10

Always add 10-20% buffer for non-response (surveys) or unexpected dropoff (A/B tests).

Frequently Asked Questions

  • Sample size scales with desired precision, not population size, when n << N. The standard error of the sampling distribution is SD / √n, which depends only on the sample size n, not the population size N. Counterintuitive but mathematically correct. Finite-population correction only kicks in when you're sampling a meaningful fraction of the population (>5%).

  • Use p = 0.5. This gives the maximum sample size needed (the variance p(1−p) is maximised at p = 0.5). If your true p is 0.1 or 0.9, you'd actually need fewer respondents, but sizing for the worst case ensures you have enough regardless. Sample-size estimates done before you have data should default to p = 0.5.

  • Because typical effect sizes are small. Most product changes produce 1-5% relative lifts, requiring tens to hundreds of thousands of users per variant. This is why large platforms (Grab, Shopee, Sea) can A/B test effectively while small startups (~1K DAU) often can't: the math demands far more users than the product has. Practical answer: small-scale products should test bigger-bet changes (substantial UX redesigns, new categories) where effects are large enough to detect with the available users.

  • Statistical power = 1 − probability of Type II error (failing to detect a real effect). 80% power means: if the true effect IS what you're sizing for, you'll detect it 80% of the time and miss it 20% of the time. 80% is the conventional minimum (Cohen 1988). 90% is stricter; below 80% is considered underpowered. Higher power requires larger samples: at α = 0.05 two-tailed, 90% power needs about a third more sample than 80% power. Most product teams default to 80%.

  • Two-tailed is the safe default. One-tailed is more powerful (smaller sample for the same effect) BUT requires pre-registering the direction of the effect before seeing data. If you set up the test as "test will improve conversion" and conversion actually decreases significantly, a one-tailed test would NOT detect this; it can only "see" improvements. Two-tailed catches both directions. Use one-tailed only when you genuinely care about one direction only (regulatory non-inferiority testing, etc.).

  • Plan per-arm sample size using the calculator (for the smallest difference you want to detect between any two arms). Then total sample = per-arm × number of arms. Add a multiple-comparison correction: with k arms, you have m = k(k−1)/2 pairwise comparisons. Adjust α (Bonferroni: α_adjusted = α / m). A smaller α means higher z-scores in the formula and therefore a larger per-arm sample. Practical rule at 80% power: a 4-arm test (A/B/C/D, 6 comparisons) needs ≈ 1.5× the per-arm sample of a 2-arm test; a 6-arm test (15 comparisons) needs ≈ 1.8×.

  • This tool sizes for binary outcomes (proportions, conversion rates). For continuous outcomes (revenue per user, time on page, etc.) use t-test power analysis with Cohen's d effect size. Rough rule: detecting d = 0.2 (small effect) needs ~394 per group at 80% power; d = 0.5 (medium) needs ~64; d = 0.8 (large) needs ~26. Most A/B tests on revenue have small effect sizes, so they need large samples. Use specialised software (R pwr package, G*Power, Python statsmodels) for precise continuous sample sizing.

  • Don't. Repeatedly checking results before reaching the calculated sample size dramatically inflates false-positive rates. If you peek 10 times during a test and stop at the first significant result, your effective α can approach 20% even though each "check" used α = 0.05, and it keeps rising with more peeks. Solutions: (1) Commit to the calculated sample size and don't stop early; (2) Use sequential testing methods (group sequential designs, Bayesian methods) designed to handle peeking, though these need expert setup. Most A/B testing tools (Optimizely, VWO, etc.) include sequential testing options.

  • No. All calculations run in your browser via JavaScript. Open DevTools → Network and confirm zero outbound requests. Parameters stay on your device. Safe for confidential product planning.

  • Pair with: T-Test (RT-CNV-090) for continuous A/B; Chi-Square (RT-CNV-091) for testing conversion-rate differences post-test; ANOVA (RT-CNV-092) for multi-arm continuous; Confidence Interval (RT-CNV-083) for interval estimation. External: G*Power (free, gold-standard sample-size tool); R pwr package; Python statsmodels; Optimizely + VWO built-in sample-size calculators for A/B testing.
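
Several rules of thumb in the answers above (the cost of 90% power, the saving from a one-tailed test, the Bonferroni multipliers for multi-arm tests) all come from one term in the sample-size formula: (z_α + z_β)². The Python sketch below compares that term across settings. The helper names `size_factor` and `per_group_for_d` are illustrative assumptions. The Cohen's d helper uses the normal approximation n = 2(z_α + z_β)²/d², so it lands a respondent or so below the t-test figures quoted above.

```python
import math
from statistics import NormalDist

_nd = NormalDist()

def size_factor(alpha=0.05, power=0.80, two_tailed=True, comparisons=1):
    """(z_alpha + z_beta)², with alpha split Bonferroni-style across `comparisons` tests."""
    adjusted_alpha = alpha / comparisons
    tail = adjusted_alpha / 2 if two_tailed else adjusted_alpha
    z_alpha = _nd.inv_cdf(1 - tail)
    z_beta = _nd.inv_cdf(power)
    return (z_alpha + z_beta) ** 2

baseline = size_factor()                                   # α = 0.05, two-tailed, 80% power

power_90_ratio = size_factor(power=0.90) / baseline        # extra sample for 90% power
one_tailed_ratio = size_factor(two_tailed=False) / baseline  # saving from a one-tailed test
four_arm_ratio = size_factor(comparisons=6) / baseline     # 4 arms: 4 × 3 / 2 = 6 pairs
six_arm_ratio = size_factor(comparisons=15) / baseline     # 6 arms: 6 × 5 / 2 = 15 pairs

def per_group_for_d(d, alpha=0.05, power=0.80):
    """Normal-approximation per-group n for a two-sample comparison with Cohen's d."""
    return math.ceil(2 * size_factor(alpha, power) / d ** 2)

for label, value in [("90% vs 80% power", power_90_ratio),
                     ("one- vs two-tailed", one_tailed_ratio),
                     ("4-arm vs 2-arm", four_arm_ratio),
                     ("6-arm vs 2-arm", six_arm_ratio)]:
    print(f"{label}: {value:.2f}x")
```

These ratios apply to the per-arm sample size; the total for a multi-arm test is still the adjusted per-arm figure times the number of arms.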
