Compute confidence intervals for a sample mean (z and t distributions) and for proportions (normal approximation and Wilson score) at the 80%, 90%, 95%, 99%, and 99.9% confidence levels.


Confidence Interval Calculator

📈 Confidence Interval for the Mean (continuous data)

Inputs: sample mean, sample SD, and sample size (n ≥ 2). Outputs: confidence interval for μ, margin of error, standard error, and critical value.

📊 Confidence Interval for a Proportion (binary outcome data)

Inputs: successes (≤ trials) and trials (≥ 1). Outputs: Wilson score CI (preferred), normal-approximation CI (low and high), and margin of error.

How to use the Confidence Interval Calculator

Pick the right mode: mean or proportion

Mean CI: for continuous data — test scores, response times, prices, measurements, returns. You need three numbers: sample mean (x̄), sample standard deviation (s), and sample size (n). Proportion CI: for binary outcomes — yes/no, pass/fail, click/no-click, vote share. You need two numbers: successes (how many "yes" outcomes) and trials (total count). Most A/B test results, election polls, and survey proportions use this mode.

Choose your confidence level

The default 95% is the conventional standard in scientific publishing and political polling. 90% is sometimes used in exploratory analysis. 99% is for high-stakes decisions (medical, regulatory). The trade-off: higher confidence = wider interval. A 99% CI is about 1.3× as wide as a 95% CI for the same data (critical values 2.576 vs 1.960). Reporting both the interval AND the confidence level is essential: "the mean is between 75 and 85" is meaningless without saying at what confidence.
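To make the trade-off concrete, here is a minimal sketch that prints the two-sided normal critical value for each confidence level the calculator offers, along with the interval width relative to 95%. The helper name `z_critical` is an illustrative choice, not something taken from the calculator's own code.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

levels = [0.80, 0.90, 0.95, 0.99, 0.999]
z95 = z_critical(0.95)
for level in levels:
    z = z_critical(level)
    # For the same data, interval width is proportional to the critical value.
    print(f"{level:.1%}: z = {z:.3f}, width relative to 95% = {z / z95:.2f}x")
```

The ratios apply to z-intervals; with the t-distribution and a small sample the higher confidence levels cost somewhat more width.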

For means: use t-distribution if n < 30

The "Use t-distribution" checkbox is on by default because it is the safer choice. The t-distribution accounts for the extra uncertainty of estimating the population SD from a sample. For small samples (n < 30), t-intervals are noticeably wider than z-intervals, reflecting that additional uncertainty. As n grows the gap shrinks: at n = 30 the 95% t critical value (2.045) is about 4% larger than z (1.960), and at n = 100 it is about 1% larger. The only reason to UNCHECK is if you genuinely know the population SD σ (rare; most real-world analyses estimate σ from the sample).
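The sketch below shows how a mean CI is assembled from x̄, s, n, and a critical value, and how the t and z versions compare as n grows. It is an illustration under stated assumptions, not the calculator's implementation: the t critical values are a handful of entries copied from a standard 95% t table (a full solution would compute them from the t-distribution), and the sample mean of 80 and SD of 10 are made-up inputs.

```python
import math
from statistics import NormalDist

# Two-sided 95% t critical values from a standard t table, keyed by
# degrees of freedom (n - 1). Only a few illustrative entries are listed.
T_CRIT_95 = {4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}

def mean_ci(mean: float, sd: float, n: int, crit: float) -> tuple[float, float]:
    """Return (low, high) = mean +/- crit * sd / sqrt(n)."""
    margin = crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

z95 = NormalDist().inv_cdf(0.975)  # about 1.960

for df, t95 in T_CRIT_95.items():
    n = df + 1
    t_low, t_high = mean_ci(80.0, 10.0, n, t95)  # hypothetical mean and SD
    z_low, z_high = mean_ci(80.0, 10.0, n, z95)
    print(f"n={n:3d}  t: [{t_low:.2f}, {t_high:.2f}]  z: [{z_low:.2f}, {z_high:.2f}]"
          f"  t/z width ratio: {t95 / z95:.3f}")
```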

For proportions: prefer Wilson score over normal approximation

The proportion CI has two formulas. The normal approximation p̂ ± z×√(p̂q̂/n), where q̂ = 1 − p̂, is the textbook formula but performs poorly when p̂ is close to 0 or 1, or when n is small (under 30). The Wilson score interval is more accurate across these conditions and is the modern recommendation (Brown, Cai, and DasGupta 2001). Use Wilson values; normal approximation is shown for comparison and historical context. They agree closely for large samples and moderate proportions.
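Both formulas fit in a few lines. The sketch below uses the standard textbook expressions for the two intervals; the function names are illustrative, and the clamp to [0, 1] in the Wilson version is there only to absorb floating-point rounding.

```python
import math
from statistics import NormalDist

def normal_ci(successes: int, trials: int, confidence: float = 0.95):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    margin = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - margin, p_hat + margin

def wilson_ci(successes: int, trials: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    denom = 1 + z * z / trials
    centre = (p_hat + z * z / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    )
    # Mathematically the bounds lie in [0, 1]; clamp to absorb rounding error.
    return max(0.0, centre - half), min(1.0, centre + half)

# Extreme proportion: 1 success in 100 trials.
print("normal:", normal_ci(1, 100))
print("Wilson:", wilson_ci(1, 100))
```

With 1 success in 100 trials the normal interval dips below zero, while the Wilson interval stays inside [0, 1].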


Confidence intervals — what they say and what they don't

A confidence interval is a range of values that's likely to contain the true population parameter (mean, proportion, etc.) you're trying to estimate. The "95% confidence" doesn't mean there's a 95% probability the parameter is in this specific interval (a common misinterpretation) — it means that if you repeated your sampling procedure many times, ~95% of the resulting intervals would contain the true parameter. The distinction is subtle but important: the parameter is fixed (it's not random), while the interval moves around with each sample. The strict interpretation is hard to internalise; the practical use is simpler: a 95% CI gives you a range you can reasonably trust as plausible values for the parameter.
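The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under simplifying assumptions: the data are drawn from a normal population whose mean (50) and SD (10) are invented for the example, and σ is treated as known so a z-interval applies exactly.

```python
import math
import random
from statistics import NormalDist, fmean

def simulate_coverage(true_mean=50.0, sigma=10.0, n=25,
                      repeats=2000, confidence=0.95, seed=1):
    """Fraction of repeated-sample z-intervals that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # sigma is treated as known here
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # The parameter is fixed; the interval moves with each sample.
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / repeats

print(f"Share of intervals containing the true mean: {simulate_coverage():.3f}")
```

The printed share should land close to 0.95, with some simulation noise. Any single interval either contains the true mean or it does not.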

The margin of error and sample size relationship

The width of a confidence interval is governed by three factors: the variability in the data (SD or proportion magnitude), the sample size, and the chosen confidence level. The most important relationship: margin of error scales with 1/√n. To halve the margin of error, you need 4× the sample size. To quarter it, you need 16×. This is why political polls report ±3% with n≈1000 (1/√1000 ≈ 0.032, and z×SE for proportions is roughly 1.96 × 1/√n × 0.5 ≈ 0.031 in the worst case). Bigger samples give tighter intervals but with rapidly diminishing returns — going from 1000 to 4000 only halves the margin, at 4× the cost. This trade-off underlies sample size planning across every empirical discipline.

Margin of error scales with 1/√n. To halve the margin, quadruple the sample. To get ±1% margins, you need roughly 9,600 respondents, which is why high-precision polling is expensive.
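The 1/√n scaling is easy to see numerically. This short sketch computes the worst-case (p = 0.5) margin of error for a proportion at several sample sizes, each 4× the previous one. The function name is an illustrative choice.

```python
import math
from statistics import NormalDist

def worst_case_margin(n: int, confidence: float = 0.95) -> float:
    """Margin of error for a proportion at p = 0.5, the widest case."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(0.25 / n)

# Each step multiplies n by 4, so each margin is half the previous one.
for n in (250, 1000, 4000, 16000):
    print(f"n = {n:6d}: margin = +/-{worst_case_margin(n):.2%}")
```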

Why Wilson score beats normal approximation for proportions

The textbook proportion CI uses the normal approximation: p̂ ± z×√(p̂(1−p̂)/n). It's simple to compute but performs poorly in two situations: (1) when p̂ is close to 0 or 1 (e.g. 1 success in 100 trials, where the 95% normal interval runs from about −0.0095 to 0.0295), the normal approximation can produce intervals that include negative values or values above 1, which are impossible for proportions. (2) For small samples (n < 30) or when the success count is small (< 5), the actual coverage probability of the normal CI drops below the nominal level (95% nominal might be 89% actual coverage). The Wilson score interval (Wilson 1927) handles both cases naturally: it's constrained to [0, 1] and has better coverage across these conditions. Brown, Cai, and DasGupta's 2001 paper "Interval Estimation for a Binomial Proportion" made Wilson the modern default; the calculator follows that recommendation.
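Coverage probability can be computed exactly for a given n and true proportion by summing the binomial probabilities of every outcome whose interval contains the true value. The sketch below does that for both intervals; n = 20 and p = 0.1 are illustrative choices, not values from the calculator.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)

def normal_interval(x, n, z=Z95):
    p = x / n
    m = z * math.sqrt(p * (1 - p) / n)
    return p - m, p + m

def wilson_interval(x, n, z=Z95):
    p = x / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

def exact_coverage(interval, n, p_true):
    """Sum the binomial probabilities of every outcome x whose interval covers p_true."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= p_true <= high:
            total += math.comb(n, x) * p_true**x * (1 - p_true)**(n - x)
    return total

print(f"normal: {exact_coverage(normal_interval, 20, 0.1):.3f}")
print(f"Wilson: {exact_coverage(wilson_interval, 20, 0.1):.3f}")
```

For this particular case the normal interval covers the true proportion roughly 88% of the time, against roughly 96% for Wilson. Much of the shortfall comes from x = 0, where the normal interval collapses to the single point 0.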

The ASEAN polling + A/B test angle

Confidence intervals show up everywhere in ASEAN quantitative work. Election polling: Singapore (Mediacorp polls before GE), Malaysia (Merdeka Center pre-election polls), Indonesia (Indikator Politik), Philippines (Pulse Asia, SWS), Thailand (NIDA Poll), Vietnam (none publicly — restricted by state). Standard methodology: 1500-3000 respondents, ±2-3% margin of error at 95% confidence. Tech-company A/B tests: Grab, Shopee, Lazada, GoTo, Tokopedia, Sea Group all use confidence intervals around conversion rate differences to determine which test variants ship. Typical thresholds: 95% CI excluding zero = "significant", ship; 95% CI straddling zero = inconclusive, gather more data. Clinical trials: SingHealth + NUH + Mahidol + UI medical schools all use 95% CI for primary endpoints in published trials. Customer satisfaction (NPS, CSAT): small samples often produce wide CIs, leading product teams astray — Wilson score intervals are essential here. The math in this calculator handles all of these correctly; the interpretive nuance is what separates a thoughtful analyst from a naive one.

10 Things to Know About Confidence Intervals

01

95% confidence interval doesn't mean "95% chance the parameter is in this interval" — it means 95% of CIs from repeated samples would contain it.

02

Margin of error scales with 1/√n. To halve the margin, quadruple the sample size. Big polls cost a lot to be slightly more precise.

03

For sample mean: use the t-distribution when n < 30 or σ unknown. Use z only when σ is genuinely known.

04

For proportions: the Wilson score interval beats the normal approximation, especially for small samples or extreme p̂.

05

The most common CI level in scientific publishing: 95%. In medical literature: increasingly 99%. Exploratory: sometimes 90%.

06

Election polls typically report ±3% margin at 95% CI from n≈1000. Going to ±1% margin requires n≈9,600.

07

A CI excluding zero (for a difference) is equivalent to a statistically significant result at the corresponding p-value cutoff.

08

Width of 99% CI ≈ 1.3× width of 95% CI for the same data. A 99.9% CI is ≈1.7× as wide as a 95% CI. Higher confidence costs precision.

09

Bayesian credible intervals are conceptually different — they DO express probability about the parameter, requiring a prior distribution.

10

For very small samples (n < 5) or extreme proportions, use exact methods (Clopper-Pearson) instead of normal or Wilson approximation.
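For reference, a Clopper-Pearson interval can be sketched with the standard library alone by inverting the binomial tail probabilities with bisection. This illustrates the usual definition and is not part of the calculator, which reports only the Wilson and normal intervals. The helper names are hypothetical.

```python
import math

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Find p in [lo, hi] with f(p) = 0, assuming f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x: int, n: int, confidence: float = 0.95):
    alpha = 1 - confidence
    # Lower bound: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (alpha / 2) - (1 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

print(clopper_pearson(0, 10))  # zero successes in ten trials
print(clopper_pearson(2, 4))   # a very small sample
```

Exact intervals guarantee at least the nominal coverage, at the price of being wider than Wilson intervals.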

Frequently Asked Questions

  • Technically, "95% confidence" means that if you repeated your sampling procedure many times and computed a 95% CI each time, approximately 95% of those intervals would contain the true population parameter. The interval moves; the parameter is fixed. It does NOT mean "there's a 95% probability the parameter is in THIS specific interval"; that is a Bayesian interpretation requiring a prior. In practice, the strict interpretation is hard to internalise and most people use the looser interpretation. The looser interpretation is approximately correct for non-pathological priors, but the strict interpretation is what frequentist statistics actually claims.

  • For proportions at 95% CI: n ≈ (1.96/margin)² × p̂(1-p̂). Worst case is p̂ = 0.5, which maximises p̂(1-p̂) at 0.25. So: ±5% margin = n ≈ 384. ±3% = n ≈ 1067. ±2% = n ≈ 2401. ±1% = n ≈ 9604. For means at 95% CI: n ≈ (1.96 × SD / margin)². The 1/√n scaling means halving the margin quadruples the required n. Most election polls use n ≈ 1000-2000 for ±2-3% — a sweet spot of precision vs cost.
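The two sample-size formulas above translate directly into code. This is a minimal sketch with illustrative function names; it rounds up to the next whole respondent and uses the unrounded critical value (about 1.95996), so some results are one higher than the rounded figures quoted above (385 instead of 384, for example).

```python
import math
from statistics import NormalDist

def n_for_proportion(margin: float, p: float = 0.5, confidence: float = 0.95) -> int:
    """Sample size so the normal-approximation margin is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z / margin) ** 2 * p * (1 - p))

def n_for_mean(margin: float, sd: float, confidence: float = 0.95) -> int:
    """Sample size for a mean, given a planning value for the SD."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / margin) ** 2)

for margin in (0.05, 0.03, 0.02, 0.01):
    print(f"+/-{margin:.0%} margin: n = {n_for_proportion(margin)}")
print("mean within +/-2 when SD = 10: n =", n_for_mean(2.0, 10.0))
```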

  • Use t when you ESTIMATE the population SD from your sample (the usual case). Use z only when you KNOW the population SD exactly (rare). For small n, the t critical value is noticeably larger and the interval wider (reflecting extra uncertainty); the gap narrows as n grows. At n = 30 the 95% t critical value is about 4% larger than z, and at n = 100 it is about 1.2% larger (1.984 vs 1.960), which is negligible for most purposes. The "Use t-distribution" checkbox defaults to ON because it's the safer choice. Only use z if you have a textbook problem that explicitly says "assume σ = 10" or similar.

  • The normal approximation p̂ ± z×√(p̂(1−p̂)/n) is mathematically simple but performs poorly when p̂ is near 0 or 1, or when n is small. The Wilson score interval (introduced by Wilson in 1927, popularised after 2001) uses a different derivation that constrains the interval to [0, 1] and has better coverage probability. For large samples (n > 100) and moderate proportions (0.2 < p̂ < 0.8), the two intervals are virtually identical. For small samples or extreme proportions, Wilson is significantly more accurate. Modern references generally recommend Wilson, although defaults vary between statistical packages; older textbooks still teach the normal interval.

  • Two-sided intervals (the default) are by far the more common choice. Two-sided gives both upper and lower bounds, treating "too high" and "too low" symmetrically. One-sided gives only one bound and is useful in specific contexts (e.g. "minimum effective dose" trials only care about the lower bound; "maximum tolerable dose" trials only care about the upper bound). A one-sided 95% bound uses the same critical value (z ≈ 1.645) as a two-sided 90% CI, so it coincides with the corresponding end of that interval. Use two-sided unless you have a strong directional hypothesis stated before collecting data.

  • Confidence intervals and p-values are directly related: a 95% CI excludes value X if and only if the two-tailed p-value for testing "parameter = X" (computed with the same method) is less than 0.05. A 99% CI excluding X = p < 0.01. So a CI answers the same significance question as a p-value, but it is arguably more useful because it also shows the estimate, the direction, and the precision. The American Statistical Association's 2016 statement on p-values notes that some statisticians prefer to supplement or replace p-values with approaches that emphasise estimation, such as confidence intervals. Many journal style guides now ask for CIs alongside p-values.
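The correspondence can be checked with a z-test and the matching z-interval. In this sketch the estimate (0.5) and its standard error (0.2) are invented numbers, and the function name is illustrative.

```python
from statistics import NormalDist

def z_test_and_ci(estimate: float, se: float, null_value: float = 0.0,
                  confidence: float = 0.95):
    """Two-sided z-test p-value and the matching z-interval."""
    std_normal = NormalDist()
    z_stat = (estimate - null_value) / se
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    return p_value, ci

for level in (0.95, 0.99):
    p_value, (low, high) = z_test_and_ci(0.5, 0.2, confidence=level)
    excludes_null = not (low <= 0.0 <= high)
    # The two booleans on each line should always agree.
    print(f"{level:.0%} CI [{low:.3f}, {high:.3f}]  excludes 0: {excludes_null}"
          f"  p < {1 - level:.2f}: {p_value < 1 - level}")
```

Here the p-value is about 0.012, so the 95% interval excludes zero while the 99% interval does not.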

  • Z and t intervals assume the sampling distribution of the mean is approximately normal. The Central Limit Theorem says the sample mean becomes approximately normal as n grows, even for non-normal data; n > 30 is a common rule of thumb rather than a guarantee. For very heavily skewed data (income, response times), n > 100 might be needed. Below that, use: (1) bootstrap CI (sample with replacement, compute CI of the bootstrap distribution), (2) log-transform the data first, (3) non-parametric methods (Wilcoxon signed-rank CI). For most practical use cases with n > 30 and moderate skew, the z/t CI is accurate enough.
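Option (1) can be sketched in a few lines. This is the simple percentile bootstrap, one of several bootstrap variants; the response-time data are hypothetical and the function name is illustrative.

```python
import random
from statistics import fmean

def bootstrap_mean_ci(data, confidence=0.95, resamples=5000, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(
        fmean(rng.choices(data, k=n))  # resample with replacement
        for _ in range(resamples)
    )
    alpha = 1 - confidence
    low_idx = round((alpha / 2) * resamples)
    high_idx = round((1 - alpha / 2) * resamples) - 1
    return means[low_idx], means[high_idx]

# Hypothetical right-skewed response times in milliseconds.
response_times = [120, 135, 140, 150, 155, 160, 180, 210, 260, 340, 520, 900]
low, high = bootstrap_mean_ci(response_times)
print(f"sample mean = {fmean(response_times):.1f}")
print(f"95% bootstrap CI = [{low:.1f}, {high:.1f}]")
```

With skewed data the bootstrap interval is typically asymmetric around the sample mean, which a t-interval cannot be.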

  • A frequentist confidence interval is a range computed from data that has a coverage property in repeated sampling (95% of intervals contain the truth, on average). A Bayesian credible interval IS a probability statement about the parameter: "given the data + prior, there's a 95% probability the parameter is in this interval." Credible intervals require specifying a prior distribution; CIs don't. For non-informative priors and large samples, the two intervals agree numerically but mean different things. Bayesian intervals are more intuitive but more controversial (the prior is subjective). This calculator computes frequentist CIs only.
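For contrast, here is a rough sketch of a credible interval for a proportion. It assumes a uniform Beta(1, 1) prior, which is one common non-informative choice among several, and approximates the posterior quantiles by Monte Carlo sampling instead of an exact quantile function. The function name and the 30-out-of-100 example are illustrative; the calculator itself does not compute this.

```python
import random

def beta_credible_interval(successes, trials, confidence=0.95,
                           prior_a=1.0, prior_b=1.0, draws=20000, seed=0):
    """Equal-tailed credible interval from draws of the Beta posterior."""
    rng = random.Random(seed)
    # Beta prior + binomial data gives a Beta posterior (conjugacy).
    a = prior_a + successes
    b = prior_b + trials - successes
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    alpha = 1 - confidence
    low = samples[round((alpha / 2) * draws)]
    high = samples[round((1 - alpha / 2) * draws) - 1]
    return low, high

low, high = beta_credible_interval(30, 100)
print(f"95% credible interval for 30/100: [{low:.3f}, {high:.3f}]")
```

For a sample of this size the result is numerically close to the Wilson interval, even though the two carry different interpretations.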

  • No data is uploaded. All calculations run entirely in your browser via JavaScript. There's no server roundtrip: open DevTools → Network to confirm there are no outbound requests. Your sample data stays on your device. Safe for clinical trial CIs, proprietary survey results, A/B test conversion rates, or any inferential statistics work that shouldn't leave your machine.

  • Standard error of the mean = SD / √n. Standard error of a proportion = √(p̂(1−p̂)/n). Both scale with 1/√n: bigger n = smaller SE = tighter CI. The relationship is non-linear: doubling n only reduces SE by ~30% (√2 ≈ 1.41); quadrupling n cuts SE in half. This is why political polling sits at n=1000-2000 (±2-3% margin); going larger gives diminishing returns. Census-level precision (±0.1%) requires n of roughly 960,000 in the worst case, which is why government statistics from full census data are so much more precise than survey data.
