Z-Score Calculator

By RECATOOLS Editorial · Updated 11 Jun 2026

About this tool

Standard score + percentile + p-value

Convert any value to its z-score (standard score) given the mean and standard deviation. Outputs percentile, one-tailed + two-tailed p-values, and significance verdict.

RT-CNV-082 · Converters & Units

Inputs: raw value (x), population mean (μ), standard deviation (σ > 0)

Outputs: z-score (standard score), percentile, one-tailed p-value, two-tailed p-value, significance

z = (x − μ) / σ is the number of standard deviations from the mean

How to use the Z-Score Calculator

Enter your three values: x, μ, σ

x = the raw value you want to standardise (e.g. a test score of 85, a height of 175 cm, a stock return of 8%). μ (mu) = the population mean (e.g. average test score 70, average height 170 cm, average return 5%). σ (sigma) = the population standard deviation (e.g. SD of test scores = 10, SD of heights = 7 cm, SD of returns = 15%). The calculator instantly computes z = (x − μ) / σ.
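As a minimal sketch of the formula above (the function name `z_score` is just an illustrative choice, not the calculator's actual code), the three examples from the text work out like this:

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return the number of standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# The three examples from the text above
test_z = z_score(85, 70, 10)      # test score
height_z = z_score(175, 170, 7)   # height in cm
return_z = z_score(8, 5, 15)      # stock return in percent
print(test_z, round(height_z, 3), round(return_z, 3))
```

The test score sits 1.5 SD above the mean, the height about 0.71 SD above, and the stock return 0.2 SD above.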

Interpret the z-score

The z-score is the number of standard deviations the value is away from the mean. z = 0 means right at the mean. z = +1 means one SD above the mean. z = −2 means two SDs below the mean. The 68-95-99.7 rule: ~68% of normal data has |z| ≤ 1, ~95% has |z| ≤ 2, ~99.7% has |z| ≤ 3. So a z-score of +2.5 is in roughly the top 0.6% of normally distributed data; z = +3 is in roughly the top 0.13%.
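The 68-95-99.7 rule can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the identity P(|Z| ≤ k) = erf(k / √2); the helper name `fraction_within` is hypothetical.

```python
import math

def fraction_within(k: float) -> float:
    """P(|Z| <= k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"|z| <= {k}: {fraction_within(k):.2%}")

# Share of values above z = 2.5 and above z = 3 (one tail only)
upper_tail_2_5 = (1 - fraction_within(2.5)) / 2
upper_tail_3 = (1 - fraction_within(3)) / 2
print(f"above 2.5: {upper_tail_2_5:.2%}, above 3: {upper_tail_3:.2%}")
```

The three coverage figures come out close to 68.3%, 95.4% and 99.7%, which is where the rule's rounded numbers come from.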

Read the percentile

Percentile = the percentage of values that fall BELOW your data point in a normal distribution. z = 0 → 50th percentile (median). z = +1 → 84th percentile. z = +2 → 97.7th percentile. z = −1 → 16th percentile. Useful for things like: "this child's height is in the 75th percentile" or "this test score is in the top 5%". The percentile assumes the underlying distribution is approximately normal — if your data is heavily skewed, percentiles from z-scores will be inaccurate.
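A short sketch of the z-to-percentile step, assuming normal data and using `statistics.NormalDist` from the Python standard library (3.8+). The function name `percentile_from_z` is an illustrative assumption.

```python
from statistics import NormalDist

def percentile_from_z(z: float) -> float:
    """Percentage of a normal population expected to fall below z."""
    return 100 * NormalDist().cdf(z)

for z in (-1, 0, 1, 2):
    print(f"z = {z:+d}: {percentile_from_z(z):.1f}th percentile")
```

This reproduces the values quoted above: about 16, 50, 84 and 97.7.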

Use p-values for hypothesis testing

The p-value is the probability of observing a value at least this extreme under the null hypothesis. Two-tailed p < 0.05: statistically significant at the 5% level (a common cutoff). p < 0.01: very significant. p < 0.001: highly significant. The two-tailed p-value tests whether the value is significantly DIFFERENT from the mean (either direction); one-tailed tests whether it's significantly GREATER (or LESS) than the mean. Use two-tailed unless you have a strong directional hypothesis stated before looking at the data.
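The p-value step can be sketched as follows. The verdict labels mirror the cutoffs quoted above (0.05, 0.01, 0.001); the calculator's own wording is not documented here, so both `p_values` and `verdict` are hypothetical names and labels.

```python
from statistics import NormalDist

def p_values(z: float) -> tuple[float, float]:
    """Return (one_tailed, two_tailed) p-values for a z-score.

    The one-tailed value is the tail area in the direction of z.
    """
    one_tailed = NormalDist().cdf(-abs(z))
    return one_tailed, 2 * one_tailed

def verdict(p: float) -> str:
    """Label p using the cutoffs quoted in the text."""
    if p < 0.001:
        return "highly significant"
    if p < 0.01:
        return "very significant"
    if p < 0.05:
        return "significant"
    return "not significant"

one, two = p_values(2.0)
print(f"one-tailed {one:.4f}, two-tailed {two:.4f}, {verdict(two)}")
```

For z = 2 the two-tailed p-value is a little under 0.05, so it only just clears the conventional threshold.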

Z-scores — the universal currency of statistics

The z-score (also called the standard score) is one of the most useful single concepts in statistics. It converts any value from any normally distributed dataset onto a universal scale: "how many standard deviations from the mean?" This standardisation lets you compare apples and oranges: a student at the 90th percentile on the SAT can be compared directly with an applicant at the 90th percentile on a national fitness test, even though the underlying scores are on completely different scales. The math is simple: z = (x − μ) / σ. The interpretation is universal: positive z = above the mean, negative z = below the mean, magnitude = how far in units of standard deviation.

From z-score to percentile and back

The standard normal distribution (mean 0, SD 1) has a well-known cumulative distribution function (CDF) Φ(z) that maps any z-score to its corresponding percentile. Key values: Φ(0) = 0.50 (50th percentile, the median). Φ(1) ≈ 0.84 (84th percentile). Φ(2) ≈ 0.977 (97.7th percentile). Φ(3) ≈ 0.9987 (99.87th percentile). The inverse direction is equally useful. The 95th percentile sits at z ≈ 1.645, which is the cutoff for a one-tailed test at the 5% level. The central 95% of the distribution lies within z ≈ ±1.96 (Φ(1.96) ≈ 0.975), which is the cutoff for a two-tailed test at the 5% level. These two values, 1.96 and 1.645, are the most-cited z-scores in statistical practice. Memorising them pays off across every quantitative discipline.
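The critical values quoted above can be recovered from the inverse CDF. This is a minimal sketch using `NormalDist.inv_cdf` from the Python standard library; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """z cutoff that leaves a total tail area of alpha."""
    tail = alpha / 2 if two_tailed else alpha
    return std_normal.inv_cdf(1 - tail)

print(round(critical_z(0.05, two_tailed=False), 3))  # one-tailed 5% cutoff
print(round(critical_z(0.05), 3))                    # two-tailed 5% cutoff
```

The two-tailed case splits alpha across both tails, which is why its cutoff is further from zero than the one-tailed cutoff at the same significance level.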

P-values: useful but often misunderstood

The p-value is the probability of observing data at least this extreme IF the null hypothesis is true. A common misinterpretation is that p < 0.05 means "5% chance of being wrong" — this is incorrect. P-values say nothing about the probability of the null hypothesis being true; they say only how surprising the observed data would be assuming H₀. Sound usage: a small p-value is evidence AGAINST the null hypothesis but doesn't quantify how much. The American Statistical Association issued a 2016 statement explicitly warning against the mechanical use of p < 0.05 as a decision rule. Effect sizes (standardised mean differences like Cohen's d) and confidence intervals are now considered more informative for practical interpretation. But the z-score → p-value path remains the foundation of frequentist statistical testing.

The ASEAN data-science + statistics angle

Statistical literacy across ASEAN has surged with the rise of data-science education. Z-scores show up routinely in: A/B testing for tech companies (Grab, Shopee, Lazada, GoTo, Tokopedia all run thousands of A/B tests monthly; z-tests determine which variants ship); medical research (clinical trials at NUS, NUHS, SingHealth, KKH publish in Lancet / NEJM using standard frequentist inference); fintech credit scoring (Sea Group, GXS Bank, OCBC, DBS use z-scores in risk models); educational psychometrics (SAT-equivalent national exams across ASEAN report z-scores or percentiles); quality control in manufacturing (Singapore's electronics + biopharma SPC dashboards). For the average APAC professional moving into a data role, mastering z-scores + standard normal distribution is the first chapter of any inferential statistics course. This calculator handles the math; understanding when and how to apply z-tests vs t-tests vs other tests is the lifelong skill that builds on top of it.

10 Things to Know About Z-Scores

Z-score formula: z = (x − μ) / σ. Number of standard deviations the value x is from the mean μ.

The standard normal distribution has mean = 0 and SD = 1. Z-scores convert any normally-distributed data onto this universal scale.

Z = 1.96 (two-tailed) and Z = 1.645 (one-tailed) correspond to the 5% significance threshold (p < 0.05).

The 68-95-99.7 rule: ~68% of normal data has |z| ≤ 1; ~95% has |z| ≤ 2; ~99.7% has |z| ≤ 3.

The standard normal CDF Φ(z) maps z-scores to percentiles. Φ(0) = 0.50, Φ(1) ≈ 0.84, Φ(2) ≈ 0.977, Φ(3) ≈ 0.9987.

Z-scores assume the underlying distribution is roughly normal (bell-shaped). For heavily skewed data, percentiles from z-scores are inaccurate.

The p-value is the probability of data at least this extreme under the null hypothesis. NOT the probability the null hypothesis is true.

Use a two-tailed test by default; one-tailed only when you have a directional hypothesis stated BEFORE looking at data.

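The "Six Sigma" methodology takes its name from keeping the nearest specification limit 6 SD from the process mean, where the one-tailed tail probability is about 1×10⁻⁹ (essentially zero). The widely quoted figure of 3.4 defects per million opportunities assumes the process mean can drift by 1.5 SD over the long term, leaving 4.5 SD to the limit.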

The 2016 ASA statement on p-values explicitly warns against mechanical use of p < 0.05 as a decision rule. Always report effect sizes alongside p-values.
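The two figures in the Six Sigma item above correspond to different z values, which a quick tail calculation makes visible. The sketch assumes a normal process and the conventional 1.5 SD long-term shift used in Six Sigma practice; `upper_tail` is a hypothetical helper.

```python
from statistics import NormalDist

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return NormalDist().cdf(-z)

tail_6 = upper_tail(6.0)     # spec limit 6 SD away, no shift
tail_4_5 = upper_tail(4.5)   # 6 SD minus the assumed 1.5 SD shift
print(f"beyond 6 SD:   {tail_6:.2e}")
print(f"beyond 4.5 SD: {tail_4_5 * 1e6:.1f} per million")
```

The tail beyond 6 SD is on the order of one in a billion, while the tail beyond 4.5 SD is what yields the 3.4 per million figure.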

Frequently Asked Questions

A z-score (or "standard score") is the number of standard deviations a value is away from the mean of its distribution. Formula: z = (x − μ) / σ. Positive z = above the mean; negative z = below the mean. Magnitude tells you how far. The standardisation lets you compare values from different distributions on a common scale — a z-score of +2 means "2 SD above the mean" whether you're measuring test scores, height, weight, stock returns, or anything else. It's the lingua franca of inferential statistics.
Whether a z-score is "good" depends entirely on context. For test scores: higher z-score = better. For pollutant levels: lower z-score = better. For health markers (cholesterol, blood pressure): z-scores near zero (closer to typical) are usually best. Magnitude interpretation: |z| < 1: well within typical range (68% of normal data). |z| = 1-2: moderately unusual. |z| = 2-3: rare. |z| > 3: very rare (potential outlier or strong signal). For hypothesis testing: |z| > 1.96 = significant at p < 0.05 (two-tailed); |z| > 2.58 = significant at p < 0.01 (two-tailed).
Use TWO-tailed by default. Use one-tailed only when you have a strong directional hypothesis stated BEFORE looking at the data: "this drug LOWERS blood pressure" (not "changes" it). Two-tailed tests whether your data is significantly different from the mean in EITHER direction (both higher and lower count as evidence). One-tailed tests only one direction. When the observed effect lies in the hypothesised direction, the one-tailed p-value is exactly half the two-tailed p-value, so one-tailed reaches significance "more easily". This is why analysts sometimes incorrectly switch to one-tailed post hoc to fish for significance. Pre-registering your hypothesis direction is the only legitimate way to use one-tailed.
p < 0.05 means: if the null hypothesis were true, the probability of observing data at least this extreme would be less than 5%. It does NOT mean: there's a 95% chance the alternative hypothesis is true (a common misinterpretation). It does NOT mean: the effect is large or practically important (small effects can be highly significant with large samples). It does NOT mean: the result will replicate (replicability requires effect size + sample size context). The 2016 American Statistical Association statement explicitly warned against mechanical interpretation. Always report effect sizes (Cohen's d, % difference, etc.) alongside p-values.
The cutoff is 1.96 because, for the standard normal distribution, Φ(1.96) ≈ 0.975 (97.5th percentile). That leaves 2.5% in each tail, 5% in total, which is exactly the 5% significance level for a two-tailed test. So if your |z| > 1.96, you're in the most extreme 5% of values expected under the null hypothesis. The corresponding one-tailed value is z = 1.645 (Φ(1.645) ≈ 0.95). These numbers are worth memorising because they show up across every quantitative discipline. For more stringent two-tailed tests: |z| > 2.58 → p < 0.01; |z| > 3.29 → p < 0.001.
Use a z-test when you KNOW the population standard deviation σ. Use a t-test when you ESTIMATE the SD from your sample (the more common case). The t-distribution has wider tails than the normal distribution to account for the extra uncertainty from estimating SD; the difference is larger for small samples. As n grows, the t-distribution converges to the normal distribution, and as a rule of thumb the two are very close for n > 30. For small samples (n < 30) with unknown σ, always use a t-test. For large samples (n > 30) with unknown σ, either works (the t-test is technically correct but the z-test is a good approximation). When σ is genuinely known (rare in practice), use a z-test regardless of sample size.
Z-score percentile interpretations require normal-ish data. For heavily skewed data (income, response times, ecological measurements), z-scores still calculate but percentiles will be wrong. Options: (1) Transform the data first (log, square root, Box-Cox) to make it more normal. (2) Use non-parametric tests (Mann-Whitney U, Wilcoxon) that don't assume normality. (3) Compute percentiles directly from the data using empirical ranks (this calculator's std-dev companion tool does this). The Central Limit Theorem helps for large samples — even non-normal data has approximately normal SAMPLE MEANS when n > 30, so z-tests on the mean are usually valid even when individual values aren't normal.
Results are accurate to within ±0.00001 for typical z-scores, using the Abramowitz-Stegun rational approximation of the standard normal CDF. Approximations of this kind are widely used in calculators and statistical software. For extreme z-scores (|z| > 6), the relative precision degrades slightly but the p-value is essentially zero anyway. For research-grade precision (publication-quality, when |z| > 5 matters), use R / Python / SAS which have higher-precision implementations. For 99%+ of practical work, this approximation is more than adequate.
No data leaves your device. All calculations run entirely in your browser via JavaScript. There's no server roundtrip: open DevTools → Network and confirm zero outbound requests. That makes the tool safe for clinical trial analyses, proprietary A/B test results, sensitive research data, or any inferential statistics work that shouldn't leave your machine.
Z-scores are used everywhere quantitative work is done. A/B testing: z-tests determine which variant is statistically significant (Shopee, Lazada, Grab, GoTo all run thousands of A/B tests monthly using z or t-tests). Quality control: SPC charts in manufacturing flag readings outside ±3σ. Medical research: clinical trial endpoints use z-tests for normally-distributed outcomes. Finance: Sharpe ratio is essentially a z-score (excess return / SD). Education: SAT, GRE, GMAT scores are normalised to specific mean/SD; percentile reports come from z-scores. Psychology: T-scores (z rescaled to mean 50, SD 10) are used in personality testing, and IQ scores are typically z-scores rescaled to mean 100, SD 15. The math in this calculator is the foundation of all of these.
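For readers curious about the accuracy answer above, here is a sketch of one Abramowitz-Stegun approximation of the normal CDF (formula 26.2.17), compared against the error function in the Python standard library. Which Abramowitz-Stegun formula this calculator actually uses is not stated, so the choice of 26.2.17 is an assumption, and the names `phi_approx` and `phi_exact` are illustrative.

```python
import math

# Coefficients of Abramowitz & Stegun formula 26.2.17
# (assumed here; the calculator's exact formula is not documented)
P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def phi_approx(z: float) -> float:
    """Approximate standard normal CDF via A&S 26.2.17."""
    x = abs(z)
    t = 1.0 / (1.0 + P * x)
    poly = sum(b * t ** (i + 1) for i, b in enumerate(B))
    density = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
    upper = density * poly  # approximates P(Z > |z|)
    return 1.0 - upper if z >= 0 else upper

def phi_exact(z: float) -> float:
    """Reference CDF computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Scan z from -6 to 6 in steps of 0.01 and record the worst disagreement
max_err = max(abs(phi_approx(k / 100) - phi_exact(k / 100))
              for k in range(-600, 601))
print(f"largest absolute error on [-6, 6]: {max_err:.1e}")
```

Under this assumption the worst-case absolute error stays well inside the ±0.00001 tolerance quoted in the answer.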
