Z-Score Calculator

Proportion Z-Test Calculator

Based on the standard normal distribution (μ = 0, σ = 1). Tests H₀: p = p₀ (one-proportion) or H₀: p₁ = p₂ (two-proportion).

Worked Examples

One-Proportion · Two-Tailed

Election poll: x = 520 of n = 1,000 vs. p₀ = 0.5

A polling firm samples 1,000 voters and finds 520 in favor. Test whether true support differs from 50% at α = 0.05.

  1. State H₀: p = 0.5; H₁: p ≠ 0.5 (two-tailed).
  2. Sample proportion: p̂ = 520/1000 = 0.52.
  3. Standard error under H₀: √(0.5 × 0.5 / 1000) ≈ 0.01581.
  4. z = (0.52 − 0.5) / 0.01581 ≈ 1.2649.
  5. Two-tailed p-value: 2 × (1 − Φ(1.2649)) ≈ 0.2059.
  6. Critical values at α = 0.05: ±1.96.
  7. Since |1.2649| < 1.96 and p > 0.05, fail to reject H₀.

The sample share of 52% looks suggestive, but with n = 1,000 the margin of error is roughly ±3 points and 50% is comfortably inside that band. Always pair the p-value with a confidence interval around p̂.
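
The margin-of-error remark can be checked with a short Python sketch. It assumes a 95% Wald interval, whose standard error is built from p̂ rather than p₀; the name `wald_interval` is illustrative and not part of the calculator.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Return (lower, upper) Wald confidence limits for a proportion."""
    p_hat = x / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = wald_interval(520, 1000)
print(f"95% CI: ({low:.4f}, {high:.4f})")
print("contains 0.5:", low < 0.5 < high)
```

The interval runs from roughly 0.489 to 0.551, so it contains 0.5, which agrees with the fail-to-reject decision.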

Two-Proportion · Two-Tailed

A/B test: 80/200 vs. 60/200 conversions

Variant A converts 80 of 200 visitors; variant B converts 60 of 200. Test whether the conversion rates differ at α = 0.05.

  1. State H₀: p₁ = p₂; H₁: p₁ ≠ p₂ (two-tailed).
  2. Sample proportions: p̂₁ = 0.40, p̂₂ = 0.30.
  3. Pooled estimate: p̂_pooled = (80 + 60) / (200 + 200) = 0.35.
  4. Standard error: √(0.35 × 0.65 × (1/200 + 1/200)) ≈ 0.04770.
  5. z = (0.40 − 0.30) / 0.04770 ≈ 2.0966.
  6. Two-tailed p-value: 2 × (1 − Φ(2.0966)) ≈ 0.0360.
  7. Critical values at α = 0.05: ±1.96. Since |2.0966| > 1.96 and p < 0.05, reject H₀.

The 10-point lift is statistically significant at α = 0.05, but check whether the absolute difference and the resulting confidence interval are practically meaningful for your business case before shipping the variant.
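
To judge practical significance, a confidence interval for the difference can sit next to the test. The sketch below assumes the unpooled (Wald) standard error, which is the usual choice for intervals; `diff_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Unpooled (Wald) confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Each group contributes its own variance; nothing is pooled here.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


low, high = diff_interval(80, 200, 60, 200)
print(f"95% CI for the lift: ({low:.4f}, {high:.4f})")
```

For these counts the interval is roughly 0.7 to 19.3 percentage points: it excludes zero, but the lower end is close to no lift at all.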

One-Proportion · Right-Tailed

Defect rate: x = 60 of n = 500 vs. p₀ = 0.10

A factory inspects 500 units and finds 60 defective. Test whether the true defect rate exceeds the 10% spec at α = 0.05.

  1. State H₀: p = 0.10; H₁: p > 0.10 (right-tailed).
  2. Sample proportion: p̂ = 60/500 = 0.12.
  3. Standard error under H₀: √(0.10 × 0.90 / 500) ≈ 0.01342.
  4. z = (0.12 − 0.10) / 0.01342 ≈ 1.4907.
  5. Right-tailed p-value: 1 − Φ(1.4907) ≈ 0.0680.
  6. Critical value at α = 0.05: 1.6449.
  7. Since 1.4907 < 1.6449 and p > 0.05, fail to reject H₀.

Borderline result — at the more lenient α = 0.10 the test would reject. With binary inspection data, also consider the cost of a false alarm (over-reacting to noise) vs. a missed signal (a real spec drift) before treating α = 0.05 as the only threshold.
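
The sensitivity to α can be seen directly in a small Python sketch that re-runs the decision at both thresholds. It uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 60, 500, 0.10

# Right-tailed one-proportion z-test with the SE computed under H0.
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)

decisions = {}
for alpha in (0.05, 0.10):
    z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
    decisions[alpha] = "reject H0" if z > z_crit else "fail to reject H0"
    print(f"alpha={alpha}: z={z:.4f}, critical={z_crit:.4f}, {decisions[alpha]}")
```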

One-Proportion Z-Test

Tests whether a single sample proportion p̂ = x/n differs from a hypothesized value p₀. The standard error uses p₀ rather than p̂ because the test statistic is evaluated under H₀, where the true proportion is p₀. A confidence interval makes no such assumption and uses p̂ instead.

z = (p̂ − p₀) / √(p₀(1 − p₀)/n)
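
As a minimal sketch, the formula translates to Python as follows. The function name `one_proportion_z` is an assumption for illustration, and the p-value shown is the two-tailed one.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """Return (z, two-tailed p-value) for H0: p = p0."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)  # SE under H0 uses p0, not p_hat
    z = (p_hat - p0) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


z, p = one_proportion_z(520, 1000, 0.5)
print(f"z = {z:.4f}, p = {p:.4f}")
```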

Two-Proportion Z-Test (Pooled)

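Tests whether two independent sample proportions differ. Uses the pooled estimate p̂_pooled = (x₁ + x₂)/(n₁ + n₂) because under H₀: p₁ = p₂ both samples estimate the same proportion. Pooling uses all n₁ + n₂ observations to estimate it, so the standard error is consistent with the null hypothesis being tested; the unpooled (Wald) form is the one used for confidence intervals.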

z = (p̂₁ − p̂₂) / √(p̂_pooled (1 − p̂_pooled)(1/n₁ + 1/n₂))
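
The pooled formula can be sketched in Python like this. The name `two_proportion_z` is hypothetical, and the returned p-value is two-tailed.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-tailed p-value) for H0: p1 = p2 with a pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


z, p = two_proportion_z(80, 200, 60, 200)
print(f"z = {z:.4f}, p = {p:.4f}")
```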

How It Works

A proportion z-test asks whether observed success rates differ from a benchmark or from each other by more than chance would predict. The one-proportion test compares a single sample's success rate p̂ = x/n to a hypothesized value p₀ — useful for checking whether a poll deviates from a known baseline or whether a defect rate exceeds spec. The two-proportion test compares two independent samples' success rates p̂₁ and p̂₂ — useful for A/B tests, comparing conversion rates across segments, or comparing response rates between treatment and control. Both forms compute a z-statistic, convert it to a p-value via the standard normal distribution, and compare against your chosen significance level α to make a reject / fail-to-reject decision against H₀. Both rely on the normal approximation, which works well when np ≥ 10 and n(1 − p) ≥ 10 in each group.
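
Before trusting the normal approximation, the rule of thumb is easy to check in code. This is a sketch only: `normal_approx_ok` is an illustrative name, and the threshold of 10 is the rule of thumb quoted above rather than a hard limit.

```python
def normal_approx_ok(n, p, threshold=10):
    """Check the rule of thumb n*p >= threshold and n*(1-p) >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# One-proportion poll example: n = 1000, p0 = 0.5.
print(normal_approx_ok(1000, 0.5))
# Two-proportion A/B example: check each group against the pooled 0.35.
print(normal_approx_ok(200, 0.35) and normal_approx_ok(200, 0.35))
# A hypothetical small sample with a rare outcome fails the check.
print(normal_approx_ok(30, 0.05))
```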

Example Problem

A polling firm samples 1,000 voters and finds 520 in favor of a candidate. Test whether the true level of support differs from 50% at α = 0.05 (two-tailed).

  1. State H₀: p = 0.5 and H₁: p ≠ 0.5. The two-tailed alternative makes no directional claim.
  2. Compute the sample proportion: p̂ = 520/1000 = 0.52.
  3. Compute the standard error under H₀: √(0.5 × 0.5 / 1000) = √0.00025 ≈ 0.01581.
  4. Compute the z-statistic: z = (0.52 − 0.5) / 0.01581 ≈ 1.2649.
  5. Find the two-tailed p-value: p = 2 × (1 − Φ(|1.2649|)) ≈ 0.2059.
  6. Find the critical values at α = 0.05: ±z_{0.975} = ±1.96.
  7. Compare: |1.2649| < 1.96 and p ≈ 0.2059 > 0.05, so we fail to reject H₀.
  8. Conclusion: at α = 0.05 the data do not provide enough evidence to conclude that the true support differs from 50%.

A 52% sample share looks suggestive, but with n = 1000 the margin of error around p̂ is about ±3.1 points — 50% is comfortably inside that band. Reporting the sample proportion alongside a 95% confidence interval gives more context than the p-value alone.

Key Concepts

The proportion z-test rests on the central limit theorem: for large n, the sampling distribution of p̂ is approximately normal with mean p and variance p(1 − p)/n. The one-proportion test substitutes the hypothesized p₀ into the variance formula because under H₀ the true proportion is p₀; the two-proportion test pools the two samples to estimate the common proportion under H₀: p₁ = p₂. The normal approximation requires reasonably large samples; a common rule of thumb is np ≥ 10 and n(1 − p) ≥ 10 in each group. For very small counts or when p is near 0 or 1, switch to an exact test (binomial test for one sample, Fisher's exact for two). For confidence intervals around p̂ rather than hypothesis tests, use the unpooled Wald form √(p̂(1 − p̂)/n) or the more accurate Wilson interval. The standard error in the test statistic and the standard error in the confidence interval are not the same quantity.
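
For readers who want the Wilson interval mentioned above, here is a minimal Python sketch using the standard Wilson score formula. The function name `wilson_interval` is an assumption; the calculator itself may not compute this interval.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x, n, confidence=0.95):
    """Return (lower, upper) Wilson score limits for a proportion."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z * z / n
    # The center is pulled slightly toward 0.5 relative to p_hat.
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(520, 1000)
print(f"Wilson 95% CI: ({low:.4f}, {high:.4f})")
```

With n = 1000 the Wilson and Wald limits nearly coincide; the two diverge mainly for small samples or proportions near 0 or 1.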

Applications

  • A/B testing — comparing conversion rates between two website variants
  • Polling and market research — checking whether observed support differs from a benchmark or between groups
  • Quality control — testing whether a defect rate exceeds a tolerance threshold (one-proportion) or differs between suppliers (two-proportion)
  • Clinical research — comparing response or remission rates between treatment and control arms
  • Education — testing whether pass rates differ from a target, or between two cohorts or schools
  • Manufacturing — checking whether yield rates between two production lines differ significantly

Common Mistakes

  • Using the proportion z-test with very small samples — the normal approximation breaks down; use a binomial or Fisher's exact test instead
  • Forgetting that the test SE under H₀ uses p₀ (one-proportion) or pooled p̂ (two-proportion) — the unpooled Wald form is for confidence intervals, not for testing equality
  • Choosing the tail direction after seeing the data — pre-specify it from H₁
  • Treating paired binary data (before/after on the same subjects) as independent — use McNemar's test instead
  • Reporting only the p-value without the sample proportions, the difference, or a confidence interval
  • Assuming a non-significant result proves p₁ = p₂ — absence of evidence is not evidence of absence; consider the power and the effect size
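
To illustrate the first mistake, the sketch below compares an exact right-tailed binomial p-value with the z-test on a deliberately small sample. The counts (3 successes in 20 trials against p₀ = 0.05) are hypothetical, as is the name `binomial_right_tail`.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_right_tail(x, n, p0):
    """Exact P(X >= x) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


x, n, p0 = 3, 20, 0.05  # hypothetical data; n * p0 = 1, far below 10

exact_p = binomial_right_tail(x, n, p0)
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
approx_p = 1 - NormalDist().cdf(z)

print(f"exact p = {exact_p:.4f}, normal-approximation p = {approx_p:.4f}")
```

Here the exact p-value is about 0.075 while the normal approximation gives about 0.020, so the two methods disagree on significance at α = 0.05.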

Frequently Asked Questions

What is a proportion z-test?

A proportion z-test compares observed success rates against a hypothesized value (one-proportion) or against each other (two-proportion). It uses the central-limit-theorem normal approximation to the binomial: for large enough samples, the distribution of p̂ = x/n is approximately normal, so a z-statistic and standard normal p-value can be computed.

When should I use the one-proportion vs. two-proportion test?

Use one-proportion when you're comparing a single sample's success rate p̂ to a fixed hypothesized value p₀ — for example, testing whether a defect rate exceeds 5% or whether a poll deviates from 50%. Use two-proportion when comparing two independent groups' success rates against each other — for example, A/B testing or comparing two clinical arms.

Why does the test use p₀ in the standard error instead of p̂?

Under H₀ the true proportion is p₀, so the variance of p̂ is p₀(1 − p₀)/n. Substituting p₀ makes the test statistic approximately standard normal under H₀ (mean zero, unit variance). Confidence intervals around p̂ instead use the Wald form √(p̂(1 − p̂)/n); they serve a different purpose (estimating the unknown p) and don't assume H₀.

Why does the two-proportion test pool the proportions?

Under H₀: p₁ = p₂ both groups share a common proportion, so the most efficient estimate of it is the pooled p̂_pooled = (x₁ + x₂)/(n₁ + n₂). Using the pooled estimate gives a standard error that is consistent with H₀, which is the distribution the test statistic is judged against. Confidence intervals for the difference instead use the unpooled form, since CIs do not assume the proportions are equal.

What sample size do I need?

A common rule of thumb is np ≥ 10 and n(1 − p) ≥ 10 in each group, where p is the relevant hypothesized or pooled proportion. For smaller samples or proportions near 0 or 1, use an exact test (binomial test for one sample, Fisher's exact for two) — the normal approximation gets unreliable in those regimes.

Can I use this for paired data like before/after on the same subjects?

No. The two-proportion z-test assumes the two samples are independent. For paired binary data (the same subjects measured before and after, or matched-pair experimental designs), use McNemar's test on the discordant pairs instead. Applying the independent two-proportion test to paired data ignores the correlation within pairs, so its standard error is wrong and the result can be misleading.
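
As a rough sketch of the alternative, McNemar's test in its large-sample form (no continuity correction) needs only the two discordant counts. The counts below are hypothetical, and `mcnemar_z` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def mcnemar_z(b, c):
    """Large-sample McNemar test on discordant pair counts b and c.

    b: pairs that changed from "no" to "yes"; c: pairs that changed the other way.
    Returns (z, two-tailed p-value); z squared is the usual chi-square statistic.
    """
    z = (b - c) / sqrt(b + c)
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical before/after study: 15 subjects improved, 5 got worse.
z, p = mcnemar_z(15, 5)
print(f"z = {z:.4f}, p = {p:.4f}")
```

Concordant pairs (no change either way) do not enter the statistic at all, which is why the independent-samples formula does not apply.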

What does the p-value tell me?

The p-value is the probability of observing a test statistic at least as extreme as yours, assuming H₀ is true. Smaller p-values indicate stronger evidence against H₀: p = p₀ (one-proportion) or p₁ = p₂ (two-proportion). A p-value below your chosen α leads to rejecting H₀, but the p-value does not measure the size or practical importance of the effect.

How do I choose the tail direction?

Choose left-tailed when H₁ predicts the proportion is below the benchmark or below the comparison group, right-tailed when above, and two-tailed when no direction is specified. Decide before you look at the data — picking the tail post hoc inflates the false-positive rate.
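
The three tail choices differ only in which area under the standard normal curve is reported. A minimal sketch, with `tail_p_value` and its string labels chosen for illustration:

```python
from statistics import NormalDist


def tail_p_value(z, tail):
    """Convert a z-statistic to a p-value for the chosen tail."""
    phi = NormalDist().cdf
    if tail == "left":
        return phi(z)            # area below z
    if tail == "right":
        return 1 - phi(z)        # area above z
    if tail == "two":
        return 2 * (1 - phi(abs(z)))  # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = 1.4907  # z-statistic from the defect-rate example
for tail in ("left", "right", "two"):
    print(f"{tail}-tailed p = {tail_p_value(z, tail):.4f}")
```

The same z gives about 0.932, 0.068, and 0.136, which shows how much a tail chosen after the fact can change the conclusion.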

Reference: The one-proportion z-test computes z = (p̂ − p₀) / √(p₀(1 − p₀)/n) and converts to a p-value using the standard normal cumulative distribution function via the Abramowitz and Stegun rational approximation. The two-proportion test uses the pooled standard error √(p̂_pooled(1 − p̂_pooled)(1/n₁ + 1/n₂)) under H₀: p₁ = p₂. Critical values are produced from the inverse normal CDF (Acklam's rational approximation) at the chosen significance level α. Both tests rely on the normal approximation to the binomial, which holds when np ≥ 10 and n(1 − p) ≥ 10 in each group.
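
The reference does not say which Abramowitz and Stegun formula the calculator uses. The sketch below assumes formula 26.2.17, a widely used five-term polynomial approximation with a stated absolute error below 7.5 × 10⁻⁸, and compares it with the error-function form from Python's standard library. The function names are illustrative.

```python
from math import erf, exp, pi, sqrt


def phi_as(x):
    """Standard normal CDF via Abramowitz & Stegun 26.2.17 (assumed variant)."""
    t = 1 / (1 + 0.2316419 * abs(x))
    # Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5.
    poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
               + t * (-1.821255978 + t * 1.330274429))))
    upper_tail = exp(-x * x / 2) / sqrt(2 * pi) * poly
    # The formula covers x >= 0; symmetry handles negative x.
    return 1 - upper_tail if x >= 0 else upper_tail


def phi_erf(x):
    """Standard normal CDF via the error function, for comparison."""
    return 0.5 * (1 + erf(x / sqrt(2)))


for x in (-2.0966, 0.0, 1.2649, 1.4907, 1.96):
    print(f"x={x:+.4f}  A&S={phi_as(x):.7f}  erf={phi_erf(x):.7f}")
```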
