P-Value: How to Calculate and Interpret It

Published on August 31st, 2021. Revised on June 16, 2026.

A p-value is the probability of obtaining results at least as extreme as those actually observed in your study, assuming the null hypothesis is true. In plain terms, it answers one question: if there were really no effect, how surprising would your data be? A small p-value means your data would be unusual under the null hypothesis, which counts as evidence against it. Crucially, the p-value is not the probability that the null hypothesis is true, and it is not the probability that your result happened by chance.

You will meet the p-value constantly in hypothesis testing, and it sits at the heart of statistical significance. This guide explains exactly what a p-value is, how to calculate and interpret it, how it relates to the significance level (alpha), the common mistakes to avoid, and a fully worked example you can follow step by step. Accuracy matters here, so every definition below is stated carefully.

“A p-value, or statistical significance, does not measure the size of an effect or the importance of a result… a p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis.” — American Statistical Association, “Statement on Statistical Significance and P-Values” (2016)

What Are the Null Hypothesis and Alternative Hypothesis?

Before defining the p-value precisely, you need two ideas: the null hypothesis and the alternative hypothesis.

  • The null hypothesis (H0) states that there is no effect or no relationship between the variables you are studying. It is the “nothing is going on” position.
  • The alternative hypothesis (H1 or Ha) states that there is an effect or relationship. It is the claim the researcher usually hopes to support.

For example, suppose you want to compare two fertilisers. Group A of plants receives fertiliser A and Group B receives fertiliser B. Using a two-tailed t-test, you can assess whether the two fertilisers differ in their effect on growth.

Null hypothesis (H0): There is no difference in growth between the two groups.

Alternative hypothesis (H1): There is a difference in growth between the two groups.

The p-value never tells you whether H0 is true. It only quantifies how compatible your data are with H0.

What Is the P-value?

The p-value is the probability of obtaining a test result at least as extreme as the one observed, calculated under the assumption that the null hypothesis is true. “At least as extreme” is measured by the test statistic: the more your sample deviates from what H0 predicts, the more extreme the test statistic (for a t or z statistic, the larger its absolute value) and the smaller the p-value.

It follows that:

  • A small p-value (e.g. 0.01) means your data would be unlikely if H0 were true, so it provides evidence against H0.
  • A large p-value (e.g. 0.40) means your data are quite compatible with H0, so there is little evidence against it.

A p-value is always a probability, so it lies between 0 and 1. The p-value is closely tied to the test statistic you compute and to its known sampling distribution under H0.
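The definition can be made concrete with a case small enough to count exactly. The sketch below is illustrative only and uses a coin-flip example that is not part of this guide: the null hypothesis is that the coin is fair, and the function name `fair_coin_p_value` and the head counts are assumptions chosen for the demonstration. It adds up the probability of every outcome at least as far from the expected count as the one observed.

```python
from math import comb


def fair_coin_p_value(heads: int, flips: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5).

    "At least as extreme" means a head count at least as far from the
    expected flips / 2 as the observed count, in either direction.
    """
    if not 0 <= heads <= flips:
        raise ValueError("heads must be between 0 and flips")
    # Twice the distance from flips / 2, kept as an integer to avoid rounding.
    observed_distance = abs(2 * heads - flips)
    # Under H0 every sequence of flips is equally likely, so count sequences.
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(2 * k - flips) >= observed_distance
    )
    return extreme_sequences / 2**flips


p_surprising = fair_coin_p_value(heads=60, flips=100)    # far from 50
p_unsurprising = fair_coin_p_value(heads=52, flips=100)  # close to 50
print(f"60 heads in 100 flips: p = {p_surprising:.4f}")
print(f"52 heads in 100 flips: p = {p_unsurprising:.4f}")
```

The count further from 50 gives the smaller p-value, and both results stay between 0 and 1, as any probability must.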

How Is the P-value Calculated?

You calculate a p-value in three conceptual steps:

  1. Compute a test statistic from your sample (for example, a t, z, F or chi-square value). The test statistic summarises how far your data fall from what H0 predicts.
  2. Locate it on the null sampling distribution. Under H0, the test statistic follows a known probability distribution (e.g. the t-distribution with a given number of degrees of freedom).
  3. Find the tail probability. The p-value is the area in the tail(s) of that distribution beyond your test statistic — one tail for a one-tailed test, both tails for a two-tailed test.
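The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended workflow: it assumes a one-sample z-test with a known population standard deviation, and every number in it (sample mean 103, hypothesised mean 100, standard deviation 15, n = 50) is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: n = 50 observations with mean 103, testing H0: mu = 100.
# The population standard deviation is assumed to be known (15).
sample_mean, mu_0, population_sd, n = 103.0, 100.0, 15.0, 50

# Step 1: compute the test statistic (distance from H0 in standard errors).
z = (sample_mean - mu_0) / (population_sd / sqrt(n))

# Step 2: under H0 this statistic follows the standard normal distribution.
null_distribution = NormalDist(mu=0.0, sigma=1.0)

# Step 3: find the tail area beyond the statistic.
# By symmetry, the area above |z| equals the area below -|z|.
p_one_tail = null_distribution.cdf(-abs(z))
p_two_tailed = 2 * p_one_tail

print(f"z = {z:.3f}")
print(f"one-tail area = {p_one_tail:.4f}")
print(f"two-tailed p  = {p_two_tailed:.4f}")
```

The two-tailed value is simply double the one-tail area because the standard normal distribution is symmetric.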

In practice you rarely do this by hand. Statistical software (SPSS, R, Stata, Python) or a spreadsheet returns the p-value directly, and printed p-value tables let you look it up from the test statistic and degrees of freedom. Two points are worth remembering:

  • Different statistical tests produce different test statistics, so the right test depends on your data type and the effect you want to detect. Our guide to choosing a statistical test can help.
  • Degrees of freedom (driven by your sample size and the number of parameters estimated) change how large a test statistic must be to yield a given p-value.

Figure: The p-value is the tail area beyond the observed test statistic.
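To see how degrees of freedom change the p-value, the sketch below evaluates the same test statistic, t = 2.5, against t-distributions with 5, 24 and 100 degrees of freedom. It is an illustration under stated assumptions: the helper names `t_pdf` and `t_upper_tail` are made up for this example, and the tail area is approximated by numerical integration (Simpson's rule) using only the standard library, not by the routine a statistics package would use.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t: float, df: float, steps: int = 2000) -> float:
    """Approximate P(T >= |t|): 0.5 minus the area between 0 and |t|."""
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Same test statistic, different degrees of freedom (one-tailed areas).
p_by_df = {df: t_upper_tail(2.5, df) for df in (5, 24, 100)}
for df, p in p_by_df.items():
    print(f"t = 2.5, df = {df:>3}: one-tailed p = {p:.4f}")
```

With few degrees of freedom the t-distribution has heavier tails, so the same statistic leaves more area beyond it and the p-value is larger.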

P-value vs Alpha (the Significance Level)

It is vital not to confuse the p-value with alpha (α), the significance level. They are different things that work together:

  • Alpha (α) is a threshold you choose before you collect or analyse data. It is the largest risk you are willing to accept of wrongly rejecting a true null hypothesis (a Type I error). Common choices are 0.05, 0.01 and 0.10.
  • The p-value is computed from your data after the test. It tells you how extreme your particular result is under H0.

The decision rule is simple: if p ≤ α, reject H0; if p > α, fail to reject H0. So alpha is the fixed bar and the p-value is the measurement you compare against it.
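The decision rule can be written as a tiny function. This is a sketch only; the function name `decide` and the returned strings are assumptions made for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))               # p is below the default alpha of 0.05
print(decide(0.03, alpha=0.01))   # same p-value, stricter alpha
```

The same p-value can lead to different decisions under different alphas, which is why alpha has to be fixed before looking at the data.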

What a P-value Does NOT Mean

P-values are widely misunderstood, so be careful to avoid these common errors:

  • It is not the probability that H0 is true. The p-value is calculated assuming H0 is true, so it cannot also be the probability of H0.
  • It is not the probability that your result was “due to chance”. Chance is already built into the calculation as the null model.
  • 1 − p is not the probability that the alternative hypothesis is true. A p-value of 0.04 does not mean there is a 96% chance H1 is correct.
  • It does not measure effect size or importance. A very small p-value can come from a trivial effect in a huge sample, and a large, meaningful effect can give a non-significant p-value in a small sample.
  • It is not a pass/fail stamp. A result with p = 0.049 is not meaningfully stronger than p = 0.051; the 0.05 cut-off is a convention, not a law of nature.
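The point about effect size can be checked numerically. The sketch below is a hypothetical illustration, not data from any study: it assumes a one-sample z-test with a known standard deviation of 15 and compares a trivial effect in a sample of one million with a large effect in a sample of ten. The function name `two_tailed_z_p_value` is an assumption for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_p_value(mean_difference: float, sd: float, n: int) -> float:
    """Two-tailed p-value for a one-sample z-test with known standard deviation."""
    z = mean_difference / (sd / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


sd = 15.0  # assumed known population standard deviation (hypothetical)

# Trivial effect: 0.1 units (under 1% of a standard deviation), huge sample.
p_tiny_effect = two_tailed_z_p_value(mean_difference=0.1, sd=sd, n=1_000_000)

# Large effect: 7.5 units (half a standard deviation), small sample.
p_large_effect = two_tailed_z_p_value(mean_difference=7.5, sd=sd, n=10)

print(f"tiny effect, n = 1,000,000: p = {p_tiny_effect:.2e}")
print(f"large effect, n = 10:       p = {p_large_effect:.3f}")
```

The trivial effect comes out highly significant and the large one does not, so the p-value on its own says little about how big or important an effect is.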

When Is a P-value Statistically Significant?

A result is statistically significant when the p-value is at or below your chosen alpha, which is the condition for rejecting the null hypothesis.

  • If p ≤ 0.05 (with α = 0.05), the result is statistically significant. Data this extreme would occur at most 5% of the time if H0 were true, so you reject H0 in favour of H1. This does not mean there is a 95% probability that H1 is true.
  • If p > 0.05, the result is not statistically significant. You fail to reject H0. Note the careful wording: you never “accept” the null hypothesis, you simply lack enough evidence to reject it.

The table below summarises how p-values are conventionally interpreted against a 0.05 (and stricter 0.01) threshold.

P-value | Strength of evidence | Decision (α = 0.05)
p > 0.05 | Insufficient evidence against H0 | Not statistically significant; fail to reject the null hypothesis.
p < 0.05 | Moderate evidence against H0 | Statistically significant; reject the null hypothesis in favour of the alternative.
p < 0.01 | Strong evidence against H0 | Highly statistically significant; reject the null hypothesis in favour of the alternative.

How to Use a P-value in Hypothesis Testing

Follow these three steps to use a p-value in hypothesis testing:

Step 1 — Set the significance level (α). Choose α before you analyse the data. It is commonly 0.10, 0.05 or 0.01, with 0.05 being the default in many fields.

Step 2 — Calculate the p-value. Run the appropriate test and obtain the p-value from software, a spreadsheet or p-value tables. In Microsoft Excel, the Data Analysis ToolPak (and functions such as T.TEST and CHISQ.TEST) returns p-values directly.

Step 3 — Compare and decide. If p ≤ α, there is enough evidence to reject H0; if p > α, you fail to reject it. Always report the actual p-value (e.g. “p = 0.03”) alongside the test statistic, not just “significant” or “not significant”, and interpret it together with the effect size and confidence interval.

Worked Example: One-Sample t-test

Example: A coffee chain claims its cups contain 250 ml on average. A trading-standards officer samples n = 25 cups and measures a sample mean of x̄ = 246 ml with a sample standard deviation of s = 8 ml. Is the fill significantly below 250 ml? Use α = 0.05.

  • Hypotheses: H0: μ = 250 ml; H1: μ < 250 ml (a one-tailed test).
  • Test statistic: t = (x̄ − μ0) ÷ (s ÷ √n) = (246 − 250) ÷ (8 ÷ √25) = −4 ÷ 1.6 = −2.5.
  • Degrees of freedom: df = n − 1 = 24.
  • P-value: the one-tailed area below t = −2.5 on the t-distribution with 24 df is approximately 0.0098 (about 0.01).
  • Decision: p ≈ 0.0098 < α = 0.05, so we reject H0. There is statistically significant evidence that the mean fill is below 250 ml.

Interpretation: if the cups really did average 250 ml, we would see a sample mean this low (or lower) only about 1 time in 100. That is surprising enough to reject the chain’s claim at the 5% level.
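The arithmetic in this worked example can be reproduced with the standard library. The sketch below is a check under stated assumptions, not the method a statistics package uses: the helper names `t_pdf` and `t_lower_tail` are made up for illustration, and the tail area is approximated with Simpson's rule.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_lower_tail(t: float, df: float, steps: int = 2000) -> float:
    """Approximate P(T <= t) by integrating the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3 if upper > 0 else 0.0
    # Below a negative t lies 0.5 minus the central area; above, 0.5 plus it.
    return 0.5 - central_area if t < 0 else 0.5 + central_area


# Values from the coffee-cup example.
sample_mean, mu_0, sample_sd, n, alpha = 246.0, 250.0, 8.0, 25, 0.05

t_stat = (sample_mean - mu_0) / (sample_sd / math.sqrt(n))
df = n - 1
p_value = t_lower_tail(t_stat, df)  # one-tailed: H1 says mu < 250

print(f"t = {t_stat:.2f}, df = {df}, one-tailed p = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The lower tail is used because the alternative hypothesis is that the mean fill is below 250 ml; a two-tailed test would double this area.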

Frequently Asked Questions

How do you calculate a p-value?

Compute a test statistic (such as a t or z value) from your sample, then find the probability of a value at least that extreme on the statistic’s null sampling distribution. That tail probability is the p-value. In practice, statistical software, Excel’s Data Analysis ToolPak, or p-value tables return it for you.

How do you calculate a p-value in Excel?

In Excel, for a comparison of two groups use =T.TEST(array1, array2, tails, type), which returns the p-value directly. For chi-square tests use CHISQ.TEST, and for regression or ANOVA enable the Data Analysis ToolPak (File > Options > Add-ins) and read the p-value from its output table.

How do you interpret a p-value?

Compare the p-value with your significance level α. If p ≤ α you reject the null hypothesis; if p > α you fail to reject it. You never formally “accept” the null hypothesis; a large p-value only means there is not enough evidence to reject it.

What does a p-value of 1 mean?

A p-value of 1 (or close to it) means your data are almost exactly what the null hypothesis predicts, so there is no evidence against H0. It does not prove the null hypothesis is true; it simply shows your result is entirely unsurprising under it.

What is a p-value in biostatistics?

In biostatistics the p-value has the same meaning as elsewhere: the probability of data at least as extreme as observed, assuming no real effect (for example, no difference between a treatment and a placebo). Clinical studies often pair it with effect sizes and confidence intervals because a small p-value alone does not show that an effect is clinically important.

Is a smaller p-value better?

A smaller p-value gives stronger evidence against the null hypothesis, but it does not measure how large or important the effect is. With a very large sample, even a tiny, meaningless difference can produce a very small p-value, so always read the p-value alongside the effect size.

About Owen Ingram

Owen Ingram is a dissertation specialist. He has a master's degree in data sciences. His research work aims to compare the various types of research methods used among academicians and researchers.
