Measures of variability are statistics that describe how spread out a set of data is, that is, how far the individual values lie from the centre and from each other. The four most common measures are the range, the interquartile range (IQR), the variance and the standard deviation. Together they tell you whether your data is tightly clustered or widely dispersed, which an average on its own can never reveal.
Understanding an average is only half the story in statistics. Two data sets can share the same mean or median yet behave very differently in real life. This is why measures of variability are a core topic in the maths and statistics curriculum, from GCSE and A-level through to undergraduate research.
In practical UK contexts, such as analysing exam results across schools, comparing house prices by region, or studying the distribution of income, variability reveals whether values are tightly clustered or widely dispersed. Without it, conclusions drawn from averages alone can be badly misleading. For example, two towns can have an identical average household income of £35,000, yet one may have everyone earning close to that figure while the other has a mix of very high and very low earners. The averages match; the realities do not.
“Measures of variability describe the spread of values in a dataset.” — Australian Bureau of Statistics, Statistical Language
What Is Variability & Why Does It Matter?
Variability (also called dispersion or spread) describes how much individual data points differ from one another and from the centre of the distribution. A dataset in which all values are very similar has low variability; one in which values are scattered across a wide range has high variability. There are two broad families: distance-based measures (range and IQR), which look at the gap between particular values, and deviation-based measures (variance and standard deviation), which average how far every value sits from the mean.
Variability matters for three main reasons:
- It affects the precision of your central tendency measure: a mean drawn from a tightly clustered dataset is more reliable than one drawn from a widely scattered dataset.
- It determines which statistical tests are appropriate. Tests that assume homogeneity of variance (equal spread across groups) give misleading results if that assumption is violated.
- It is often the variable of interest itself. In quality control, manufacturing, or clinical settings, consistency is as important as average performance.
The Four Measures at a Glance
The table below summarises the four main measures of variability, their formulae, and when each is most useful. Note that the variance and standard deviation formulae shown use the sample denominator (n − 1); when describing a complete population you divide by n instead.
| Measure | Formula (sample) | Sensitive to Outliers | Best Paired With | Appropriate Data Level |
|---|---|---|---|---|
| Range | Max − Min | Very high | Median (as context) | Ordinal, Interval, Ratio |
| Interquartile Range (IQR) | Q3 − Q1 | Low (resistant) | Median | Ordinal, Interval, Ratio |
| Variance (s²) | Σ(x − x̄)² ÷ (n − 1) | Yes | Mean | Interval, Ratio |
| Standard Deviation (s) | √[Σ(x − x̄)² ÷ (n − 1)] | Yes | Mean | Interval, Ratio |
Range
The range is the simplest measure of spread: the highest value minus the lowest value. It gives the total span of your data and is very quick to calculate.
Its weakness is that it depends entirely on the two most extreme values and is therefore highly sensitive to outliers. A single anomalous data point can double the range without changing anything else in the dataset. For this reason the range is best used as a rough initial summary, always alongside a more robust measure, rather than as a standalone description of spread.
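As a minimal sketch of that sensitivity, the snippet below computes the range of a small set of exam scores before and after one extreme value is added. The scores and the helper name `value_range` are invented for illustration.

```python
def value_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


# Hypothetical exam scores, invented for illustration only.
scores = [52, 55, 58, 60, 61, 63, 65]
scores_with_outlier = scores + [78]  # one unusually high score added

print(value_range(scores))               # 65 - 52 = 13
print(value_range(scores_with_outlier))  # 78 - 52 = 26
```

One extra value doubles the range here, even though the other seven scores are unchanged.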
Interquartile Range (IQR)
The interquartile range is the range of the middle 50% of your data, the distance between the 25th percentile (Q1, the lower quartile) and the 75th percentile (Q3, the upper quartile): IQR = Q3 − Q1. Because it ignores the top 25% and bottom 25% of values, it is resistant to outliers.
The IQR is the natural companion of the median. When data is skewed or non-normal, you should report the median and IQR rather than the mean and standard deviation. The IQR is also the basis of the standard rule for flagging outliers: any value below Q1 − 1.5×IQR or above Q3 + 1.5×IQR is treated as a potential outlier. In a box-and-whisker plot, the box itself spans the IQR, from Q1 to Q3.
Worked example: take the nine ordered values 3, 5, 7, 8, 10, 12, 14, 18, 21.
- The median (Q2) is the 5th value = 10.
- The lower half is 3, 5, 7, 8, so Q1 = (5 + 7) ÷ 2 = 6.
- The upper half is 12, 14, 18, 21, so Q3 = (14 + 18) ÷ 2 = 16.
- IQR = Q3 − Q1 = 16 − 6 = 10.
So the middle 50% of values span 10 units, regardless of how extreme the smallest or largest figures are. (Different software uses slightly different quartile conventions, so SPSS or Excel may report marginally different quartiles for small samples.)
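The quartile calculation above can be sketched in Python as follows. The helper `median_of_halves_quartiles` is a hypothetical name for the hand method used here (split the ordered data at the median, then take the median of each half). The last lines use the standard library's `statistics.quantiles` with `method="inclusive"` to show how a different convention can shift the quartiles for a small sample.

```python
import statistics


def median_of_halves_quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    With an odd number of values, the median is left out of both halves.
    Assumes at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


data = [3, 5, 7, 8, 10, 12, 14, 18, 21]
q1, q2, q3 = median_of_halves_quartiles(data)
iqr = q3 - q1

# 1.5 x IQR rule for flagging potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
flagged = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)                    # Q1 = 6, median = 10, Q3 = 16, IQR = 10
print(lower_fence, upper_fence, flagged)  # fences at -9 and 31, nothing flagged

# A different quartile convention applied to the same data.
incl_q1, _, incl_q3 = statistics.quantiles(data, n=4, method="inclusive")
print(incl_q1, incl_q3, incl_q3 - incl_q1)  # Q1 = 7, Q3 = 14, IQR = 7
```

Neither answer is wrong; they follow different definitions, so it is worth stating which convention was used when reporting quartiles for a small sample.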
Variance
The variance measures the average squared distance of each data point from the mean. Squaring the differences serves two purposes: it stops values above and below the mean from cancelling each other out, and it gives extra weight to larger deviations. The result is a measure of spread expressed in squared units, which is mathematically convenient but harder to interpret intuitively.
For a sample you divide the sum of squared deviations by (n − 1) rather than n. This is known as Bessel’s correction, and it gives an unbiased estimate of the population variance from a sample. For a whole population you divide by n.
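Worked example: find the sample variance and standard deviation of the eight values 4, 8, 6, 5, 3, 7, 9, 2.
- Step 1 – Mean: (4 + 8 + 6 + 5 + 3 + 7 + 9 + 2) ÷ 8 = 44 ÷ 8 = 5.5.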
- Step 2 – Deviations from the mean: −1.5, 2.5, 0.5, −0.5, −2.5, 1.5, 3.5, −3.5.
- Step 3 – Square them: 2.25, 6.25, 0.25, 0.25, 6.25, 2.25, 12.25, 12.25.
- Step 4 – Sum of squares: 2.25 + 6.25 + 0.25 + 0.25 + 6.25 + 2.25 + 12.25 + 12.25 = 42.
- Step 5 – Sample variance: 42 ÷ (8 − 1) = 42 ÷ 7 = 6.
- Step 6 – Standard deviation: √6 ≈ 2.45.
If this had been treated as a full population, you would divide by 8 instead of 7, giving a variance of 5.25 and a standard deviation of about 2.29.
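The same steps can be reproduced in a few lines of Python. This is a sketch of the hand calculation rather than a recommended implementation: in practice the standard library functions `statistics.variance` and `statistics.pvariance`, printed at the end for comparison, do the same job.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7, 9, 2]
n = len(data)

mean = sum(data) / n                                 # Step 1
deviations = [x - mean for x in data]                # Step 2
squared = [d ** 2 for d in deviations]               # Step 3
sum_of_squares = sum(squared)                        # Step 4

sample_variance = sum_of_squares / (n - 1)           # Step 5 (Bessel's correction)
sample_sd = math.sqrt(sample_variance)               # Step 6

population_variance = sum_of_squares / n             # divide by n for a full population
population_sd = math.sqrt(population_variance)

print(mean, sum_of_squares)                          # 5.5 42.0
print(sample_variance, round(sample_sd, 2))          # 6.0 2.45
print(population_variance, round(population_sd, 2))  # 5.25 2.29

# Cross-check against the standard library.
print(statistics.variance(data), statistics.pvariance(data))
```

The only difference between the sample and population figures is the denominator, which is why the gap between them narrows as n grows.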
Variance is the foundation of many statistical methods, particularly ANOVA (Analysis of Variance), which compares the variance between groups with the variance within groups. For reporting and communication, however, the standard deviation is usually preferred because it is in the original units.
Standard Deviation
The standard deviation (SD) is the square root of the variance. Taking the square root brings the measure back to the original units of the data, so if you are measuring test scores in points, the SD is in points rather than points-squared. It represents the typical distance of a data point from the mean, which is why it is the most widely reported measure of spread in academic research.
In a normal distribution, the standard deviation has a precise interpretation known as the empirical (68–95–99.7) rule:
- About 68% of values fall within one SD of the mean.
- About 95% fall within two SDs (exactly 95% lies within 1.96 SDs).
- About 99.7% fall within three SDs.
This makes the SD essential for interpreting the normal distribution and for calculating confidence intervals. It should not be confused with the standard error: the standard deviation describes the spread of individual data points, whereas the standard error describes the spread of a sample statistic such as the sample mean. The standard error of the mean equals the SD divided by √n, so as your sample grows the standard error shrinks, while the standard deviation simply settles towards the population value.
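Both ideas can be checked numerically. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) to compute the share of a normal distribution lying within k standard deviations of the mean, then shows how the standard error of the mean falls as n rises. The SD of 10 and the sample sizes are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 1.96, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} SD: {share:.4f}")


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


sd = 10  # hypothetical standard deviation, held fixed
errors = {n: standard_error(sd, n) for n in (25, 100, 400)}
print(errors)  # {25: 2.0, 100: 1.0, 400: 0.5}
```

Quadrupling the sample size halves the standard error, while the spread of the individual values is unchanged.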
Choosing the Right Measure
No single measure of variability is best for every situation. The right choice depends on the shape of your distribution, the level of measurement of your data, and whether outliers are present. Use the table below as a quick decision guide.
| Situation | Recommended Measure | Reasoning |
|---|---|---|
| Normal or symmetric distribution | Standard deviation | Efficient and widely understood; underpins parametric tests |
| Skewed data (income, house prices) | IQR | Robust against outliers; gives an honest picture of spread |
| Ordinal data (satisfaction ratings) | IQR | Standard deviation assumes equal intervals, which ordinal data lacks |
| Quick initial summary | Range | Simplest calculation; should always be supplemented with IQR or SD |
| Statistical tests (ANOVA, regression) | Variance | Used internally by the tests; report SD for reader interpretation |
| Comparing spread across different units | Coefficient of variation (CV = SD ÷ Mean) | Ratio data only; enables meaningful cross-dataset comparison |
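To illustrate the last row of the table, here is a small sketch of the coefficient of variation for two variables measured in different units. The heights and weights are invented, and `coefficient_of_variation` is a hypothetical helper that uses the sample standard deviation.

```python
import statistics


def coefficient_of_variation(values):
    """Sample SD divided by the mean; only meaningful for ratio data with a positive mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical measurements for five people, invented for illustration.
heights_cm = [158, 165, 170, 172, 180]
weights_kg = [52, 61, 70, 74, 93]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"CV of height: {cv_height:.3f}")
print(f"CV of weight: {cv_weight:.3f}")

# The CV is unit-free: converting centimetres to metres leaves it unchanged.
heights_m = [h / 100 for h in heights_cm]
cv_height_m = coefficient_of_variation(heights_m)
print(f"CV of height in metres: {cv_height_m:.3f}")
```

Because the SD and the mean are both rescaled by the same factor, the ratio does not depend on the unit, which is what makes the comparison between centimetres and kilograms meaningful.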
Variability & Statistical Testing
Understanding variability is not only about describing your data; it is also fundamental to statistical inference. Many parametric tests assume that groups have similar variances (an assumption called homogeneity of variance, or homoscedasticity). Levene’s test, available in SPSS, checks this assumption before a t-test or ANOVA. If the variances are unequal, a correction such as Welch’s t-test or Welch’s ANOVA is used instead.
Variability also feeds directly into the test statistic itself. A t-value, for example, is essentially the difference between group means divided by a measure of variability (the standard error). The smaller the spread relative to the difference between means, the larger the test statistic and the stronger the evidence against the null hypothesis. In short, variability is the denominator against which every observed effect is judged. If you would like a refresher on the full toolkit of summary statistics, see our guide to descriptive statistics.
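A short sketch makes the "variability as denominator" point concrete. The function below computes the unpooled (Welch) t statistic, that is, the difference in means divided by the standard error of that difference. It stops there and does not compute degrees of freedom or a p-value. The four groups are hypothetical: both pairs differ by 5 points on average, but one pair is tightly clustered and the other widely spread.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Difference in means divided by the unpooled standard error."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error


# Hypothetical scores: every group pair has means of 70 and 65.
tight_a = [69, 70, 71, 70, 70]
tight_b = [64, 65, 66, 65, 65]
wide_a = [50, 70, 90, 60, 80]
wide_b = [45, 65, 85, 55, 75]

t_tight = welch_t(tight_a, tight_b)
t_wide = welch_t(wide_a, wide_b)
print(round(t_tight, 2))  # about 11.18
print(round(t_wide, 2))   # 0.5
```

The same 5-point gap produces a large t-value when the groups are tightly clustered and a small one when they are widely spread.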