The mean in statistics is the arithmetic average of a set of numbers: you add up every value and divide by how many values there are. For example, the mean of 4, 6 and 8 is (4 + 6 + 8) ÷ 3 = 6. It is the most widely used measure of central tendency because it uses every value in your dataset and, when the data is roughly symmetrical, forms the foundation for a wide range of powerful statistical tests.
This guide explains what the mean is, how to calculate it step by step, the three main types of mean, how the mean differs from the median, when the mean can mislead you, and how to report it correctly in your dissertation or research paper. There are three main types you should know:
- Arithmetic mean – the everyday ‘average’, used for most measurement data.
- Weighted mean – used when some values count for more than others.
- Geometric mean – used for rates, ratios and growth over time.
“The mean, often referred to as the average, is the most commonly used measure of central tendency. It is calculated by summing all the values in a dataset and dividing by the number of values.” — Australian Bureau of Statistics, Statistical Language: Measures of Central Tendency
The Arithmetic Mean
The arithmetic mean is what most people mean when they say ‘the average’. You add up all the values in a dataset and divide by the number of observations. It is appropriate for interval and ratio data that is roughly symmetrical.
Mean (x̄) = Σx ÷ n
where Σx is the sum of all the values and n is the number of observations.
Worked Example: Calculating The Mean Step By Step
Take seven test scores (in marks): 48, 55, 61, 64, 68, 73 and 80.
- Add every value (Σx): 48 + 55 + 61 + 64 + 68 + 73 + 80 = 449.
- Count the observations (n): there are 7 scores.
- Divide the sum by the count: 449 ÷ 7 = 64.14.
- Round sensibly: the mean test score is 64.1 (to one decimal place).
So the average score on this test was 64.1 marks.
Notice three things from this calculation. First, the mean does not have to be a value that actually appears in the dataset – here 64.1 is not one of the seven scores. Second, every observation contributes: change any single score and the mean changes. Third, the mean is expressed in the same units as the original data (marks), which makes it easy to interpret and compare against another group’s average.
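The worked example can be checked in a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the guide.

```python
import statistics

# The seven test scores from the worked example.
scores = [48, 55, 61, 64, 68, 73, 80]

total = sum(scores)        # Σx, the sum of all values
n = len(scores)            # n, the number of observations
mean_by_hand = total / n   # Σx ÷ n

# The standard library should agree with the manual calculation.
mean_stdlib = statistics.mean(scores)

print(total, n)                # 449 7
print(round(mean_by_hand, 2))  # 64.14
print(round(mean_stdlib, 1))   # 64.1
```

Rounding is applied only when printing, so the unrounded mean stays available for any later calculation.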
Why The Arithmetic Mean Is Preferred For Normally Distributed Data
When data follows a normal (bell-shaped) distribution, the mean, median and mode are all approximately equal and sit at the centre of the curve. In this situation the sample mean is the most statistically efficient estimator of the population centre: it uses every data point and is the value that minimises the sum of squared deviations of the data from it. That property is why it underpins parametric methods such as t-tests and ANOVA.
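One way to see the squared-distance property is to try several candidate centres and compare their total squared deviation. The sketch below reuses the seven worked-example scores; the helper `sum_squared_deviations` and the list of candidates are hypothetical names chosen for illustration.

```python
import statistics

# The seven test scores from the worked example.
scores = [48, 55, 61, 64, 68, 73, 80]

def sum_squared_deviations(centre, data):
    """Total squared distance of every data point from a candidate centre."""
    return sum((x - centre) ** 2 for x in data)

mean = statistics.mean(scores)

# Compare the mean against a few other candidate centres.
candidates = [60, 64, mean, 65, 70]
for c in candidates:
    ssd = sum_squared_deviations(c, scores)
    print(f"centre = {c:6.2f}   sum of squared deviations = {ssd:8.2f}")

# The candidate with the smallest total should be the mean itself.
best = min(candidates, key=lambda c: sum_squared_deviations(c, scores))
```

Moving the centre away from the mean in either direction increases the total, which is the sense in which the mean is the least-squares centre of the data.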
The Weighted Mean
A standard arithmetic mean treats every observation as equally important. A weighted mean assigns different weights to different values before averaging, reflecting their relative importance, frequency or sample size.
Weighted mean = Σwx ÷ Σw
where w is the weight attached to each value x.
The table below shows common situations where a weighted mean is the correct choice.
| Scenario | Why A Weighted Mean Is Needed | Example |
|---|---|---|
| Module grades | Modules carry different credit weights | A 60-credit dissertation weighted more than a 20-credit seminar |
| National income averages | Regional populations differ in size | London’s mean income given a higher weight than the Highlands’ |
| Survey responses | Response groups have unequal sizes | Oversampled minority groups re-weighted to population proportions |
| School league tables | Schools have different pupil numbers | Larger schools influence the national average more |
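Suppose a module grade has two components: one marked 71 with a weight of 0.4, and an exam marked 58 with a weight of 0.6.
- Multiply each value by its weight: (71 × 0.4) = 28.4 and (58 × 0.6) = 34.8.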
- Add the weighted values (Σwx): 28.4 + 34.8 = 63.2.
- Divide by the sum of the weights (Σw = 0.4 + 0.6 = 1): 63.2 ÷ 1 = 63.2.
The weighted mean is 63.2. A plain unweighted mean of (71 + 58) ÷ 2 = 64.5 would overstate the grade, because it ignores the fact that the exam counts for more.
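The same calculation can be sketched in Python. The function `weighted_mean` below is a hypothetical helper written for this example, not a standard-library function; it divides by the sum of the weights, so the weights do not have to add up to 1.

```python
import statistics

def weighted_mean(values, weights):
    """Return Σwx ÷ Σw for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Marks and weights from the example: 71 at weight 0.4, exam 58 at weight 0.6.
marks = [71, 58]
weights = [0.4, 0.6]

weighted = weighted_mean(marks, weights)
unweighted = statistics.mean(marks)

print(round(weighted, 1))  # 63.2
print(unweighted)          # 64.5
```

Passing credit values such as `[40, 60]` instead of proportions gives the same result, because the division by Σw normalises the weights.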
The Geometric Mean
Less commonly taught at undergraduate level but worth knowing: the geometric mean is appropriate when the data represents rates, ratios or percentages – especially growth rates compounded over time. Instead of adding the values, you multiply them and take the n-th root.
Quick example: the geometric mean of 4 and 9 is √(4 × 9) = √36 = 6.
The geometric mean is particularly useful in finance (the average return on an investment over several periods) and in biology (population growth rates). When values span several orders of magnitude, or when each period builds multiplicatively on the last, the geometric mean is more representative than the arithmetic mean because a single large figure pulls it upwards far less.
As a worked illustration, an investment that grows by 100% in year one and then falls by 50% in year two ends exactly where it started. The geometric mean of the growth factors, (2 × 0.5)^(1/2) = 1, correctly reports a 0% average return, whereas the arithmetic mean of +100% and −50% would wrongly suggest +25% per year.
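The investment illustration can be reproduced with `statistics.geometric_mean`, available in the Python standard library from version 3.8. This is a sketch only; the variable names are illustrative, and the results are rounded because floating-point arithmetic may not return exactly 1.

```python
import statistics

# Growth factors for the two years: +100% becomes 2.0, -50% becomes 0.5.
growth_factors = [2.0, 0.5]

geo = statistics.geometric_mean(growth_factors)  # n-th root of the product
arith = statistics.mean(growth_factors)          # ordinary average of the factors

# Convert an average growth factor back into a percentage return.
geo_return_pct = (round(geo, 6) - 1) * 100
arith_return_pct = (arith - 1) * 100

print(f"geometric mean:  factor {geo:.2f}, average return {geo_return_pct:.1f}%")
print(f"arithmetic mean: factor {arith:.2f}, average return {arith_return_pct:.1f}%")

# The quick example from the text: the geometric mean of 4 and 9.
quick = statistics.geometric_mean([4, 9])
```

Working with growth factors (1 + rate) rather than raw percentages matters here, because the geometric mean is only defined for positive values.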
Mean vs Median: Which Should You Use?
The mean is the arithmetic average of all values; the median is the middle value when the data is ordered. The crucial difference is sensitivity to extreme values: the mean uses every number, so a single outlier shifts it, whereas the median only depends on the middle position and barely moves.
Take five salaries: £22,000, £24,000, £26,000, £28,000 and £300,000.
- Mean: (22,000 + 24,000 + 26,000 + 28,000 + 300,000) ÷ 5 = 400,000 ÷ 5 = £80,000.
- Median: the middle of the ordered list = £26,000.
Nobody in this group earns near £80,000 – the mean is distorted by one high earner, while the median (£26,000) describes the typical person far better. This is why the Office for National Statistics uses the median, rather than the mean, as its headline measure of average household income.
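The effect of the single high earner is easy to check in Python. The sketch below uses the five salaries from the example and then drops the outlier to show how far each measure moves; the variable names and the £100,000 cut-off are assumptions made purely for illustration.

```python
import statistics

# The five salaries from the example, in pounds.
salaries = [22_000, 24_000, 26_000, 28_000, 300_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
print(f"with outlier:    mean £{mean_salary:,.0f}, median £{median_salary:,.0f}")

# Remove the one very high salary (illustrative cut-off) and recompute.
without_outlier = [s for s in salaries if s < 100_000]
mean_trimmed = statistics.mean(without_outlier)
median_trimmed = statistics.median(without_outlier)
print(f"without outlier: mean £{mean_trimmed:,.0f}, median £{median_trimmed:,.0f}")
```

Removing one value moves the mean from £80,000 to £25,000, while the median shifts only from £26,000 to £25,000.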
| Feature | Mean | Median |
|---|---|---|
| Definition | Sum of values ÷ number of values | Middle value of the ordered data |
| Uses every data point? | Yes | No – only the central position |
| Affected by outliers? | Strongly | Barely |
| Best for | Symmetrical / normal data | Skewed data (income, house prices) |
| Used in | Parametric tests (t-test, ANOVA) | Non-parametric tests |
As a rule of thumb: use the mean for roughly symmetrical data and the median when the distribution is skewed or contains outliers.
The Biggest Weakness Of The Mean
The arithmetic mean is influenced by every data point. That is its strength in normal distributions and its weakness when data is skewed or contains outliers.
| Data Type | Mean Appropriate? | Better Alternative If Not |
|---|---|---|
| Normally distributed interval/ratio | Yes | – |
| Skewed ratio data (income, house prices) | Caution – report alongside the median | Median + IQR |
| Ordinal (Likert items, degree grades) | Debated – see note below | Median + non-parametric tests |
| Nominal (gender, blood type) | Never | Mode + frequency table |
For ordinal data specifically, the debate about whether the mean is appropriate is ongoing, because the gaps between ordered categories are not guaranteed to be equal. The choice of average always depends on your level of measurement.
The Mean And The Measures Of Variability
The mean rarely appears alone in a results section. It is almost always paired with the standard deviation, which tells your reader how spread out the data is around the mean. If the mean is 64 and the standard deviation is 2, the data points cluster tightly; if the standard deviation is 20, they are widely scattered.
Two datasets can share an identical mean yet describe completely different realities: a class where everyone scores around 64 and a class split between 30s and 90s both average 64, but only the standard deviation reveals that difference. This is why examiners expect the mean and a measure of spread to be reported together rather than the mean on its own.
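A short sketch makes the point concrete. The two classes below are hypothetical data invented for this illustration (they do not come from the guide); both are constructed to have a mean of 64, and `statistics.stdev` returns the sample standard deviation with an n − 1 denominator.

```python
import statistics

# Hypothetical scores: one tightly clustered class, one split class.
class_a = [62, 63, 64, 65, 66]
class_b = [30, 35, 64, 93, 98]

mean_a = statistics.mean(class_a)
mean_b = statistics.mean(class_b)

# Sample standard deviation (n - 1 denominator).
sd_a = statistics.stdev(class_a)
sd_b = statistics.stdev(class_b)

print(f"Class A: M = {mean_a:.1f}, SD = {sd_a:.1f}")  # Class A: M = 64.0, SD = 1.6
print(f"Class B: M = {mean_b:.1f}, SD = {sd_b:.1f}")  # Class B: M = 64.0, SD = 31.6
```

The means are identical, so only the standard deviation tells a reader that the second class is far more spread out.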
Reporting The Mean In APA Style
APA 7th edition format:
- In text: M = 64.1, SD = 10.3
- In a table: use Mean (M) and standard deviation (SD) as column headers
- When comparing groups: report M and SD for each group before stating the result of your t-test or ANOVA
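The in-text format can be generated directly from raw data. The helper `apa_mean_sd` below is a hypothetical function written for this sketch, not part of any library; it assumes the sample standard deviation (n − 1 denominator) and one decimal place by default.

```python
import statistics

def apa_mean_sd(values, decimals=1):
    """Format the mean and sample SD as 'M = x.x, SD = y.y'."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 denominator
    return f"M = {m:.{decimals}f}, SD = {sd:.{decimals}f}"

# The seven test scores from the worked example.
scores = [48, 55, 61, 64, 68, 73, 80]

report = apa_mean_sd(scores)
print(report)  # M = 64.1, SD = 10.8
```

The SD here is calculated from the seven worked-example scores, so it need not match the illustrative figures in the format line above. In a written report the symbols M and SD are set in italics, which plain text output cannot show.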