Every quantitative research project starts with a description. Before you test hypotheses, run regressions, or interpret p-values, you need to understand your data: what it looks like, where it is centred, how spread out it is, and whether it has any unusual features. That is what descriptive statistics are for.
What Are Descriptive Statistics?
Descriptive statistics summarise and describe the features of a dataset. They do not draw conclusions about a broader population; that is the job of inferential statistics. Instead, they give you, and your reader, a clear picture of what the data actually contains.
| Type | What It Describes | Examples |
|---|---|---|
| Central tendency | Where the data is centred (the typical value) | Mean, median, mode |
| Variability (spread) | How spread out the data is around the centre | Range, IQR, standard deviation, variance |
| Distribution shape | The pattern and symmetry of the distribution | Skewness, kurtosis, normality tests |
| Frequency | How many observations fall in each category | Frequency counts, percentages, cumulative % |
Measures Of Central Tendency
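The three standard measures of central tendency are the mean, the median, and the mode. Which one you report depends on the type of data and the shape of its distribution.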
Mean
The arithmetic average: sum all values and divide by the number of observations. Because it uses every data point, it responds to every value in the dataset and is the most informative summary for normally distributed data. Its weakness is that extreme values (outliers) pull it away from where most data is concentrated.
Example: Graduate salaries in a sample of 10 University of Manchester alumni: £24k, £26k, £27k, £28k, £29k, £31k, £32k, £35k, £38k, £95k. Mean = £36,500. Median = £30,000. The single high earner pulls the mean £6,500 above the median, which is why median salary is the more honest representation here.
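The figures in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library's `statistics` module; the variable names are illustrative, and the data are the ten salaries quoted above, expressed in thousands of pounds.

```python
from statistics import mean, median

# The ten graduate salaries from the example, in thousands of pounds
salaries_k = [24, 26, 27, 28, 29, 31, 32, 35, 38, 95]

mean_k = mean(salaries_k)      # sum of all values divided by the count
median_k = median(salaries_k)  # average of the two middle values (29 and 31)
gap_k = mean_k - median_k      # how far the outlier drags the mean upward

print(f"Mean   = £{mean_k * 1000:,.0f}")    # Mean   = £36,500
print(f"Median = £{median_k * 1000:,.0f}")  # Median = £30,000
print(f"Gap    = £{gap_k * 1000:,.0f}")     # Gap    = £6,500
```

Removing the £95k salary from the list brings the mean down to £30,000, while the median only moves to £29,000, which shows how much of the mean depends on that single value.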
Median
The middle value when all observations are sorted from lowest to highest. Resistant to outliers, which makes it the preferred measure for skewed distributions. The ONS uses median earnings rather than mean earnings in official UK labour market statistics for exactly this reason.
Mode
The most frequently occurring value or category. The only appropriate measure of central tendency for nominal data. Useful for any data where you want to know the most common response. A dataset can have two modes (bimodal) or more (multimodal), which is itself informative: it often signals two distinct subgroups in the data.
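As a rough illustration, the standard library can find a single mode or several. The two small datasets below are hypothetical and exist only for this sketch; `multimode` requires Python 3.8 or later.

```python
from statistics import mode, multimode

# Hypothetical nominal data: how survey respondents travel to campus
transport = ["bus", "car", "bus", "train", "car", "bus", "bike"]
most_common = mode(transport)  # the single most frequent category
print(most_common)             # bus

# Hypothetical ratings in which two values are equally common (bimodal)
ratings = [2, 2, 2, 3, 4, 4, 4, 5]
modes = multimode(ratings)     # every value tied for the highest frequency
print(modes)                   # [2, 4]
```

Two modes, as in the ratings above, are worth investigating: they may point to two subgroups that respond differently.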
| Data Type | Best Central Tendency Measure | Why? |
|---|---|---|
| Nominal | Mode | Only measure that does not require ordering or equal intervals |
| Ordinal | Median | Resistant to unequal gaps between ranks |
| Interval (symmetric) | Mean | Uses all information; appropriate when intervals are equal |
| Interval / Ratio (skewed) | Median (report both) | Outliers distort the mean; median is more representative |
| Ratio (symmetric) | Mean | Uses all information; ratio statements add interpretive value |
Measures Of Variability
The centre alone tells you very little. A mean of 50 on a test where everyone scores between 48 and 52 tells a completely different story from a mean of 50 where scores range from 10 to 90.
Range
Maximum minus minimum. Simple but easily distorted by a single extreme value. Rarely the only measure of spread you should report.
Interquartile Range (IQR)
The range of the middle 50% of data, calculated as Q3 minus Q1. Resistant to outliers. Report alongside the median when data is skewed or non-normal.
Standard Deviation (SD)
The most widely used measure of spread for interval and ratio data. It tells you roughly how far data points typically fall from the mean; formally, it is the square root of the average squared deviation from the mean. A small SD means tight clustering; a large SD means wide spread. Report alongside the mean.
Variance
The square of the standard deviation. Used in many statistical calculations (notably ANOVA) but less intuitive to report on its own because it is in squared units.
| Measure | Appropriate For | Resistant to Outliers | Common Pairing |
|---|---|---|---|
| Range | Quick summary only | No | Often supplemented with IQR |
| IQR | Ordinal or skewed ratio data | Yes | Reported with median |
| Standard deviation | Interval / ratio data, normal distribution | No | Reported with mean |
| Variance | Statistical computations | No | Rarely reported standalone |
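The "Resistant to Outliers" column can be seen in practice by computing all four measures with and without an extreme value. The sketch below reuses the ten salaries from the earlier example (in thousands of pounds); the helper name `spread` is made up for this illustration. It assumes the sample (n - 1) formulas and the default quartile method of Python's `statistics.quantiles`. Quartile conventions differ between packages, so the IQR here may not match SPSS or Excel exactly.

```python
from statistics import quantiles, stdev, variance

def spread(values):
    """Return range, IQR, sample SD and sample variance for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": stdev(values),           # sample SD (divides by n - 1)
        "variance": variance(values),  # sample variance, i.e. SD squared
    }

salaries_k = [24, 26, 27, 28, 29, 31, 32, 35, 38, 95]
with_outlier = spread(salaries_k)
without_outlier = spread(salaries_k[:-1])  # drop the £95k salary

# Print each measure before and after removing the outlier
for name in ("range", "iqr", "sd", "variance"):
    print(f"{name:>8}: {with_outlier[name]:8.2f} -> {without_outlier[name]:8.2f}")
```

With these data the range falls from 71 to 14 and the SD shrinks to less than a quarter of its original size once the outlier is removed, while the IQR only moves from 9 to 7 under this quartile method.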
Distribution Shape: Skewness & Kurtosis
Beyond centre and spread, the shape of your distribution matters, particularly for deciding whether parametric tests are appropriate.
- Positive skew (right skew): long tail to the right. Mean > Median. Common in income, response times, NHS waiting times.
- Negative skew (left skew): long tail to the left. Median > Mean. Less common; can appear with test scores on easy tests.
- Kurtosis: describes how much weight sits in the tails of a distribution compared with a normal distribution (often loosely described as ‘peakedness’). High kurtosis means heavier tails (more extreme values).
In SPSS, skewness and kurtosis values are produced by Analyze > Descriptive Statistics > Explore. As a rough rule, skewness values between -1 and +1 are generally treated as close enough to normal for parametric tests. Values beyond ±2 are more concerning.
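If you want to see what lies behind these numbers, skewness and kurtosis can be computed by hand. The sketch below uses the common bias-adjusted sample formulas; that these match the values your statistics package reports is an assumption here, so small differences are possible. The function names are illustrative, and the data are the ten salaries from the earlier example (in thousands of pounds).

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Bias-adjusted sample skewness (needs at least 3 values)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z3 = sum(((x - m) / s) ** 3 for x in values)  # sum of cubed z-scores
    return n / ((n - 1) * (n - 2)) * z3

def sample_excess_kurtosis(values):
    """Bias-adjusted excess kurtosis (needs at least 4 values); about 0 for normal data."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z4 = sum(((x - m) / s) ** 4 for x in values)  # sum of z-scores to the 4th power
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * z4 - correction

salaries_k = [24, 26, 27, 28, 29, 31, 32, 35, 38, 95]
skew = sample_skewness(salaries_k)
kurt = sample_excess_kurtosis(salaries_k)
print(f"Skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

For the salary data the skewness comes out well above +2, which the rule of thumb above would flag as a clear positive skew, and the excess kurtosis is positive, reflecting the heavy right tail created by the single high earner.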
Student Tip: Normality tests like Shapiro-Wilk and Kolmogorov-Smirnov are available in SPSS, but with large samples they almost always reject normality even when the violation is trivial. For samples above n=50, look at histograms and skewness values alongside normality tests; do not rely solely on the p-value from a normality test.
Reporting Descriptive Statistics In APA Format
APA 7th edition (used in most UK universities) has specific conventions:
- Mean and standard deviation: M = 42.3, SD = 8.1
- Median: Mdn = 38.0
- Range: Range = 12–68
- IQR: IQR = 22.5
- For frequency data: ‘Of the 120 participants, 48 (40%) were male, 66 (55%) were female, and 6 (5%) identified as non-binary.’
Always include a descriptive statistics table in your results section when you have multiple variables. It gives context for all the inferential tests that follow.
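If you compute your statistics in code, small helpers can keep the reporting format consistent. This is a sketch only: the function names are hypothetical, the one-decimal default simply mirrors the examples above, and the salary data (in thousands of pounds) are reused from earlier in this guide.

```python
from statistics import mean, stdev

def apa_mean_sd(values, decimals=1):
    """Format a mean and sample SD as 'M = x, SD = y'."""
    return f"M = {mean(values):.{decimals}f}, SD = {stdev(values):.{decimals}f}"

def apa_count_pct(count, total):
    """Format a frequency as 'count (percent%)' with a whole-number percentage."""
    return f"{count} ({count / total:.0%})"

salaries_k = [24, 26, 27, 28, 29, 31, 32, 35, 38, 95]
print(apa_mean_sd(salaries_k))  # M = 36.5, SD = 21.0

total = 120
print(
    f"Of the {total} participants, {apa_count_pct(48, total)} were male, "
    f"{apa_count_pct(66, total)} were female, and "
    f"{apa_count_pct(6, total)} identified as non-binary."
)
```

Plain-text output cannot show italics, so remember to italicise statistical symbols such as M and SD when you paste the values into your manuscript.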
Frequently Asked Questions
What is the main purpose of descriptive statistics?
The main purpose of descriptive statistics is to summarise and present data in a meaningful way. They help you understand the central tendency, dispersion, and shape of a distribution, which makes complex datasets easier to interpret and provides a basis for further analysis and decision-making.
Should I report descriptive statistics alongside inferential tests?
Always. Descriptive statistics, especially means, standard deviations, and sample sizes per group, are essential context for interpreting inferential results. Many journals require descriptive statistics to be reported alongside every inferential test.
How do outliers affect descriptive statistics?
Outliers primarily affect the mean and standard deviation, pulling them toward extreme values. The median and IQR are resistant. When you identify outliers, report both sets of statistics, note the discrepancy, and explain how you handled the outliers in your analysis.
What does it mean if my data is not normally distributed?
It affects your test choices. Substantially skewed or non-normal data typically calls for non-parametric tests (Mann-Whitney U instead of t-test; Spearman’s instead of Pearson’s), or data transformation (log transform) before parametric tests. It does not mean your data is flawed; many real-world variables are naturally non-normal.