What Multiple Linear Regression (MLR) Means
Multiple linear regression (MLR) is a statistical technique that models the relationship between two or more independent variables and a single continuous dependent variable by fitting a linear equation to the observed data. It extends simple linear regression—which uses just one predictor—so that several explanatory variables can be used together to predict, and explain variation in, one outcome.
The general form of the model is y = b0 + b1x1 + b2x2 + … + bkxk + e, where y is the dependent variable, x1 … xk are the independent (predictor) variables, b0 is the intercept, each b is a regression coefficient, and e is the error term. Each coefficient tells you how much y is expected to change for a one-unit increase in that predictor, while all the other predictors are held constant.
Simple vs multiple linear regression
- One predictor: y = b₀ + b₁x
- Several predictors: y = b₀ + b₁x₁ + … + bₖxₖ
So if you have three independent variables and one dependent variable, you simply extend the equation to y = b0 + b1x1 + b2x2 + b3x3. The number of predictors (often written as k) is flexible; the model stays “linear” because it is linear in the coefficients (each b simply multiplies its predictor and the terms are added together), not because every predictor must move in a straight line on its own.
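As a rough illustration of the general equation, the sketch below evaluates y = b0 + b1x1 + … + bkxk for any number of predictors and shows that raising one predictor by one unit, with the others unchanged, shifts the prediction by exactly that predictor's coefficient. The `predict` helper and all coefficient values are hypothetical and chosen only for illustration.

```python
def predict(intercept, coefficients, values):
    """Return b0 + b1*x1 + ... + bk*xk for a single observation."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one value per coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical coefficients for a model with k = 3 predictors.
b0 = 2.0
b = [0.5, -1.0, 3.0]

base = predict(b0, b, [10, 4, 2])    # 2 + 5 - 4 + 6
bumped = predict(b0, b, [11, 4, 2])  # x1 raised by one unit, x2 and x3 unchanged

print(base)           # prediction for the original values
print(bumped - base)  # equals b1, the coefficient of x1
```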
“Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable.” — Investopedia, “Multiple Linear Regression (MLR)”
Difference Between Simple and Multiple Regression
The core difference is the number of independent variables. Simple linear regression uses one predictor and fits a straight line; multiple linear regression uses two or more predictors and fits a plane (with two predictors) or a hyperplane (with more). The table below summarises the key distinctions.
| Feature | Simple Linear Regression | Multiple Linear Regression |
|---|---|---|
| Independent variables | One (x) | Two or more (x1, x2, … xk) |
| Dependent variables | One (continuous) | One (continuous) |
| Equation | y = b0 + b1x + e | y = b0 + b1x1 + … + bkxk + e |
| Geometry of fit | A best-fit line | A best-fit plane / hyperplane |
| Coefficient meaning | Effect of x on y | Effect of each x on y, holding the others constant |
| Goodness of fit | R² | R² and adjusted R² |
| Extra concern | — | Multicollinearity between predictors |
If you are still deciding which method your data calls for, our guide on which statistical test you should use walks through the decision step by step.
Examples of Multiple Linear Regression
In each example below, note which variables predict (independent) and which is predicted (dependent):
- Predicting a person’s weight (dependent variable) from their daily calorie intake and hours of exercise per week (two independent variables).
- Predicting mental well-being (dependent variable) from monthly income and quality of work environment (two independent variables) for a particular group of people.
- Predicting crop yield (dependent variable) from rainfall, temperature and fertiliser use (three independent variables).
In every case, MLR estimates how each predictor relates to the outcome while accounting for the others—which is exactly what makes it more powerful than running several separate simple regressions. For a gentle conceptual lead-in to the whole topic, see our beginner’s guide to regression analysis.
Assumptions (Pre-Suppositions) of MLR
Multiple linear regression rests on a set of assumptions. If they are badly violated, the coefficients and p-values can be misleading. Your data should meet the following criteria:
- Linearity: the relationship between each independent variable and the dependent variable is linear.
- Independence of observations: the observations (and therefore the residuals) are independent of one another. This calls for sound data collection with no hidden links between cases, such as repeated measurements on the same person.
- Homoscedasticity (homogeneity of variance): the variance of the residuals is roughly constant across all fitted values—the spread of errors does not fan out or shrink as predictions get larger.
- Normality of residuals: the errors (residuals) are approximately normally distributed. It is the residuals that need to be normal, not the raw predictors.
- No (or low) multicollinearity: the independent variables should not be too strongly correlated with one another. When two predictors move together—such as rainfall and humidity—it becomes hard to separate their individual effects, and the coefficients become unstable.
You will normally also want two or more independent variables, each of which may be continuous (an interval or ratio variable) or categorical (entered as dummy/indicator variables, for example gender coded as 0 and 1).
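A categorical predictor has to be turned into 0/1 indicator columns before it can enter the equation. The sketch below shows one way to do that by hand; the `dummy_code` helper and the soil-type data are hypothetical. One level is left out as the reference category, because including an indicator for every level alongside the intercept would make the predictors perfectly collinear.

```python
def dummy_code(values, reference):
    """Turn a categorical column into 0/1 indicator columns.

    The reference level gets no column of its own: it is the case
    where every indicator equals 0.
    """
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical categorical predictor with three levels.
soil = ["clay", "sand", "loam", "sand", "clay"]
indicators = dummy_code(soil, reference="clay")

for level, column in indicators.items():
    print(level, column)
```

Each indicator's coefficient is then read as the expected difference in y between that level and the reference level, holding the other predictors constant.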
The Multiple Linear Regression Formula
The estimated multiple linear regression equation is written as:
ŷ = b0 + b1x1 + b2x2 + … + bkxk
Where:
- ŷ = the predicted value of the dependent variable
- b0 = the intercept—the predicted value of y when every independent variable equals 0 (the point where the regression surface crosses the y-axis)
- x1, x2, … xk = the values of the independent (predictor) variables
- b1, b2, … bk = the regression (slope) coefficients—the amount by which y is predicted to change for a one-unit increase in that predictor, holding all other predictors constant
- e = the error term, the part of y the model does not explain (its estimate for each observation is the residual, y − ŷ)
The full population model includes the error term (y = b0 + b1x1 + … + bkxk + e); the fitted equation above drops it because ŷ is the model’s prediction, not the observed value. How far the observed values typically fall from those predictions is summarised by the standard error of the estimate.
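For reference, the residuals and the standard error of the estimate can be written out as follows. The symbols n (number of observations), k (number of predictors) and s (the standard error of the estimate) are notation introduced here and are not used elsewhere in this article.

```latex
e_i = y_i - \hat{y}_i,
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} e_i^{2}}{n - k - 1}}
```

The divisor n − k − 1 reflects the k slopes and one intercept that were estimated from the data.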
How to Calculate Multiple Regression (Step by Step)
For a model with two predictors, the coefficients can be calculated by hand from a small set of sums. Once your data is collected and sorted, follow these steps:
- Set up columns for y, x1 and x2 and enter the values for each observation.
- Work in deviation (mean-centred) form: compute the means ȳ, x̄1 and x̄2, then subtract each variable’s mean from its values. In the next four steps, y, x1 and x2 refer to these deviations.
- Square each predictor to get x1² and x2².
- Cross-multiply each predictor with y to get x1y and x2y.
- Multiply the two predictors together to get x1x2.
- Sum each column to obtain the sums of squares and cross-products: Σx1², Σx2², Σx1y, Σx2y and Σx1x2.
- Calculate the slope coefficients using:
  - b1 = [(Σx2²)(Σx1y) − (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]
  - b2 = [(Σx1²)(Σx2y) − (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]
- Calculate the intercept using the means of the original (uncentred) variables: b0 = ȳ − b1x̄1 − b2x̄2.
- Assemble the equation: ŷ = b0 + b1x1 + b2x2.
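The hand calculation above can be sketched in a few lines of Python. The function name `fit_two_predictors` and the data are hypothetical; the data are generated without noise from y = 1 + 2x1 + 3x2 so that the recovered coefficients are easy to check.

```python
def fit_two_predictors(y, x1, x2):
    """Least-squares b0, b1, b2 from deviation-form sums (two predictors only)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n

    # Deviations from the mean for every variable.
    dy = [v - my for v in y]
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]

    # Sums of squares and cross-products in deviation form.
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        raise ValueError("predictors are perfectly collinear")

    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    b0 = my - b1 * m1 - b2 * m2  # intercept from the original means
    return b0, b1, b2


# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the data contain no noise, the printed coefficients should match the generating values of 1, 2 and 3 up to floating-point rounding.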
By hand, this is only practical for two predictors and a handful of cases. For real datasets, tools such as Excel, SPSS, R and Python solve the equivalent matrix problem, b = (XᵀX)⁻¹Xᵀy (typically with a numerically more stable method than forming the inverse explicitly), which is why almost everyone runs MLR in software rather than on paper.
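To make the matrix form concrete, here is a minimal standard-library sketch that builds XᵀX and Xᵀy and solves (XᵀX)b = Xᵀy by Gaussian elimination. It is an illustration under stated assumptions, not how any particular package implements regression; the helper names `solve` and `fit_ols` and the noise-free data are hypothetical.

```python
def solve(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfectly collinear predictors)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] by solving (X'X) b = X'y.

    X is the matrix of predictor rows with a leading column of 1s
    for the intercept.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2 - 1*x3.
rows = [(1, 2, 0), (2, 1, 1), (3, 4, 0), (4, 3, 2), (5, 6, 1), (6, 2, 3)]
y = [1 + 2 * a + 3 * b - 1 * c for a, b, c in rows]

coefs = fit_ols(rows, y)
print([round(v, 6) for v in coefs])
```

Solving the linear system directly avoids computing the inverse of XᵀX, which is the same idea, in miniature, as the more robust routines used by statistical software.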
Worked Example with a Coefficients Table
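Suppose a model that predicts final exam score (y) from hours studied (x1) and classes attended (x2) has been fitted, giving the equation:
ŷ = 30 + 4.2x1 + 1.5x2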
The coefficients table looks like this:
| Term | Coefficient (b) | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept (b0) | 30.0 | 5.10 | 5.88 | < 0.001 |
| Hours studied (b1) | 4.2 | 0.70 | 6.00 | < 0.001 |
| Classes attended (b2) | 1.5 | 0.55 | 2.73 | 0.012 |
How to read it: Holding classes attended constant, each additional hour of study is associated with a 4.2-point increase in exam score. Holding hours studied constant, each extra class attended is associated with a 1.5-point increase. The intercept of 30 is the predicted score for a student with zero study hours and zero classes attended. Both predictors are statistically significant (p < 0.05).
Making a prediction: for a student who studied 10 hours and attended 12 classes:
ŷ = 30 + 4.2(10) + 1.5(12) = 30 + 42 + 18 = 90. The model predicts a final score of about 90.
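The arithmetic in this worked example is easy to check in code. The short sketch below recomputes each t-value as the coefficient divided by its standard error and then repeats the prediction; the variable and function names are illustrative only.

```python
# Coefficients and standard errors copied from the coefficients table.
table = {
    "Intercept": (30.0, 5.10),
    "Hours studied": (4.2, 0.70),
    "Classes attended": (1.5, 0.55),
}

# Each t-value is the coefficient divided by its standard error.
t_values = {term: round(b / se, 2) for term, (b, se) in table.items()}
print(t_values)


def predict_score(hours, classes):
    """Apply the fitted equation y-hat = 30 + 4.2*x1 + 1.5*x2."""
    return 30.0 + 4.2 * hours + 1.5 * classes


# Student who studied 10 hours and attended 12 classes.
print(round(predict_score(10, 12), 1))
```

The t-values come out as 5.88, 6.0 and 2.73, matching the table, and the prediction is 90.0.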
Interpreting Coefficients, R² and Adjusted R²
The single most important idea when reading an MLR model is the phrase “holding the other variables constant” (sometimes called “ceteris paribus” or the variable’s partial effect). Each slope coefficient measures the effect of its own predictor on y after the influence of all the other predictors has been accounted for. That is why a predictor’s coefficient in an MLR model can differ—sometimes dramatically—from its coefficient in a simple regression that ignores the others.
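A small made-up dataset shows how far the two kinds of coefficient can diverge. In the sketch below, y is generated without noise as 1 + 2x1 + 3x2, and x2 tends to rise with x1. Regressing y on x1 alone credits x1 with part of x2's effect, while the two-predictor model recovers the partial slope. All names and values are hypothetical.

```python
def centred(v):
    """Subtract the mean from every value."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def sp(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]  # tends to rise with x1
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

d1, d2, dy = centred(x1), centred(x2), centred(y)

# Simple regression of y on x1 alone: slope = S(x1, y) / S(x1, x1).
simple_slope = sp(d1, dy) / sp(d1, d1)

# Slope of x1 with x2 also in the model (two-predictor least-squares formula).
denom = sp(d1, d1) * sp(d2, d2) - sp(d1, d2) ** 2
partial_slope = (sp(d2, d2) * sp(d1, dy) - sp(d1, d2) * sp(d2, dy)) / denom

print(round(simple_slope, 6), round(partial_slope, 6))
```

Here the simple-regression slope is about 5, whereas the partial slope is about 2, the value used to generate the data.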
To judge how well the whole model fits, two related measures are used:
- R² (coefficient of determination): the proportion of the variation in y that is explained by the predictors together, ranging from 0 to 1. An R² of 0.72 means the model explains 72% of the variation in the outcome. A known limitation is that R² always increases (or stays the same) when you add another predictor—even a useless one.
- Adjusted R²: a corrected version that penalises the model for each extra predictor, so it only rises when a new variable improves the model more than chance would predict. Because of this, adjusted R² is the fairer measure when comparing models with different numbers of predictors. It is never higher than R², and it can even turn negative for a very weak model.
Always read the coefficients alongside their standard errors and p-values: a large coefficient with a high p-value may simply be noise, while the standard error tells you how precisely each coefficient has been estimated.
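Both fit measures are short formulas. The sketch below computes R² from observed and predicted values and then applies the usual adjustment, 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors. The function names and the sample size of 50 are assumptions made for illustration.

```python
def r_squared(y, y_hat):
    """Proportion of the variation in y explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalise R² for the number of predictors k, given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# The R² of 0.72 from the text, with a hypothetical n = 50 observations.
adj_3 = adjusted_r_squared(0.72, n=50, k=3)
adj_10 = adjusted_r_squared(0.72, n=50, k=10)
print(round(adj_3, 4), round(adj_10, 4))
```

With the same R² of 0.72, the adjusted value falls from about 0.70 with three predictors to about 0.65 with ten, which is the penalty for extra predictors at work.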
Multicollinearity
Multicollinearity occurs when two or more independent variables are highly correlated with each other. Because the predictors carry overlapping information, the model struggles to assign credit to each one, and the result is unstable, imprecise coefficients with inflated standard errors—estimates can swing wildly (and even flip sign) when a variable is added or removed. The overall R² and the model’s predictions may still be fine; it is the individual coefficient interpretations that become unreliable.
Common ways to detect and handle it:
- Correlation matrix: check pairwise correlations between predictors; very high values (e.g. above 0.8–0.9) are a warning sign.
- Variance Inflation Factor (VIF): the standard diagnostic. A common rule of thumb flags a VIF above 5 (and certainly above 10) as problematic.
- Remedies: drop one of the redundant predictors, combine related predictors into a single index, or collect more data. As a rule, when two predictors are essentially measuring the same thing, keep only one.
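For a model with exactly two predictors, both diagnostics can be computed by hand, because each predictor's VIF is 1 / (1 − r²), where r is the correlation between the two. The sketch below is limited to that two-predictor case (with more predictors, each VIF requires regressing one predictor on all the others). The function names and the rainfall and humidity figures are made up.

```python
def pearson_r(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R²); with two predictors, R² is their squared correlation."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Made-up values for two predictors that move closely together.
rainfall = [50, 62, 71, 80, 95, 110]
humidity = [40, 47, 52, 60, 66, 79]

r = pearson_r(rainfall, humidity)
vif = vif_two_predictors(rainfall, humidity)
print(round(r, 3), round(vif, 1))
```

With these figures the correlation is well above 0.9 and the VIF is far beyond the rule-of-thumb threshold of 10, so keeping both predictors would give unstable coefficients.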