
In this article, I’ll briefly discuss methods and techniques for multivariate linear regression analysis in RStudio.
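Generally speaking, multivariate linear regression is a statistical technique used to model the relationship between a set of independent variables and a single continuous dependent variable. (Strictly speaking, a model with one dependent variable and several predictors is usually called multiple linear regression; “multivariate” regression refers to models with several dependent variables.)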
In RStudio, you can perform a multivariate linear regression analysis using the lm() function from the stats package, which ships with base R and is loaded by default. The syntax for the lm() function is as follows:
model <- lm(y ~ x1 + x2 + ... + xn, data = data_frame)
To inspect the regression results, we can use the summary() function; to obtain confidence intervals for the coefficients, we can use confint(); and to make predictions from the fitted model, we can use predict().
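As a minimal sketch of these three functions working together, the snippet below fits a model on R's built-in mtcars data. The dataset, the predictors wt and hp, and the two hypothetical cars in new_cars are assumptions chosen only for illustration.

```r
# Illustrative only: mtcars and the predictors wt and hp are assumptions,
# chosen because the dataset ships with R.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table, R-squared, and overall F-test
print(summary(fit))

# 95% confidence intervals for the coefficients
ci <- confint(fit, level = 0.95)
print(ci)

# Predict mpg for two hypothetical cars, with 95% prediction intervals
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred <- predict(fit, newdata = new_cars, interval = "prediction")
print(pred)
```

With interval = "prediction", predict() returns the fitted value together with lower and upper bounds for a new observation; interval = "confidence" would instead give bounds for the mean response.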
It is important to keep in mind that before relying on the regression, you need to make sure that the data meets the assumptions of linear regression: a linear relationship between predictors and outcome, independent errors, constant error variance (homoscedasticity), and normally distributed residuals. You can check these assumptions using various diagnostic plots and statistical tests.
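One possible way to run such checks with base R alone is sketched below. The mtcars model is again an assumption used only for illustration, and the Shapiro-Wilk test is just one of several normality tests you could choose.

```r
# Illustrative model on built-in data (an assumption, not a real analysis)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Four standard diagnostic plots: residuals vs fitted (linearity),
# normal Q-Q (normality), scale-location (homoscedasticity),
# residuals vs leverage (influential points)
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous plotting layout

# Shapiro-Wilk test on the residuals; a small p-value suggests non-normality
sw <- shapiro.test(residuals(fit))
print(sw)
```

Independence is harder to read off a plot and is usually argued from how the data were collected.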
Another of R’s advantages is that it handles categorical predictors automatically: a factor included in the model formula is expanded into dummy variables with no extra definition.
For example, we’ll use the iris dataset that ships with R to predict Sepal.Length from Sepal.Width and Species.
Here is the code:
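library(tidyverse)  # provides the %>% pipe used below
head(iris)          # preview the first rows of the dataset
# Regress Sepal.Length on Sepal.Width and the Species factor
model <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
model %>% summary()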
This is the result of the summary() function:

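According to the result, the model as a whole is statistically significant (F(3,146)=128.9, p<.001). All the variables in the model are significant as well, so the fitted regression equation, where $y$ is the predicted Sepal.Length, is:
$y = 2.25 + 0.80 \cdot \text{Sepal.Width} + 1.46 \cdot \text{Versicolor} + 1.95 \cdot \text{Virginica}$
** Note that Versicolor and Virginica are dummy variables (1 if the flower belongs to that species, 0 otherwise) for 2 of the 3 categories of the variable $Species$. The third category, setosa, is the reference level and is absorbed into the intercept.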
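To make the dummy coding concrete, here is a small sketch that inspects the design matrix R builds for this model and checks the equation by hand against predict(). The versicolor flower with a sepal width of 3.0 is a hypothetical input chosen for illustration.

```r
model <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)

# The first factor level is the reference category (it gets no dummy column)
print(levels(iris$Species))

# Design matrix built behind the scenes: one 0/1 column per non-reference level
X <- model.matrix(model)
print(colnames(X))
print(X[c(1, 51, 101), ])  # one row from each species

# Hypothetical flower: versicolor with a sepal width of 3.0
b <- coef(model)
by_hand <- b[["(Intercept)"]] + b[["Sepal.Width"]] * 3.0 + b[["Speciesversicolor"]]

new_flower <- data.frame(
  Sepal.Width = 3.0,
  Species = factor("versicolor", levels = levels(iris$Species))
)
by_predict <- predict(model, newdata = new_flower)

# The two approaches should agree
print(all.equal(by_hand, unname(by_predict)))
```

If a different reference category is preferred, relevel() on the Species factor before fitting changes which level is absorbed into the intercept.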
The coefficient of determination is also rather high, with a value of 0.7259, meaning that the model explains 72.59% of the variance in the dependent variable (Sepal.Length).
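The figures quoted from the summary can also be pulled out of the fitted object directly. The sketch below is one way to do it, and it recomputes the coefficient of determination by hand as a sanity check; the variable names are my own.

```r
model <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
s <- summary(model)

# Overall F-test: statistic, degrees of freedom, and p-value
f <- s$fstatistic
p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
print(f)
print(p_value)

# R-squared by hand: 1 - (residual sum of squares / total sum of squares)
ss_res <- sum(residuals(model)^2)
ss_tot <- sum((iris$Sepal.Length - mean(iris$Sepal.Length))^2)
r2_by_hand <- 1 - ss_res / ss_tot
print(c(by_hand = r2_by_hand, from_summary = s$r.squared))
```

Note that summary() also reports an adjusted R-squared (s$adj.r.squared), which penalizes the number of predictors and is often the better figure for comparing models of different sizes.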