Question 1:

The table below gives the dependent variable Y against the independent variable X:

  • Find Pearson’s correlation coefficient.
  • Find the regression line for those variables.
  • Based upon what you’ve found, what’s the prediction for x=3?
  • What’s the contribution of the variable “x” to this regression equation?
  • What is the coefficient of determination for this model?
  • Would you say the linear regression in this case is a good fit? Why or why not?

Answer 1:

We’ll use this table for all the calculations:

  • That’s the Pearson correlation calculation:
  • That’s the calculation of the coefficients:
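The two calculations above can be sketched in a few lines of Python. This is only an illustration: the `x` and `y` lists are made-up placeholder values, not the exercise's table, and `fit_line` is a hypothetical helper name. It uses the usual sums of squares, with r = Sxy / sqrt(Sxx·Syy), slope = Sxy / Sxx and intercept = mean(y) − slope·mean(x).

```python
from math import sqrt

# Placeholder data for illustration only (NOT the table from the exercise).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit_line(x, y):
    """Return (r, slope, intercept) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)

    r = sxy / sqrt(sxx * syy)           # Pearson correlation coefficient
    slope = sxy / sxx                   # change in predicted y per unit of x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = fit_line(x, y)
prediction = intercept + slope * 3      # predicted y at x = 3

print(f"r = {r:.4f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"prediction at x=3: {prediction:.3f}")
```

Swapping in the values from the exercise's table should reproduce the numbers used in the rest of this answer.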

And so, the line is:

  • There’s the prediction for x=3:
  • The contribution of the variable “x” is 1.588, the slope of the regression line. That means that for every increase of 1 unit in the value of x, the predicted value of the dependent variable increases by 1.588 units.
  • The coefficient of determination is r^2, which is, in this case, 0.9292^2=0.8634=86.34%

That means that about 86.34% of the variance of the dependent variable is explained by the “x” variable.
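As a quick check of the arithmetic, squaring the Pearson correlation reported above gives the coefficient of determination. The snippet below only uses the value 0.9292 from this answer; the variable names are arbitrary.

```python
# Pearson correlation reported in this answer.
r = 0.9292

# In simple linear regression, the coefficient of determination is r squared.
r_squared = r ** 2

print(f"r^2 = {r_squared:.4f} ({r_squared:.2%})")
```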

  • According to the results, the linear fit between the two variables is quite good: both the Pearson correlation and the coefficient of determination are high. The two are directly related, since the coefficient of determination is the square of the Pearson correlation.

Question 2:

As part of a study that aims to find the relationship between the parents’ average years of education (x) and the years of education of the oldest sibling (y), 80 families were examined. This is the data provided:

  • Is there a significant relationship between X and Y?
  • Erin’s parents studied for 16.5 years on average. According to this model, how many years did Erin study?
  • What’s the coefficient of determination of this model?
  • Is this a good model? Explain your answer.

Answer 2:

  • That is the calculation of the Pearson correlation:

According to the result, the relationship between X and Y is positive but rather weak.
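Whether a weak correlation is still statistically significant can be checked with a t-test on r. The sketch below is one possible way to do it, not necessarily the method intended by the exercise. It assumes r is the positive square root of the 18.1% coefficient of determination reported in this answer, uses n = 80 from the question, and takes 1.99 as an approximate two-sided 5% critical value for 78 degrees of freedom (read from a t table, so treat it as an assumption).

```python
from math import sqrt

n = 80                  # number of families, from the question
r_squared = 0.181       # coefficient of determination reported in this answer
r = sqrt(r_squared)     # positive root, since the relationship is positive

# Test statistic for H0: no linear relationship (rho = 0), with n - 2 df.
t_stat = r * sqrt(n - 2) / sqrt(1 - r_squared)

# Assumed critical value: two-sided, alpha = 0.05, df = 78 (approximate).
t_critical = 1.99

significant = abs(t_stat) > t_critical
print(f"r = {r:.3f}, t = {t_stat:.2f}, significant at 5%: {significant}")
```

Under these assumptions the statistic comes out at about 4.15, well above the critical value, so the relationship would be significant even though it is weak.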

  • To predict Erin’s years of education, we need the regression equation, and to build it we need the means and standard deviations of the two variables.

These are the means and standard deviations:

And these are the regression coefficients:

Now we can set the regression equation:

According to this model, Erin’s predicted education is 14.721 years.
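The steps above (correlation, means and standard deviations, then the regression equation) can be sketched as follows. The summary statistics passed in are placeholders chosen for illustration, not the exercise's data, so the printed prediction will not match 14.721; `regression_from_summary` is a hypothetical helper name. It relies on the standard identities slope = r·(s_y / s_x) and intercept = mean(y) − slope·mean(x).

```python
def regression_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Build the regression line of y on x from summary statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder summary statistics (NOT the values from the exercise).
slope, intercept = regression_from_summary(
    r=0.5, mean_x=12.0, sd_x=2.0, mean_y=13.0, sd_y=3.0
)

# Prediction for parents with 16.5 average years of education.
predicted = intercept + slope * 16.5

print(f"y = {intercept:.3f} + {slope:.3f} * x")
print(f"predicted years of education: {predicted:.3f}")
```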

  • The coefficient of determination is:

According to these results, the model is not a good one: its coefficient of determination is rather low, meaning that only 18.1% of the variance of the dependent variable is explained by the model.