In this course, we will be using Pearson's \(r\) as a measure of the linear relationship between two quantitative variables. In a sample, we use the symbol \(r\). In a population, we use the Greek letter \(\rho\) ("rho"). Pearson's \(r\) can easily be computed using statistical software.
Pearson's \(r\) is a measure of the direction and strength of the relationship between two variables.

Properties of Pearson's \(r\)

The following table may serve as a guideline when evaluating correlation coefficients:
Absolute Value of \(r\) | Strength of the Relationship |
---|---|
0 - 0.2 | Very weak |
0.2 - 0.4 | Weak |
0.4 - 0.6 | Moderate |
0.6 - 0.8 | Strong |
0.8 - 1.0 | Very strong |
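
The guideline table can be read as a simple lookup on the absolute value of \(r\). The following is a minimal Python sketch of that lookup, not something used in this course; the function name `describe_strength` is hypothetical. Because the ranges in the table share their endpoints, the sketch assumes that a boundary value such as 0.4 falls in the stronger category.

```python
def describe_strength(r):
    """Return the guideline label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must be between -1 and 1")

    size = abs(r)  # strength depends on the absolute value; the sign gives direction

    # Assumption: a value on a shared boundary (e.g. 0.4) goes to the stronger category.
    if size < 0.2:
        return "Very weak"
    if size < 0.4:
        return "Weak"
    if size < 0.6:
        return "Moderate"
    if size < 0.8:
        return "Strong"
    return "Very strong"


print(describe_strength(-0.75))  # a negative r is judged by its absolute value
```

These labels are only guidelines; what counts as a strong relationship can depend on the context of the data.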
Exam = 4.154 + 6.661 Quiz
Note that the scales on both the \(x\) and \(y\) axes have changed. In addition to the correlation changing, the \(y\)-intercept changed from 4.154 to 70.84 and the slope changed from 6.661 to 1.632.
Exam = 70.84 + 1.632 Quiz
There are a number of different versions of the formula for computing Pearson's \(r\). You should get the same correlation value regardless of which formula you use. Note that you will not have to compute Pearson's \(r\) by hand in this course. These formulas are presented here to help you understand what the value means. You should always be using technology to compute this value.
First, we'll look at the conceptual formula, which uses \(z\) scores. To use this formula we would first compute the \(z\) score for every \(x\) and \(y\) value. We would then multiply each case's \(z_x\) by its \(z_y\). If a case's \(x\) and \(y\) values are both above their means, this product is positive. If both values are below their means, the product is also positive, because the product of two negative \(z\) scores is positive. If one value is above its mean and the other is below its mean, the product is negative. Think of how this relates to the correlation being positive or negative. The sum of all of these products is divided by \(n-1\) to obtain the correlation.
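Pearson's r: Conceptual Formula

\(r=\frac{\sum{z_x z_y}}{n-1}\)

When we replace \(z_x\) and \(z_y\) with the \(z\) score formulas and move the \(n-1\) to a separate fraction we get the formula in your textbook: \(r=\frac{1}{n-1}\sum{\left(\frac{x-\overline{x}}{s_x}\right)\left(\frac{y-\overline{y}}{s_y}\right)}\)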
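
To make the role of the \(z\) score products concrete, here is a minimal Python sketch of the conceptual formula. It is an illustration only: the function names and the small `hours` and `score` data set are hypothetical and do not come from the course, and the sketch assumes the sample standard deviation (dividing by \(n-1\)) is used for the \(z\) scores.

```python
from statistics import mean, stdev


def z_scores(values):
    """Convert raw values to z scores using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (divides by n - 1)
    return [(v - m) / s for v in values]


def pearson_r_conceptual(x, y):
    """Return (r, products), where products holds z_x * z_y for each case."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with n of at least 2")
    products = [zx * zy for zx, zy in zip(z_scores(x), z_scores(y))]
    r = sum(products) / (len(x) - 1)
    return r, products


# Hypothetical data, made up for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r, products = pearson_r_conceptual(hours, score)
for zz in products:
    # Positive when both values fall on the same side of their means
    print(round(zz, 3))
print("r =", round(r, 3))
```

Cases whose \(x\) and \(y\) values fall on the same side of their means contribute positive products and pull \(r\) upward; cases on opposite sides contribute negative products and pull it downward.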
Again, you will not need to compute \(r\) by hand in this course. This example is meant to show you how \(r\) is computed with the intention of enhancing your understanding of its meaning. In this course, you will always be using Minitab or StatKey to compute correlations.
In this example we have data from a random sample of \(n = 9\) World Campus STAT 200 students from the Spring 2017 semester. WileyPlus scores had a maximum possible value of 100. Midterm exam scores had a maximum possible value of 50. Remember, the \(x\) and \(y\) variables do not need to be on the same metric to compute a correlation.
ID | WileyPlus | Midterm |
---|---|---|
A | 82 | 37 |
B | 100 | 47 |
C | 96 | 33 |
D | 96 | 36 |
E | 80 | 44 |
F | 77 | 35 |
G | 100 | 50 |
H | 100 | 49 |
I | 94 | 45 |
Minitab was used to construct a scatterplot of these two variables. We need to examine the shape of the relationship before determining if Pearson's \(r\) is the appropriate correlation coefficient to use. Pearson's \(r\) can only be used to check for a linear relationship. For this example I am going to call WileyPlus grades the \(x\) variable and midterm exam grades the \(y\) variable because students completed WileyPlus assignments before the midterm exam.
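
Although Minitab or StatKey is used for the calculation in this course, the arithmetic can also be sketched in a few lines of Python. The sketch below applies two versions of the formula to the nine students in the table, with WileyPlus as \(x\) and midterm as \(y\). The function names are hypothetical, and the second version (based on sums of deviations) is one common algebraic rearrangement rather than a formula quoted from this lesson.

```python
from math import sqrt
from statistics import mean, stdev

# Data from the table above (students A through I)
wileyplus = [82, 100, 96, 96, 80, 77, 100, 100, 94]
midterm = [37, 47, 33, 36, 44, 35, 50, 49, 45]


def r_from_z_scores(x, y):
    """Sum of z_x * z_y over all cases, divided by n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations
    total = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return total / (len(x) - 1)


def r_from_deviations(x, y):
    """Algebraically equivalent version built from sums of deviations."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


r_z = r_from_z_scores(wileyplus, midterm)
r_dev = r_from_deviations(wileyplus, midterm)
print(round(r_z, 3), round(r_dev, 3))
```

Both versions give the same value for these data, about 0.485, which the guideline table would describe as a moderate positive relationship. That value is only meaningful if the scatterplot shows the relationship to be linear.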