Measure the linear relationship strength between two variables. Calculate correlation coefficient, coefficient of determination, and statistical significance.
Last updated: March 2026
Pearson Correlation Coefficient (r) measures the strength and direction of the linear relationship between two continuous variables. It ranges from -1 to +1, where +1 indicates perfect positive correlation, -1 indicates perfect negative correlation, and 0 indicates no linear relationship.
This is the most widely used correlation measure in statistics. It quantifies how well two variables move together linearly: when one increases, does the other tend to increase (positive) or decrease (negative)? The coefficient is symmetric—corr(X,Y) = corr(Y,X)—and dimensionless, allowing comparison across different measurement scales.
The coefficient of determination (R²) is the square of r and represents the proportion of variance in one variable explained by the other. For example, R² = 0.81 means 81% of Y's variation can be explained by its linear relationship with X. The t-statistic tests whether the correlation is significantly different from zero.
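The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's own implementation: the function name `pearson_r` and the five data points are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")  # r = 0.775, R^2 = 0.600

# Symmetric: swapping the variables gives the same value.
r_swapped = pearson_r(y, x)

# Dimensionless: rescaling x (say, hours to minutes) leaves r unchanged.
r_rescaled = pearson_r([60 * v for v in x], y)
```

Here R² = 0.6 reads as "60% of the variation in y is associated with its linear relationship with x".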
Study Hours vs. Test Scores
There is a very strong positive correlation (r = 0.946) between study hours and test scores. Since R² = 0.946² ≈ 0.895, about 90% of the variation in test scores is associated with study time. The high t-statistic indicates that the relationship is statistically significant, that is, unlikely to arise from chance alone.
Measures linear relationship strength between two continuous variables. r ∈ [-1, 1]. r=1 is perfect positive linear correlation, r=-1 is perfect negative, r=0 is no linear relationship. Symmetric: corr(X,Y) = corr(Y,X).
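r is the correlation coefficient. A common rule of thumb for the magnitude |r|: ≥0.9 = very strong, 0.7 to <0.9 = strong, 0.5 to <0.7 = moderate, 0.3 to <0.5 = weak, <0.3 = very weak/none. These cutoffs are conventions and vary by field. Positive r = both variables increase together; negative r = inverse relationship.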
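The rule of thumb above can be written as a small lookup. This is only a sketch: `describe_r` is a hypothetical helper name, and the cutoffs are the ones listed on this page, not a universal standard.

```python
def describe_r(r):
    """Label the strength and direction of a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude >= 0.9:
        strength = "very strong"
    elif magnitude >= 0.7:
        strength = "strong"
    elif magnitude >= 0.5:
        strength = "moderate"
    elif magnitude >= 0.3:
        strength = "weak"
    else:
        strength = "very weak or none"
    # Direction only matters when r is not exactly zero.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"
    return f"{strength}, {direction}"

print(describe_r(0.946))   # very strong, positive
print(describe_r(-0.75))   # strong, negative
print(describe_r(0.2))     # very weak or none, positive
```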
Coefficient of determination: R² = r². Represents proportion of variance in Y explained by X. R²=0.81 means 81% of Y's variation is explained by linear relationship with X. Ranges from 0 to 1.
Tests statistical significance of correlation. t = r√[(n-2)/(1-r²)]. Compared to t-distribution with n-2 degrees of freedom. Larger |t| or smaller p-value indicates more significant correlation (unlikely due to chance).
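As a rough sketch of the t formula above, the helper below (a hypothetical name, `t_statistic`) computes t and the degrees of freedom. The values r = 0.6 and n = 27 are made up so the arithmetic comes out cleanly. Converting t to a p-value needs the t-distribution CDF, which the standard library does not provide; a statistics package is assumed for that step.

```python
import math

def t_statistic(r, n):
    """Return (t, degrees of freedom) for testing whether r differs from 0."""
    if n < 3:
        raise ValueError("need at least 3 observations (n - 2 >= 1)")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    df = n - 2
    if abs(r) == 1.0:
        # Perfect correlation: the denominator 1 - r^2 is zero.
        return math.copysign(math.inf, r), df
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df

# Hypothetical inputs: r = 0.6 from n = 27 paired observations.
t, df = t_statistic(0.6, 27)
print(f"t = {t:.2f} with {df} degrees of freedom")  # t = 3.75 with 25 degrees of freedom
```

The same r gives a larger |t| as n grows, which is why a modest correlation can be significant in a large sample.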
No, correlation does not imply causation. It only describes linear association. The cause could run X→Y or Y→X, a third variable Z could drive both, or the association could be pure coincidence. Always investigate the underlying mechanism before claiming causation, and consider confounding variables and temporal ordering.
Pearson r may underestimate relationship strength for non-linear patterns. Solutions: plot a scatterplot first to check linearity, use Spearman rank correlation for monotonic relationships, or fit non-linear models. Always visualize data.
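To illustrate the point about monotonic but non-linear data, the sketch below computes Spearman's coefficient as the Pearson correlation of the ranks. The helper names and the data (y = x³) are assumptions chosen for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y rises with x at every step, but not along a line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

pearson = pearson_r(x, y)
spearman = spearman_rho(x, y)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```

Because y increases with every step in x, the ranks match exactly and Spearman's coefficient is 1, while Pearson r falls short of 1 because the points do not lie on a straight line.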
Assumes: (1) continuous variables, (2) linear relationship, (3) approximately bivariate normal distribution, (4) no extreme outliers. Violating these can distort r. Check scatterplot and residuals to verify assumptions hold.
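No, r cannot be calculated from means and standard deviations alone. Correlation requires the covariance: r = Cov(X,Y)/(SDₓ × SDᵧ), which means computing Σ[(xᵢ−μₓ)(yᵢ−μᵧ)] from the paired data. Datasets with identical means and SDs can have vastly different correlations. Anscombe's quartet makes a related point: its four datasets share nearly identical means, variances, and r, yet look completely different when plotted.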
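A small sketch of the covariance formula shows why the paired values are needed. The two y-series below are made-up examples containing the same numbers in a different order, so their means and SDs are identical while their correlations with x are opposite.

```python
import math

def mean(values):
    return sum(values) / len(values)

def population_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def covariance(xs, ys):
    """Population covariance: the average product of paired deviations."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def r_from_covariance(xs, ys):
    """r = Cov(X, Y) / (SD_x * SD_y), using population versions throughout."""
    return covariance(xs, ys) / (population_sd(xs) * population_sd(ys))

# Hypothetical data: y_down is y_up reversed, so both have the same mean and SD.
x = [1, 2, 3, 4, 5]
y_up = [1, 2, 3, 4, 5]
y_down = [5, 4, 3, 2, 1]

same_mean = mean(y_up) == mean(y_down)
same_sd = population_sd(y_up) == population_sd(y_down)
r_up = r_from_covariance(x, y_up)
r_down = r_from_covariance(x, y_down)
print(same_mean, same_sd, round(r_up, 6), round(r_down, 6))  # True True 1.0 -1.0
```

Only the pairing of x with y changes between the two cases, and that pairing is exactly the information the covariance term captures.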