Pearson Correlation Calculator

Measure the linear relationship strength between two variables. Calculate correlation coefficient, coefficient of determination, and statistical significance.

Last updated: March 2026

Pearson Correlation (r): 0.951190 (Very strong positive)
R² (determination): 0.904762
t-statistic: 7.5498
Sample size (n): 8
Mean X (μₓ): 4.5000
Mean Y (μᵧ): 5.5000
SD X (σₓ): 2.4495

What is Pearson Correlation?

Pearson Correlation Coefficient (r) measures the strength and direction of the linear relationship between two continuous variables. It ranges from -1 to +1, where +1 indicates perfect positive correlation, -1 indicates perfect negative correlation, and 0 indicates no linear relationship.

This is the most widely used correlation measure in statistics. It quantifies how well two variables move together linearly: when one increases, does the other tend to increase (positive) or decrease (negative)? The coefficient is symmetric—corr(X,Y) = corr(Y,X)—and dimensionless, allowing comparison across different measurement scales.

The coefficient of determination (R²) is the square of r and represents the proportion of variance in one variable explained by the other. For example, R² = 0.81 means 81% of Y's variation can be explained by its linear relationship with X. The t-statistic tests whether the correlation is significantly different from zero.

How to Use This Calculator

Step-by-Step Guide

1. Enter X values: Input your first variable's data points, separated by commas, spaces, or newlines. The values do not need to be sorted, but each one must stay in the same position as its matching Y value.
2. Enter Y values: Input your second variable's data points in the same format. X and Y must contain the same number of values, paired by position, with at least 3 paired observations.
3. Automatic calculation: Results update automatically. The calculator computes r, R², t-statistic, and summary statistics for both variables.
4. Interpret results: Check r for strength/direction, R² for explained variance, and t-statistic for significance (larger |t| = more significant).
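The input rules in steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the parsing and pairing logic described above, not the calculator's actual code; the function names `parse_values` and `paired_observations` are hypothetical.

```python
import re

def parse_values(text: str) -> list[float]:
    """Split on commas and/or whitespace (including newlines) and convert to floats."""
    tokens = [tok for tok in re.split(r"[,\s]+", text.strip()) if tok]
    return [float(tok) for tok in tokens]

def paired_observations(x_text: str, y_text: str) -> list[tuple[float, float]]:
    """Pair X and Y by position; reject unequal lengths and fewer than 3 pairs."""
    xs, ys = parse_values(x_text), parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    if len(xs) < 3:
        raise ValueError("at least 3 paired observations are required")
    return list(zip(xs, ys))

# Mixed separators are accepted; values are paired by position.
pairs = paired_observations("1, 2 3\n4", "2,4,5,4")
print(pairs)  # [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 4.0)]
```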

Key Formulas

r = Σ[(xᵢ − μₓ)(yᵢ − μᵧ)] / [(n−1) × σₓ × σᵧ], where σₓ and σᵧ are sample standard deviations (n−1 denominator)
R² = r²
t = r × √[(n−2)/(1−r²)] with df = n−2
Interpretation: |r| ≥ 0.9 = very strong, 0.7-0.9 = strong, 0.5-0.7 = moderate, 0.3-0.5 = weak, <0.3 = very weak
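
The interpretation bands above can be expressed as a small lookup. This is a sketch only: the function name `strength_label` is hypothetical, and it assumes each band includes its lower bound (so r = 0.7 counts as strong), which the listed ranges leave open.

```python
def strength_label(r: float) -> str:
    """Map r to a strength/direction label using the bands listed above."""
    magnitude = abs(r)
    if magnitude >= 0.9:
        strength = "very strong"
    elif magnitude >= 0.7:   # assumption: lower bound of each band is inclusive
        strength = "strong"
    elif magnitude >= 0.5:
        strength = "moderate"
    elif magnitude >= 0.3:
        strength = "weak"
    else:
        strength = "very weak"
    direction = "positive" if r > 0 else "negative" if r < 0 else ""
    return f"{strength} {direction}".strip()

print(strength_label(0.951190))  # very strong positive
print(strength_label(-0.6))      # moderate negative
```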

Example Calculation

Study Hours vs. Test Scores

Data:
X (study hours): [1, 2, 3, 4, 5, 6, 7, 8]
Y (test scores): [2, 4, 5, 4, 5, 7, 8, 9]
n = 8
μₓ = 4.5, μᵧ = 5.5
σₓ ≈ 2.45, σᵧ ≈ 2.33 (sample standard deviations, n−1 denominator)

Covariance = Σ[(xᵢ−μₓ)(yᵢ−μᵧ)]/(n−1) = 38/7 ≈ 5.43
r = Cov/(σₓ × σᵧ) = 5.43/(2.45 × 2.33) ≈ 0.951
R² ≈ 0.905 → 90.5% of variance explained
t ≈ 7.55 with df = 6 → highly significant (p < 0.001, two-tailed)
Interpretation:

There's a very strong positive correlation (r = 0.951) between study hours and test scores. About 90% of the variation in test scores can be explained by its linear relationship with study time. The high t-statistic indicates the relationship is statistically significant: a correlation this large would be very unlikely if the true correlation were zero.
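
As a cross-check, the Key Formulas can be applied to this dataset with Python's standard library. The sketch below is not the calculator's own code, and `pearson_summary` is a hypothetical name; it assumes sample standard deviations (n−1 denominator), which is what `statistics.stdev` computes. The printed values agree with the calculator output at the top of the page.

```python
import math
from statistics import mean, stdev

def pearson_summary(xs, ys):
    """Return r, R² and t for paired data, following the Key Formulas."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample SDs (n-1 denominator)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    r = cov / (sx * sy)
    r2 = r * r
    # t is undefined when |r| == 1 (division by zero), so perfect fits need special handling
    t = r * math.sqrt((n - 2) / (1 - r2))
    return {"r": r, "r2": r2, "t": t, "df": n - 2}

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [2, 4, 5, 4, 5, 7, 8, 9]
result = pearson_summary(hours, scores)
print(f"r = {result['r']:.6f}")    # r = 0.951190
print(f"R² = {result['r2']:.6f}")  # R² = 0.904762
print(f"t = {result['t']:.4f}")    # t = 7.5498
```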

Frequently Asked Questions

What is Pearson correlation?

Measures linear relationship strength between two continuous variables. r ∈ [-1, 1]. r=1 is perfect positive linear correlation, r=-1 is perfect negative, r=0 is no linear relationship. Symmetric: corr(X,Y) = corr(Y,X).

What does r mean?

r is the correlation coefficient. |r|≥0.9 = very strong, 0.7-0.9 = strong, 0.5-0.7 = moderate, 0.3-0.5 = weak, <0.3 = very weak/none. Positive r = both increase together; negative r = inverse relationship.

What is R²?

Coefficient of determination: R² = r². Represents proportion of variance in Y explained by X. R²=0.81 means 81% of Y's variation is explained by linear relationship with X. Ranges from 0 to 1.

What's the t-statistic for?

Tests statistical significance of correlation. t = r√[(n-2)/(1-r²)]. Compared to t-distribution with n-2 degrees of freedom. Larger |t| or smaller p-value indicates more significant correlation (unlikely due to chance).
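
One way to use the t-statistic without computing a p-value is to compare |t| with a critical value from a t table. The sketch below hard-codes two-tailed critical values at α = 0.05 for df 1 to 10 (standard table values); the names `T_CRIT_05` and `is_significant_05` are hypothetical, and a statistics library would be the usual choice for exact p-values.

```python
import math

# Two-tailed critical t values at alpha = 0.05, keyed by degrees of freedom
T_CRIT_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n-2)/(1-r^2)); requires |r| < 1 and n > 2."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def is_significant_05(r: float, n: int) -> bool:
    """True if |t| exceeds the tabulated two-tailed critical value for df = n-2."""
    df = n - 2
    if df not in T_CRIT_05:
        raise ValueError("this sketch only tabulates df from 1 to 10")
    return abs(t_statistic(r, n)) > T_CRIT_05[df]

print(is_significant_05(0.951190, 8))  # True  (t is about 7.55, critical value 2.447)
print(is_significant_05(0.5, 8))       # False (t is about 1.41, critical value 2.447)
```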

Does correlation imply causation?

NO. Correlation only describes linear association. Could be X→Y, Y→X, Z→both, or pure coincidence. Always investigate underlying mechanism before claiming causation. Consider confounding variables and temporal ordering.

What if my data isn't linear?

Pearson r may underestimate relationship strength for non-linear patterns. Solutions: plot a scatterplot first to check linearity, use Spearman rank correlation for monotonic relationships, or fit non-linear models. Always visualize data.
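
To illustrate the difference, here is a minimal sketch comparing Pearson r with Spearman rank correlation on a monotonic but non-linear relationship (y = x³). Spearman is computed as Pearson r on average ranks; the helper names are hypothetical and the data is made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson r from sums of squared deviations and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # perfectly monotonic, but not linear
print(round(pearson(x, y), 3))  # about 0.943: below 1 despite a perfect monotonic link
print(spearman(x, y))           # 1.0
```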

What are the assumptions?

Assumes: (1) continuous variables, (2) linear relationship, (3) approximately bivariate normal distribution (this matters mainly for the t-test, not for computing r itself), (4) no extreme outliers. Violating these can distort r or its significance test. Check scatterplot and residuals to verify assumptions hold.

Can I calculate r from just means and SDs?

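No. Correlation requires covariance: r = Cov(X,Y)/(SDₓ × SDᵧ). Must compute Σ[(xᵢ−μₓ)(yᵢ−μᵧ)] from the paired data. Different datasets with identical means/SDs can have vastly different correlations. (Anscombe's quartet makes a related point: four datasets with nearly identical means, SDs, and r that look completely different when plotted.)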
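
A tiny made-up example shows why the paired data is needed: reversing the order of Y leaves its mean and SD unchanged but flips the correlation from +1 to −1. The helper name `pearson_r` is hypothetical.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """r = Cov(X, Y) / (SDx * SDy), using sample (n-1) statistics."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

x = [1, 2, 3]
y_up = [1, 2, 3]    # same values as y_down, paired in increasing order
y_down = [3, 2, 1]  # same mean and SD, paired in decreasing order

print(mean(y_up) == mean(y_down), stdev(y_up) == stdev(y_down))  # True True
print(pearson_r(x, y_up))    # 1.0
print(pearson_r(x, y_down))  # -1.0
```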
