Calculate p-values from z-scores or t-statistics for hypothesis testing and statistical significance.
Last updated: March 2026
A p-value is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. In simpler terms: it tells you how surprising your data would be if nothing but random chance were at work.
P-values range from 0 to 1. A small p-value (typically ≤ 0.05) suggests that the observed data is unlikely under the null hypothesis, providing evidence to reject the null hypothesis. A large p-value (greater than 0.05) suggests the data is consistent with the null hypothesis.
This calculator works with both z-scores (for large samples or known population standard deviation) and t-statistics (for small samples with unknown population standard deviation). It supports two-tailed tests (looking for any difference), left-tailed tests (looking for decreases), and right-tailed tests (looking for increases).
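As a rough illustration of the three tail options for a z-score, here is a minimal Python sketch using only the standard library. The function name `z_p_value` and its `tail` argument are made up for this example and are not the calculator's actual code.

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str = "two") -> float:
    """P-value for a z-score under the standard normal distribution.

    `tail` is a hypothetical parameter: "left", "right", or "two".
    """
    cdf = NormalDist().cdf  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return cdf(z)              # P(Z <= z)
    if tail == "right":
        return cdf(-z)             # P(Z >= z), by symmetry
    if tail == "two":
        return 2.0 * cdf(-abs(z))  # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(z_p_value(1.96, "two"), 4))    # 0.05
print(round(z_p_value(1.96, "right"), 4))  # 0.025
print(round(z_p_value(1.96, "left"), 4))   # 0.975
```

A z-score of 1.96 is the familiar two-tailed cutoff for the 0.05 level, which is why the first line prints 0.05.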
Testing if a new teaching method affects student scores:
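A p-value below 0.05 means that, if the null hypothesis were true, results at least as extreme as yours would occur less than 5% of the time. Most fields treat this as statistically significant. It does not mean there is a 95% probability that the effect is real: the p-value describes how unusual the data are under the null hypothesis, not how likely the hypothesis itself is.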
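Use a z-test when you have a large sample (by a common rule of thumb, n greater than 30) or know the population standard deviation. Use a t-test for small samples (n of 30 or fewer) with unknown population standard deviation. The t-distribution has heavier tails than the normal distribution, which accounts for the additional uncertainty of estimating the standard deviation from a small sample.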
Two-tailed tests check for any difference. One-tailed tests check for a specific direction: right-tailed for greater-than hypotheses, left-tailed for less-than hypotheses. For symmetric distributions such as the normal and t-distributions, the two-tailed p-value is twice the smaller of the two one-tailed p-values.
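The relationship between the tails can be checked numerically. The sketch below uses a hypothetical observed z-score of -2.1 (chosen only for illustration) and the standard normal distribution from Python's standard library.

```python
from statistics import NormalDist

cdf = NormalDist().cdf  # standard normal CDF

z = -2.1  # hypothetical observed z-score, for illustration only

left_tailed = cdf(z)    # P(Z <= z): tests for a decrease
right_tailed = cdf(-z)  # P(Z >= z): tests for an increase
two_tailed = 2.0 * min(left_tailed, right_tailed)

print(round(left_tailed, 4))   # 0.0179
print(round(right_tailed, 4))  # 0.9821
print(round(two_tailed, 4))    # 0.0357
```

Because the observed value is negative, the left-tailed p-value is small while the right-tailed p-value is close to 1. Testing in the wrong direction finds no evidence at all.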
No! A large p-value means you do not have enough evidence to reject the null hypothesis, but it does not prove it is true. It just means your data is consistent with the null hypothesis. Absence of evidence is not evidence of absence.
For a one-sample t-test, the degrees of freedom (df) equal n minus 1 (sample size minus 1). The value represents the number of independent values in your calculation that are free to vary. As the degrees of freedom increase, the t-distribution approaches the normal distribution.
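To see how degrees of freedom change the result, the sketch below computes a two-tailed p-value for a t-statistic by numerically integrating the t-distribution's density with Simpson's rule. This is an illustrative approach using only the standard library, not necessarily how the calculator works; the function names and the `steps` parameter are assumptions made for this example.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_tailed_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central_area


# Same t-statistic, increasing degrees of freedom
for df in (4, 9, 29, 1000):
    print(df, round(t_two_tailed_p(2.0, df), 4))
```

For a fixed t-statistic, the printed p-values shrink as df grows and settle near the value a z-test would give, which reflects the heavier tails of the t-distribution at small sample sizes.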
No! A statistically significant result (small p-value) means the data would be unlikely if there were no effect, but it does not tell you if the effect is large enough to matter in practice. With a large enough sample, even a trivially small effect produces a small p-value. Always consider effect size alongside p-values.
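A short sketch makes the point concrete. It assumes a one-sample z-test in which the true mean differs from the null value by a standardized effect size d, so the expected z-score is d times the square root of n. The effect size of 0.02 and the sample sizes are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def two_tailed_p_for_effect(d: float, n: int) -> float:
    """Two-tailed p-value when the observed z-score equals d * sqrt(n).

    d is a standardized effect size (mean difference / standard deviation).
    """
    z = d * math.sqrt(n)
    return 2.0 * NormalDist().cdf(-abs(z))


# The same tiny effect (d = 0.02) at increasing sample sizes
for n in (100, 10_000, 100_000):
    print(n, two_tailed_p_for_effect(0.02, n))
```

The effect never changes, yet the p-value moves from clearly non-significant at n = 100 to far below 0.05 at n = 100,000. The p-value reflects the sample size as much as the size of the effect.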
This is a borderline case. Some researchers reject the null at p = 0.05, others do not. The 0.05 threshold is arbitrary. Consider the context, effect size, and whether this is exploratory or confirmatory research.
P-hacking is running many analyses and only reporting the significant ones, or stopping data collection as soon as p drops below 0.05. This inflates false positive rates because you are essentially running multiple tests without correction. Pre-registering your analysis plan guards against it.
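The inflation is easy to quantify under simplifying assumptions. The sketch below assumes every null hypothesis is true and the tests are independent, in which case the chance of at least one false positive across k tests is 1 - (1 - alpha)^k. The function name is made up for this example.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests.

    Assumes all null hypotheses are true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 20):
    print(k, round(familywise_error_rate(k), 3))
# 1 0.05
# 5 0.226
# 20 0.642
```

Under these assumptions, running 20 uncorrected tests gives roughly a 64% chance of at least one "significant" result even when no real effect exists.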