Hypothesis Testing Calculator

Conduct z-tests and t-tests to evaluate claims about population parameters using sample data.

HYPOTHESIS TEST DECISION
Fail to reject H₀: p-value (0.067889) ≥ α (0.05), not significant.
Test Statistic: 1.8257
p-value: 0.067889
α Level: 0.0500

What is Hypothesis Testing?

Hypothesis testing is a statistical procedure to evaluate claims about populations using sample data. You start with a null hypothesis (H₀) representing no effect, test it against an alternative hypothesis (H₁), and decide whether evidence supports rejecting the null.

  • Null Hypothesis (H₀): Assumes no difference or effect (e.g., μ = 100)
  • Alternative Hypothesis (H₁): The claim being tested (e.g., μ ≠ 100)
  • Test Statistic: Measures how far sample data deviates from H₀ under the assumed distribution
  • p-value: Probability of observing a test statistic at least as extreme as the one obtained, assuming H₀ is true
  • Significance Level (α): Threshold for rejecting H₀; if p < α, reject H₀
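As a minimal sketch of how these pieces fit together, the Python below computes a test statistic, its two-tailed p-value, and the resulting decision. It assumes a one-sample z-test of a mean with a known population SD. The function name `z_test_two_tailed`, its parameters, and the input values are illustrative and are not taken from the calculator's own code.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-sample z-test of a mean (hypothetical helper; known sigma assumed)."""
    # Test statistic: distance of the sample mean from mu0, in standard errors
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-tailed p-value: probability of a statistic at least as extreme as |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Decision rule: reject H0 only when p < alpha
    decision = "Reject H0" if p_value < alpha else "Fail to reject H0"
    return z, p_value, decision

# Illustrative inputs: sample mean 105, H0 mean 100, SD 15, n = 30
z, p, decision = z_test_two_tailed(105, 100, 15, 30)
print(f"z = {z:.4f}, p = {p:.4f}, {decision}")
```

The decision is kept as a plain comparison of p against α so that the threshold stays an explicit input rather than a hidden constant.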

How to Use This Calculator

  1. Select Test Type: Choose Z-test (mean/proportion) or T-test based on whether population SD is known
  2. Choose Tail Type: Two-tailed for μ ≠ μ₀, left-tailed for μ < μ₀, right-tailed for μ > μ₀
  3. Set α: Typically 0.05 (5% significance level)
  4. Enter Data: Sample statistics from your data (mean, proportion, SD, size)
  5. Specify H₀: Enter the hypothesized population parameter
  6. Review Results: Compare p-value to α to make your decision
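The steps above can be sketched for the proportion case as follows. This is an assumption about how such a calculator might work, using the standard one-sample z-test for a proportion with the standard error computed under H₀. The helper names `p_value_from_z` and `proportion_z_test` and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """Convert a z statistic to a p-value for the chosen tail type."""
    cdf = NormalDist().cdf
    if tail == "left":       # H1: parameter < hypothesized value
        return cdf(z)
    if tail == "right":      # H1: parameter > hypothesized value
        return 1 - cdf(z)
    if tail == "two":        # H1: parameter != hypothesized value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'two', 'left', or 'right'")

def proportion_z_test(successes, n, p0, tail="two", alpha=0.05):
    """One-sample z-test for a proportion (hypothetical helper)."""
    p_hat = successes / n                 # sample proportion
    se = sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se
    p_value = p_value_from_z(z, tail)
    return z, p_value, p_value < alpha    # True means "reject H0"

# Illustrative data: 60 successes in 100 trials, H0: p = 0.5, right-tailed
z, p, reject = proportion_z_test(successes=60, n=100, p0=0.5, tail="right")
print(f"z = {z:.3f}, p = {p:.4f}, reject H0: {reject}")
```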

Example: Testing a New Treatment

A pharmaceutical company claims a new medication has an average effect score of 100. A clinical trial tests 30 patients and finds a sample mean effect of 105 with a sample SD of 15. Because the SD is estimated from the sample, a t-test is used.

H₀: μ = 100 (no improvement over baseline)
H₁: μ ≠ 100 (two-tailed; improvement or deterioration)
α = 0.05, n = 30, x̄ = 105, s = 15 (sample SD), df = n - 1 = 29
t-statistic = (105 - 100) / (15/√30) ≈ 1.826
p-value ≈ 0.078
Decision: p (0.078) > α (0.05) → Fail to reject H₀

There is insufficient evidence to conclude the medication differs from the baseline effect of 100 at the 0.05 significance level.
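The worked example can be reproduced with the sketch below. It uses only the Python standard library, so the t-distribution tail area is approximated by numerically integrating the density with Simpson's rule. That integration approach and the helper names `t_pdf` and `t_two_tailed_p` are choices made for this illustration, not the calculator's method. In practice a statistics library such as SciPy would normally supply the t-distribution.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    norm = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)

# Values from the example: n = 30, sample mean 105, H0 mean 100, sample SD 15
n, sample_mean, mu0, s, alpha = 30, 105, 100, 15, 0.05
t_stat = (sample_mean - mu0) / (s / sqrt(n))
p_value = t_two_tailed_p(t_stat, df=n - 1)
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, {decision}")
```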

Frequently Asked Questions

What is the difference between Z-test and T-test?
A z-test is used when the population SD is known, or as a large-sample approximation (commonly n > 30). A t-test is used when the population SD is unknown and must be estimated from the sample. The t-distribution has heavier tails than the normal distribution, which makes the test more conservative for small samples.
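A small simulation can illustrate why the heavier tails matter. The sketch below draws samples for which H₀ is true, estimates the SD from each sample, and then (incorrectly) applies the normal cutoff of about 1.96. The function name, trial counts, and seed are arbitrary choices for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

def false_positive_rate_with_z_cutoff(n, trials=3000, seed=1):
    """Share of true-H0 samples flagged when a z cutoff is used with an estimated SD."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)   # two-tailed cutoff for alpha = 0.05
    exceed = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0: mu = 0 holds
        t_stat = fmean(sample) / (stdev(sample) / sqrt(n))
        exceed += abs(t_stat) > z_crit
    return exceed / trials

small = false_positive_rate_with_z_cutoff(n=5)
large = false_positive_rate_with_z_cutoff(n=100)
print(f"n = 5: {small:.3f}, n = 100: {large:.3f}")
```

With very small samples the rejection rate should come out well above the nominal 5%, while for large samples it should sit close to 5%. This is the gap that the t-distribution's wider critical values correct.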
What does a small p-value mean?
A small p-value (e.g., < 0.05) indicates that data at least as extreme as those observed would be unlikely if the null hypothesis were true. This counts as evidence against H₀ and leads to rejection when p < α. It does NOT prove H₁ is true, and it is not the probability that H₀ is true; it only suggests the null is implausible.
Why do we use significance levels like 0.05?
The choice of α = 0.05 is a convention that balances the risk of Type I error (false positive) against the risk of Type II error (false negative). Lowering α (e.g., to 0.01) reduces false positives but, for a fixed sample size, increases false negatives. Different fields use different thresholds depending on the consequences of each error.
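The link between α and the false positive rate can be checked with a short simulation. The sketch below repeatedly samples from a population where H₀ is true and counts how often a two-tailed z-test rejects. It assumes a known population SD, and the population values, trial count, and seed are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def simulate_type_i_rate(alpha, trials=2000, n=30, mu0=100.0, sigma=15.0, seed=0):
    """Fraction of true-H0 samples that a two-tailed z-test rejects at level alpha."""
    rng = random.Random(seed)
    nd = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]   # H0 is true here
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        p_value = 2 * (1 - nd.cdf(abs(z)))
        rejections += p_value < alpha
    return rejections / trials

rate_05 = simulate_type_i_rate(0.05)
rate_01 = simulate_type_i_rate(0.01)
print(f"alpha = 0.05: {rate_05:.3f}, alpha = 0.01: {rate_01:.3f}")
```

Both calls use the same seed and therefore the same samples, so any difference between the two rates comes from the threshold alone.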
What is a two-tailed vs. one-tailed test?
Two-tailed tests check whether a parameter differs in either direction (H₁: ≠). One-tailed tests check only one direction: left-tailed (H₁: <) or right-tailed (H₁: >). Two-tailed tests are more conservative: at the same α they have lower power against an effect in the hypothesized direction, but they can detect effects in either direction.
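To make the difference concrete, the sketch below converts one z statistic into p-values for all three tail types. The statistic 1.8257 is an illustrative value, and the normal distribution is assumed.

```python
from statistics import NormalDist

z = 1.8257            # illustrative test statistic
alpha = 0.05
cdf = NormalDist().cdf

p_two = 2 * (1 - cdf(abs(z)))   # H1: parameter differs in either direction
p_right = 1 - cdf(z)            # H1: parameter is greater
p_left = cdf(z)                 # H1: parameter is smaller

for label, p in [("two-tailed", p_two), ("right-tailed", p_right), ("left-tailed", p_left)]:
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: p = {p:.4f} -> {verdict}")
```

For a positive statistic, the right-tailed p-value is half the two-tailed one, so the same data can be significant in a one-tailed test and not in a two-tailed test. This is why the tail type has to be fixed before looking at the data.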
Can I change α to make results significant?
Choosing α after seeing the results is a form of p-hacking and invalidates the conclusions. Always set α before conducting the test. Changing α post hoc inflates the Type I error rate, making false positives more likely.
Does "fail to reject H₀" mean H₀ is true?
No. Failing to reject H₀ means there is insufficient evidence to conclude H₁. It does not prove H₀ is true; absence of evidence is not evidence of absence. Larger sample sizes or better data collection might reveal a true effect.
What is the relationship between confidence intervals and hypothesis tests?
A two-tailed hypothesis test at level α is equivalent to checking if the hypothesized parameter falls outside a (1-α)×100% confidence interval. If the hypothesized value is inside the CI, fail to reject H₀; if outside, reject H₀.
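The equivalence can be checked numerically. The sketch below builds a z-based confidence interval and compares the interval decision with the p-value decision. For simplicity it treats the SD as known, and the function name and input values are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for a mean, assuming a known sigma."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

sample_mean, mu0, sigma, n, alpha = 105, 100, 15, 30, 0.05

# Decision via the confidence interval: reject if mu0 lies outside it
low, high = z_confidence_interval(sample_mean, sigma, n, alpha)
reject_by_ci = not (low <= mu0 <= high)

# Decision via the two-tailed p-value
z = (sample_mean - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_by_p = p_value < alpha

print(f"CI = ({low:.2f}, {high:.2f}), p = {p_value:.4f}")
print(f"reject by CI: {reject_by_ci}, reject by p-value: {reject_by_p}")
```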
How does sample size affect hypothesis testing?
Larger samples increase statistical power (ability to detect true effects) and tighten confidence intervals. With very large n, even trivial differences may be statistically significant. Conversely, small samples have low power and may miss true effects (Type II error).
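The effect of sample size on power can be sketched for a two-tailed one-sample z-test. The formula below assumes a known population SD and a fixed true difference from the hypothesized mean. The function name `z_test_power` and the chosen effect size are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test when the true mean is mu0 + effect."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma      # expected value of the z statistic
    # Probability that the statistic lands in either rejection region
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# True difference of 5 units with SD 15, at several sample sizes
for n in (10, 30, 100, 300):
    print(f"n = {n:>3}: power = {z_test_power(5, 15, n):.3f}")
```

When the effect is zero the function returns α, which matches the definition of the Type I error rate.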
