Analyze the distribution of sample means and understand why normal distributions appear everywhere in statistics.
n ≥ 30 recommended for CLT
By the CLT, the sample mean X̄ approximately follows a normal distribution:
X̄ ~ N(μ = 100, SE² = 2.7386²)
Standard Error (σ/√n): 2.7386
Z-score: 1.8257
P(X̄ < 105): 96.61%
P(X̄ > 105): 3.39%
| Sample Size | Std Error (σ=15) | SE as % of σ | Distribution Width |
|---|---|---|---|
| 10 | 4.74 | 31.6% | Very wide |
| 30 | 2.74 | 18.3% | Moderate (CLT applies) |
| 50 | 2.12 | 14.1% | Narrow |
| 100 | 1.50 | 10.0% | Very narrow |
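The table values follow directly from SE = σ/√n. A minimal sketch that reproduces them, assuming σ = 15 as in the table header (the function name `standard_error` is illustrative, not part of this tool):

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 15  # population standard deviation used in the table
rows = [(n, standard_error(sigma, n)) for n in (10, 30, 50, 100)]

for n, se in rows:
    # SE as a share of sigma depends only on n: 1 / sqrt(n)
    print(f"n={n:>3}  SE={se:.2f}  SE/sigma={se / sigma:.1%}")
```

Because SE/σ equals 1/√n, the third column of the table is the same for any σ.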
The Central Limit Theorem (CLT) is one of the most important theorems in statistics. It states that the distribution of sample means approaches a normal (bell curve) distribution as the sample size grows, regardless of the shape of the underlying population distribution, provided the observations are independent and the population has a finite variance.
Key Implication: This explains why the normal distribution appears so frequently in real-world data. Even if individual measurements aren't normally distributed, averaging many of them produces an approximately normal distribution. The rule of thumb: n ≥ 30 is usually sufficient for the approximation to be good.
Scenario: Exam scores across all students have μ = 100, σ = 15. A professor randomly selects the scores of n = 30 students. What's P(mean score < 105)?
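Setup: μ = 100, σ = 15, n = 30, threshold = 105.
Calculation: SE = σ/√n = 15/√30 ≈ 2.7386; Z = (105 − 100)/2.7386 ≈ 1.8257; P(X̄ < 105) = Φ(1.8257) ≈ 0.9661.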
Interpretation: There's a 96.61% chance that a random sample of 30 scores averages below 105. A sample mean above 105 would occur only about 3.39% of the time.
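The worked example can be checked with a few lines of Python. This is a sketch using only the standard library; the helper names `normal_cdf` and `prob_sample_mean_below` are illustrative, and the normal CDF is computed from `math.erf`.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_below(threshold, mu, sigma, n):
    """Return (SE, Z, P(sample mean < threshold)) under the CLT approximation."""
    se = sigma / math.sqrt(n)      # standard error of the mean
    z = (threshold - mu) / se      # distance from mu in SE units
    return se, z, normal_cdf(z)

# Scenario values from the example: mu = 100, sigma = 15, n = 30
se, z, p_below = prob_sample_mean_below(105, mu=100, sigma=15, n=30)
print(f"SE = {se:.4f}, Z = {z:.4f}")
print(f"P(mean < 105) = {p_below:.2%}, P(mean > 105) = {1 - p_below:.2%}")
```

The upper-tail probability is simply the complement, since the two events cover all outcomes.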
Why is n ≥ 30 recommended?
For most distributions, a sample size of 30 is empirically sufficient for the sample mean distribution to approximate normality. For highly skewed distributions, larger n may be needed.
Does the original population need to be normal?
No! That's the power of CLT. Even if the population is uniformly distributed, exponential, or bimodal, sample means will be approximately normally distributed (given large enough n and a finite population variance).
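A quick simulation can make this concrete. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ = 1), draws many samples of size 30, and compares the spread of their means with the CLT prediction σ/√n. The population choice, seed, and trial count are arbitrary illustration values.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from Exponential(rate=1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

n = 30
means = simulate_sample_means(n=n, trials=5000)

center = statistics.fmean(means)    # should sit near the population mean, 1.0
spread = statistics.stdev(means)    # should sit near sigma / sqrt(n)
predicted_se = 1.0 / math.sqrt(n)   # sigma = 1 for Exponential(rate=1)

# Share of sample means within 1.96 SE of mu; close to 95% if roughly normal
within = sum(abs(m - 1.0) <= 1.96 * predicted_se for m in means) / len(means)

print(f"mean of means = {center:.3f}, sd of means = {spread:.3f}")
print(f"predicted SE  = {predicted_se:.3f}, within 1.96 SE = {within:.1%}")
```

The population here is strongly right-skewed, yet the sample means cluster symmetrically enough that the normal-based 95% band captures roughly the expected share.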
What's the standard error?
The standard error (SE = σ/√n) measures the variability of sample means. Larger samples reduce SE, making sample means cluster more tightly around the population mean.
How does doubling the sample size affect SE?
Doubling n multiplies SE by 1/√2 ≈ 0.707, so the standard error decreases by about 29%. Because SE scales with 1/√n, halving SE requires increasing n by a factor of 4.
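The scaling rule can be turned around to ask how large n must be for a target SE. A small sketch, assuming σ is known; `se_ratio` and `required_n` are hypothetical helper names:

```python
import math

def se_ratio(factor: float) -> float:
    """Multiplier applied to SE when n is multiplied by `factor`."""
    return 1.0 / math.sqrt(factor)

def required_n(sigma: float, target_se: float) -> int:
    """Smallest n with sigma / sqrt(n) <= target_se, i.e. n >= (sigma / target_se)^2."""
    return math.ceil((sigma / target_se) ** 2)

print(f"doubling n:    SE x {se_ratio(2):.3f}")
print(f"quadrupling n: SE x {se_ratio(4):.3f}")
print(f"n needed for SE <= 1.5 with sigma = 15: {required_n(15, 1.5)}")
```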
Is there a minimum sample size for CLT?
While n = 30 is conventional, there is no universal minimum. If the population itself is normal, the sample mean is exactly normal for any n. For highly skewed data, n = 100+ may be needed for a reliable normal approximation.
How does this relate to confidence intervals?
Confidence intervals use CLT to estimate population parameters. Knowing the sampling distribution (via CLT) lets us calculate margins of error and build intervals around sample means.
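As an illustration, a CLT-based interval for the mean is the sample mean plus or minus a critical value times the standard error. The sketch below assumes σ is known and uses 1.96 for a 95% level; the observed sample mean of 102.0 is a made-up value for demonstration.

```python
import math

def mean_confidence_interval(sample_mean, sigma, n, z_crit=1.96):
    """Normal-based interval: sample_mean +/- z_crit * sigma / sqrt(n)."""
    margin = z_crit * sigma / math.sqrt(n)   # margin of error
    return sample_mean - margin, sample_mean + margin

# Hypothetical observed mean; sigma and n match the exam-score example
low, high = mean_confidence_interval(sample_mean=102.0, sigma=15, n=30)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

When σ is unknown and estimated from the sample, a t critical value is the usual substitute for 1.96, which widens the interval for small n.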
What if my sample mean is very different from μ?
A large deviation (high Z-score) suggests either rare chance or that your data doesn't match the assumed population parameters. This indicates a need to investigate further.
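One way to quantify "very different" is the Z-score of the observed mean together with the two-sided tail probability. A minimal sketch, assuming the population parameters are known; the function name is illustrative.

```python
import math

def z_and_two_sided_p(sample_mean, mu, sigma, n):
    """Z-score of a sample mean and P(|Z| >= |z|) under the normal approximation."""
    z = (sample_mean - mu) / (sigma / math.sqrt(n))
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

z, p = z_and_two_sided_p(sample_mean=105, mu=100, sigma=15, n=30)
print(f"Z = {z:.4f}, two-sided tail probability = {p:.4f}")
```

A small tail probability means a deviation this large would rarely arise by chance alone if μ and σ were as assumed.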
Can CLT apply to non-continuous data?
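Yes. CLT applies to sample means computed from any distribution with finite variance, including discrete data like counts and binary outcomes. A sample proportion is itself the mean of 0/1 outcomes, which is why binomial counts are approximately normal for large n.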
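For discrete data the approximation can be compared against an exact calculation. The sketch below uses a binomial count with hypothetical values n = 100, p = 0.3 and asks for P(X ≤ 35), once exactly and once via the normal approximation with a continuity correction of 0.5.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """CLT approximation to P(X <= k), with a continuity correction of 0.5."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return normal_cdf((k + 0.5 - mean) / sd)

n, p, k = 100, 0.3, 35   # illustrative values, not from the calculator
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")
```

The continuity correction accounts for approximating a discrete count with a continuous curve; without it the approximation is noticeably worse at moderate n.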