Hypergeometric Distribution

Calculate probabilities for sampling without replacement from a finite population. Essential for quality control and lottery-type scenarios.

Last updated: March 2026

Distribution Parameters

Population Size (N)

Total items in population

Success States (K)

Items with desired property

Number of Draws (n)

Sample size (without replacement)

Observed Successes (k)

Number of successes in sample

Results

P(X = k)

0.133032

P(X ≤ k)

0.953975

P(X ≥ k)

0.179056

Range

0-10

Mean

2.4000

Variance

1.4890

Std Dev

1.2202

What is the Hypergeometric Distribution?

The hypergeometric distribution describes the probability of obtaining exactly k successes when drawing n items, without replacement, from a population of N items containing K successes. This is distinct from the binomial distribution, which assumes a constant success probability on every draw, as when sampling with replacement or from an effectively infinite population.

This distribution is crucial in quality control, auditing, lottery analysis, and ecological studies. For example: drawing defective items from a lot, selecting cards from a deck, or sampling tagged animals from a population. The key feature is that probabilities change with each draw because the population shrinks and composition changes.

As the population size N increases with K/N held constant and n fixed, the hypergeometric distribution approaches the binomial distribution with p = K/N, because removing a few items from a very large population barely changes its composition.
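A quick numerical check of this convergence, as a minimal sketch: the sample size n = 10, the target k = 2, and the 5% success proportion are illustrative choices, and the helper names are hypothetical. It prints the hypergeometric and binomial probabilities side by side as N grows.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # C(K, k) * C(N - K, n - k) / C(N, n)
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 10, 2          # illustrative sample size and success count
gaps = []
for N in (100, 1_000, 10_000, 100_000):
    K = N // 20       # keep K/N fixed at 5%
    h = hypergeom_pmf(k, N, K, n)
    b = binom_pmf(k, n, K / N)
    gaps.append(abs(h - b))
    print(f"N={N:>6}  hypergeometric={h:.6f}  binomial={b:.6f}  gap={gaps[-1]:.2e}")
```

The gap column should shrink as N grows, which is the convergence described above.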

Understanding the Parameters

Parameter Definitions

N: Total population size (all items)

K: Number of success items in population

n: Sample size (items drawn)

k: Observed successes in sample

The Probability Formula

P(X = k) = C(K,k) × C(N-K, n-k) / C(N, n)

Where C(a, b) is the binomial coefficient "a choose b": the number of ways to choose b items from a, ignoring order
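The formula translates almost directly into code. The following is a minimal sketch, assuming Python 3.8 or later for math.comb; the function name hypergeom_pmf and its argument order are choices made here for illustration, not part of any particular library.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) = C(K, k) * C(N - K, n - k) / C(N, n)."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    # Values of k outside the attainable range have probability zero.
    if k < max(0, n - (N - K)) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Illustrative call: 2 successes in 10 draws from 100 items with 5 successes.
print(hypergeom_pmf(2, N=100, K=5, n=10))
```

Because math.comb works with exact integers, the only rounding happens in the final division.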

Constraints

• K ≤ N (can't have more successes than population)

• n ≤ N (sample can't exceed population)

• k ≤ min(K, n) (successes can't exceed either K or n)

• k ≥ max(0, n - (N - K)) (if the sample is larger than the N - K failures available, the remaining draws must be successes)
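These constraints can be checked in code before computing a probability. The sketch below uses a hypothetical helper, support, that validates N, K, and n and returns the smallest and largest attainable k; the values N = 10, K = 7, n = 5 are chosen only to show a case where the lower bound is above zero.

```python
from math import comb

def support(N, K, n):
    """Return (k_min, k_max), the attainable numbers of successes."""
    if not 0 <= K <= N:
        raise ValueError("K must satisfy 0 <= K <= N")
    if not 0 <= n <= N:
        raise ValueError("n must satisfy 0 <= n <= N")
    k_min = max(0, n - (N - K))   # draws beyond the N - K failures must be successes
    k_max = min(K, n)             # limited by both the successes and the draws
    return k_min, k_max

lo, hi = support(N=10, K=7, n=5)
print(lo, hi)   # 2 5: only 3 failures exist, so 5 draws include at least 2 successes

# The probabilities over the attainable range add up to 1.
total = sum(comb(7, k) * comb(3, 5 - k) for k in range(lo, hi + 1)) / comb(10, 5)
print(total)    # 1.0
```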

Example Calculation

Quality Control: Defective Items in a Batch

Scenario:

• Batch contains 100 items

• 5 are known to be defective

• QA team inspects 10 items

• Question: What's the probability of finding exactly 2 defective items?

Parameters:

N = 100 (total batch)

K = 5 (defective items)

n = 10 (sample size)

k = 2 (desired successes)

Result:

P(X = 2) ≈ 0.0702 (7.02%)

About 7% chance of finding exactly 2 defective items in a sample of 10
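To see the whole picture for this scenario rather than a single value, the sketch below tabulates the probability of every possible defect count and two cumulative probabilities. It assumes Python 3.8 or later for math.comb; the variable names are illustrative.

```python
from math import comb

N, K, n = 100, 5, 10   # batch size, defective items, items inspected

def pmf(k):
    # C(K, k) * C(N - K, n - k) / C(N, n)
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Attainable defect counts run from 0 to min(K, n) in this scenario.
dist = {k: pmf(k) for k in range(0, min(K, n) + 1)}
for k, p in dist.items():
    print(f"P(X = {k}) = {p:.4f}")

at_most_2 = sum(p for k, p in dist.items() if k <= 2)
at_least_1 = 1 - dist[0]
print(f"P(X <= 2) = {at_most_2:.4f}")
print(f"P(X >= 1) = {at_least_1:.4f}")
```

The line for k = 2 should reproduce the 0.0702 result above.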

Frequently Asked Questions

When should I use hypergeometric vs. binomial?

Use hypergeometric when sampling WITHOUT replacement from a finite population, especially when the sample is a sizable fraction of it. Use binomial when sampling WITH replacement, or when the population is so large relative to the sample that the lack of replacement has a negligible effect.

What's the relationship to combinations?

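The hypergeometric formula is built from combinations (binomial coefficients), which count the ways to choose a subset of items. The numerator, C(K,k) × C(N-K, n-k), counts the samples with exactly k successes and n-k failures; the denominator, C(N, n), counts all possible samples of size n. The probability is favorable outcomes divided by total outcomes.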

Why must K ≤ N?

K represents items with a property in a population of N total items. You can't have more 'success' items than total items. If K > N, the scenario is mathematically impossible.

What if I sample the entire population (n = N)?

If n = N, there's no randomness. You always get exactly K successes with probability 1. This represents a census, not a sample.

How does population size affect the distribution?

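For a fixed sample size n and proportion K/N, the variance equals the binomial variance multiplied by the finite population correction (N - n)/(N - 1), so a smaller N reduces the variance and a larger N brings it closer to the binomial value. What matters is the sampling fraction n/N rather than N alone: a common rule of thumb is that the binomial with p = K/N is a reasonable approximation when the sample is no more than about 5% of the population.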

What's the mean of a hypergeometric distribution?

Mean = n × K/N (sample size × proportion of successes in population). This follows the intuition: expected value equals sample size times success rate.

Can the variance ever be zero?

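Yes. The variance is n × (K/N) × (1 - K/N) × (N - n)/(N - 1), which is zero when the outcome is forced: if K = 0 (no successes to draw), K = N (every item is a success), n = 0 (nothing is drawn), or n = N (the whole population is drawn, so the sample always contains exactly K successes). Note that n = K alone does not force the outcome unless K = N.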
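The mean and variance can be computed directly from their closed forms. The sketch below uses the standard variance formula n × (K/N) × (1 - K/N) × (N - n)/(N - 1), a known result; the function name and the parameters N = 50, K = 12, n = 10 are illustrative assumptions.

```python
from math import sqrt

def hypergeom_mean_var(N, K, n):
    """Return (mean, variance) of a hypergeometric distribution."""
    if not (0 <= K <= N and 0 <= n <= N and N >= 1):
        raise ValueError("need N >= 1, 0 <= K <= N and 0 <= n <= N")
    p = K / N
    mean = n * p
    # (N - n) / (N - 1) is the finite population correction.
    var = n * p * (1 - p) * (N - n) / (N - 1) if N > 1 else 0.0
    return mean, var

mean, var = hypergeom_mean_var(N=50, K=12, n=10)
print(f"mean={mean:.4f}  variance={var:.4f}  std dev={sqrt(var):.4f}")

# Forced outcomes give zero variance.
for N, K, n in [(50, 12, 50), (50, 0, 10), (50, 50, 10)]:
    print((N, K, n), hypergeom_mean_var(N, K, n)[1])
```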

How is this used in auditing?

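Auditors use the hypergeometric distribution to estimate error detection probability. For a population of N transactions assumed to contain K errors, it gives the probability that a sample of n transactions contains 0, 1, 2, etc. errors. The chance of detecting at least one error is 1 - P(X = 0).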
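As a rough sketch of how such a calculation might look, the code below computes the chance of catching at least one error, 1 - C(N - K, n) / C(N, n), and searches for the smallest sample that reaches a target detection probability. The function names and the figures (500 transactions, 10 errors, a 95% target) are hypothetical and not drawn from any auditing standard.

```python
from math import comb

def detection_probability(N, K, n):
    """P(X >= 1) = 1 - P(X = 0) = 1 - C(N - K, n) / C(N, n)."""
    return 1 - comb(N - K, n) / comb(N, n)

def min_sample_size(N, K, target):
    """Smallest n whose detection probability reaches target (needs K >= 1)."""
    if K < 1:
        raise ValueError("no errors to detect when K = 0")
    for n in range(N + 1):
        if detection_probability(N, K, n) >= target:
            return n
    return N  # only reached if target exceeds 1

N, K = 500, 10   # hypothetical population and assumed error count
print(detection_probability(N, K, 50))
print(min_sample_size(N, K, 0.95))
```

A linear search is enough here because the detection probability never decreases as n grows.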
