Measure information content and uncertainty in data using Shannon's information theory.
Last updated: March 2026
Shannon entropy is a measure of the average information content or uncertainty in data. Introduced by Claude Shannon in 1948, it quantifies how much randomness or diversity is present—high entropy means high uncertainty (many equally likely outcomes), low entropy means predictability (few likely outcomes). Entropy is fundamental to information theory, data compression, cryptography, and machine learning.
For example, a fair coin flip (50/50 heads/tails) has the maximum entropy possible for two outcomes, 1 bit, while a biased coin (90/10) has lower entropy. Likewise, a uniform distribution of symbols has higher entropy than one where the symbols are unevenly distributed. Entropy is measured in different units depending on the logarithm base: bits (base 2), nats (base e), or hartleys (base 10).
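The coin comparison can be checked with a short sketch. The function name `entropy` and its `base` parameter are illustrative choices, not part of this tool; the input is assumed to be a list of probabilities that sums to 1.

```python
import math

def entropy(probs, base=2):
    """Shannon entropy of a probability distribution, in units set by `base`."""
    # Outcomes with p == 0 are skipped: p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair = entropy([0.5, 0.5])    # 1 bit, the maximum for two outcomes
biased = entropy([0.9, 0.1])  # about 0.469 bits

print(f"fair coin:   {fair:.3f} bits")
print(f"biased coin: {biased:.3f} bits")
```

Passing `base=math.e` or `base=10` to the same function gives the result in nats or hartleys instead of bits.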
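The formula H = −Σ(pᵢ × log(pᵢ)) calculates the average information per symbol, where pᵢ is the probability of symbol i and the sum runs over all distinct symbols. Redundancy measures how far the actual entropy falls below the maximum possible entropy: high redundancy means the data is predictable and compressible.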
Text Analysis: Which is More Unpredictable?
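As a minimal sketch of this kind of comparison, the code below estimates entropy from character frequencies for two made-up strings. The helper name `text_entropy` and the sample strings are hypothetical, and redundancy is assumed here to be 1 − H / H_max, with H_max taken over the symbols that actually occur; the tool itself may define it differently.

```python
import math
from collections import Counter

def text_entropy(text):
    """Return (entropy in bits per symbol, maximum entropy, redundancy)."""
    if not text:
        raise ValueError("text must be non-empty")
    counts = Counter(text)
    total = len(text)
    # p * log2(1/p) is the same as -p * log2(p), summed over distinct symbols.
    h = sum((c / total) * math.log2(total / c) for c in counts.values())
    h_max = math.log2(len(counts))  # uniform distribution over observed symbols
    # Assumed convention: redundancy is reported as 0 when only one symbol occurs.
    redundancy = 1 - h / h_max if h_max > 0 else 0.0
    return h, h_max, redundancy

for sample in ("aaaaaaab", "abcdefgh"):
    h, h_max, r = text_entropy(sample)
    print(f"{sample!r}: H = {h:.3f} bits, H_max = {h_max:.3f} bits, redundancy = {r:.3f}")
```

The first string is dominated by one character, so its entropy is well below the 1-bit maximum for two symbols. The second uses eight characters once each, so it reaches the 3-bit maximum and has zero redundancy.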
Entropy is always non-negative (≥ 0). A negative result indicates a calculation error. Entropy = 0 only when one symbol occurs with certainty.
Base 2 (bits) is common in computer science, base e (nats) appears in physics, and base 10 (hartleys) is used in telecommunications. They are just different units for the same quantity: to convert a value from base b to base b', multiply it by log(b)/log(b').
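The conversion rule can be sketched as follows. The `UNIT_BASES` table and the `convert_entropy` function are illustrative names introduced here, not part of the tool.

```python
import math

# Logarithm base associated with each unit.
UNIT_BASES = {"bits": 2, "nats": math.e, "hartleys": 10}

def convert_entropy(value, from_unit, to_unit):
    """Convert an entropy value between units using the change-of-base rule."""
    b_from = UNIT_BASES[from_unit]
    b_to = UNIT_BASES[to_unit]
    # H in base b_to = H in base b_from * log(b_from) / log(b_to)
    return value * math.log(b_from) / math.log(b_to)

one_bit_in_nats = convert_entropy(1.0, "bits", "nats")          # ln 2, about 0.693
one_bit_in_hartleys = convert_entropy(1.0, "bits", "hartleys")  # log10(2), about 0.301

print(f"1 bit = {one_bit_in_nats:.3f} nats = {one_bit_in_hartleys:.3f} hartleys")
```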
Entropy measures actual information content. Redundancy measures unused potential: how far the data falls short of the maximum possible randomness. Low redundancy (near 0) means the data is nearly random; high redundancy (near 1) means it is highly predictable.
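Common applications include data compression (ZIP's DEFLATE method uses Huffman coding, an entropy coder), cryptography (keys and random numbers should come from high-entropy sources), machine learning (information gain, the reduction in entropy after a split, measures feature importance), and quality control (a shift in entropy can signal a process change).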
No, entropy cannot exceed the maximum. In bits, H ≤ log₂(n), where n is the number of unique symbols. If you get H > max H, check your calculation. Maximum entropy occurs only with a uniform probability distribution.
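A quick sanity check for a computed value might look like the sketch below. The function name `entropy_in_valid_range` and the tolerance `tol` are assumptions made for illustration; the tolerance only absorbs floating-point rounding.

```python
import math

def entropy_in_valid_range(h_bits, n_symbols, tol=1e-9):
    """Return True if h_bits lies in [0, log2(n_symbols)], allowing for rounding."""
    if n_symbols < 1:
        raise ValueError("n_symbols must be at least 1")
    h_max = math.log2(n_symbols)
    return -tol <= h_bits <= h_max + tol

print(entropy_in_valid_range(1.357, 4))  # within [0, 2]
print(entropy_in_valid_range(2.5, 4))    # exceeds log2(4) = 2
print(entropy_in_valid_range(-0.1, 4))   # negative, so a calculation error
```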
It depends on the context. For cryptographic keys, higher entropy is better because the key is harder to guess. For data compression, lower entropy is better because the data is more compressible.