Create a frequency polygon to visualize the distribution of grouped data as a connected line graph.
A frequency polygon is a graphical representation of a frequency distribution in which class midpoints are plotted on the x-axis and frequencies on the y-axis, with consecutive points joined by straight line segments. Unlike a histogram, which draws a bar for each class, a frequency polygon draws a single line, so several distributions can share one graph without obscuring each other. The line also emphasizes the overall shape of the distribution and how frequency changes from one class interval to the next.
By convention, a frequency polygon is extended one class width beyond each end of the data, to points at zero frequency, so that the line meets the x-axis and encloses the whole distribution. The technique is generally attributed to Karl Pearson and remains a standard tool in exploratory data analysis, especially in quality control and manufacturing, where the shape of a distribution helps reveal patterns and anomalies.
The main advantage over histograms is clarity when overlaying multiple distributions, making it easier to compare shapes, central tendencies, and spread across groups.
Consider five test score classes with midpoints 35, 45, 55, 65, 75 and frequencies 5, 12, 18, 10, 5.
Polygon Points (organized left to right):
• Starting point: (25, 0) — before first class
• (35, 5) — Class 1
• (45, 12) — Class 2
• (55, 18) — Class 3 (highest frequency)
• (65, 10) — Class 4
• (75, 5) — Class 5
• Ending point: (85, 0) — after last class
Total observations = 5 + 12 + 18 + 10 + 5 = 50
Result: The frequency polygon rises from zero, peaks at midpoint 55 (frequency 18), then descends back to zero, showing a roughly symmetric distribution centered around test score 55.
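The construction above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name polygon_points is made up for this example, and it assumes equal class widths so that each zero-frequency anchor sits exactly one width beyond the first and last midpoints.

```python
# Midpoints and frequencies from the worked example above.
midpoints = [35, 45, 55, 65, 75]
frequencies = [5, 12, 18, 10, 5]


def polygon_points(midpoints, frequencies):
    """Return the polygon vertices, closed with zero-frequency anchors.

    Hypothetical helper; assumes equal class widths, so each anchor
    lies one class width beyond the first and last midpoint.
    """
    if len(midpoints) != len(frequencies) or len(midpoints) < 2:
        raise ValueError("need at least two classes with matching frequencies")
    width = midpoints[1] - midpoints[0]
    xs = [midpoints[0] - width, *midpoints, midpoints[-1] + width]
    ys = [0, *frequencies, 0]
    return list(zip(xs, ys))


points = polygon_points(midpoints, frequencies)
total = sum(frequencies)                           # total observations
peak_x, peak_y = max(points, key=lambda p: p[1])   # highest vertex

print(points)
print("total observations:", total)
print("peak at midpoint", peak_x, "with frequency", peak_y)
```

Passing the resulting points to any line-plotting routine, in order, draws the polygon.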
How is a frequency polygon different from a histogram?
Histograms use bars; frequency polygons use lines connecting points. Polygons are easier to overlay for comparison, while histograms better show individual class frequencies.
Why extend the polygon to zero frequency at the boundaries?
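Extending the line to zero closes the shape against the x-axis, so the polygon represents the complete distribution. With equal class widths, the area under the closed polygon then equals the area of the corresponding histogram, which is the total frequency multiplied by the class width. It also makes clear where the data starts and ends.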
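The area claim can be checked numerically on the worked example. The sketch below sums the trapezoids under the closed polygon; it assumes equal class widths of 10 (implied by the midpoints), and the helper name area_under_polygon is invented for this illustration. Under that assumption the area comes out as total frequency times class width.

```python
# Vertices of the closed polygon from the worked example.
points = [(25, 0), (35, 5), (45, 12), (55, 18), (65, 10), (75, 5), (85, 0)]


def area_under_polygon(points):
    """Sum the trapezoids between consecutive vertices (hypothetical helper)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (y0 + y1) / 2 * (x1 - x0)
    return area


class_width = 10                               # assumed equal for every class
total_frequency = sum(y for _, y in points)    # the anchors contribute zero
area = area_under_polygon(points)

print(area, total_frequency * class_width)
```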
How do I read the peak of a frequency polygon?
The highest point on the polygon indicates the modal class—the class with most observations. Multiple peaks suggest multimodal data with distinct subgroups.
Can a frequency polygon be used for cumulative frequencies?
Yes! A cumulative frequency polygon (ogive) plots cumulative frequencies, usually against the upper class boundaries. Because cumulative counts never decrease, the line rises monotonically from lower-left to upper-right. It's useful for reading off medians and percentiles.
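As a rough sketch of how an ogive is used, the code below builds one for the worked example and reads off the median by linear interpolation. The class boundaries 30, 40, ..., 80 are an assumption inferred from the midpoints and a class width of 10; the helper value_at is hypothetical.

```python
from itertools import accumulate

# Assumed class boundaries: classes span 30-40, 40-50, ..., 70-80.
lower_boundary = 30
upper_boundaries = [40, 50, 60, 70, 80]
frequencies = [5, 12, 18, 10, 5]

cumulative = list(accumulate(frequencies))
# The ogive starts at zero at the lower boundary of the first class.
ogive = [(lower_boundary, 0), *zip(upper_boundaries, cumulative)]


def value_at(ogive, target):
    """Linearly interpolate the x at which the ogive reaches `target`."""
    for (x0, c0), (x1, c1) in zip(ogive, ogive[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the cumulative range")


median = value_at(ogive, cumulative[-1] / 2)   # half of the observations
print(cumulative)
print(round(median, 2))
```

The same function gives any percentile: pass the corresponding fraction of the total count as the target.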
What does a symmetric polygon shape mean?
If the left and right halves mirror each other, the data is symmetric, and for a single-peaked distribution the mean, median, and mode approximately coincide. An asymmetric shape with a longer tail on one side indicates skewed data.
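One quick check of symmetry is to compare the grouped mean with the midpoint of the modal class. The sketch below does this for the worked example, treating every observation as if it sat at its class midpoint, which is the usual grouped-data approximation and an assumption here.

```python
midpoints = [35, 45, 55, 65, 75]
frequencies = [5, 12, 18, 10, 5]

n = sum(frequencies)
# Grouped mean: each observation is approximated by its class midpoint.
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
# Midpoint of the class with the highest frequency.
modal_midpoint = midpoints[frequencies.index(max(frequencies))]
gap = grouped_mean - modal_midpoint

print(grouped_mean, modal_midpoint, round(gap, 2))
```

A gap that is small relative to the class width, as it is here, is consistent with a roughly symmetric shape; a large gap points toward skew in the direction of the mean.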
How many classes should I use?
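A common starting point is Sturges' rule: k ≈ 1 + 3.3 log10(n), where n is the number of observations and the logarithm is base 10. For 100 observations this gives about 8 classes; for 1000 observations, about 11. Too few classes hide patterns; too many create noise.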
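Sturges' rule is easy to compute directly. The sketch below uses the equivalent base-2 form, k = 1 + log2(n), which is roughly 1 + 3.3 log10(n). Rounding up to a whole number of classes is a choice made for this example; the function name sturges_classes is hypothetical.

```python
import math


def sturges_classes(n):
    """Suggested number of classes for n observations (rounded up)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    return math.ceil(1 + math.log2(n))


for n in (50, 100, 1000):
    print(n, sturges_classes(n))
```

Treat the result as a starting point and adjust it if the polygon looks too coarse or too jagged.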
Can I compare multiple distributions using polygons?
Yes! That's a key advantage. Plot the polygons on the same axes, using the same class intervals and a different color for each group. This makes differences in shape, center, and spread easy to spot.
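When the groups being compared differ in size, raw frequencies can mislead, so one common approach is to plot relative frequencies instead. The sketch below prepares two polygons on a shared scale. Group A is the worked example; group B is purely hypothetical data invented for illustration, and relative_polygon is a made-up helper that assumes equal class widths.

```python
midpoints = [35, 45, 55, 65, 75]
group_a = [5, 12, 18, 10, 5]    # frequencies from the worked example (n = 50)
group_b = [2, 6, 14, 12, 6]     # hypothetical second group (n = 40)


def relative_polygon(midpoints, frequencies):
    """Closed polygon whose heights are proportions of the group total."""
    total = sum(frequencies)
    width = midpoints[1] - midpoints[0]   # assumes equal class widths
    xs = [midpoints[0] - width, *midpoints, midpoints[-1] + width]
    ys = [0.0, *(f / total for f in frequencies), 0.0]
    return list(zip(xs, ys))


poly_a = relative_polygon(midpoints, group_a)
poly_b = relative_polygon(midpoints, group_b)

# Side-by-side proportions for each x position.
for (x, a), (_, b) in zip(poly_a, poly_b):
    print(x, round(a, 3), round(b, 3))
```

Each list of points can then be drawn as its own line on the same axes.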
What if my polygon has sharp angles instead of smooth curves?
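Sharp angles are normal, especially with few classes, because the points are joined by straight segments. If you have enough observations, using more classes smooths the polygon. Alternatively, fit a curve to the points to visualize the underlying distribution.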