Statistics Fundamentals: Mean, Median, Standard Deviation

When the mean lies and the median tells the truth, what standard deviation represents, and how to read a distribution.

The Average That Doesn't Tell the Whole Story

The U.S. Census Bureau reported median household income in 2023 at approximately $80,000. The mean (average) was higher, around $105,000. The roughly $25,000 gap exists because a small number of households earning millions pull the mean upward without affecting the median. A politician citing the mean makes Americans sound richer; one citing the median tells a different story. Both numbers are calculated correctly. The choice between them is not neutral.

Understanding mean, median, mode, and standard deviation gives you the vocabulary to read data reports without being misled. These four numbers summarize almost every quantitative dataset you'll encounter, from test score reports to salary surveys to clinical trial results. They also lay the foundation for probability: the normal distribution, which approximately describes quantities ranging from human heights to manufacturing tolerances, is defined entirely by its mean and standard deviation.

This guide calculates each measure from scratch with small datasets, explains why standard deviation divides by n-1 instead of n for samples, and shows when each measure tells the truth and when it misleads.

Mean, Median, and Mode: Three Ways to Describe Center

The mean (arithmetic average) sums all values and divides by the count. For the dataset [4, 7, 7, 9, 13]: sum = 40, count = 5, mean = 40/5 = 8. The mean uses every data point. One extreme value shifts it significantly.

The median is the middle value when the data is sorted. For [4, 7, 7, 9, 13] (already sorted), the middle value is the 3rd element: 7. When the dataset has an even count, as in [4, 7, 9, 13], the median is the average of the two middle values: (7+9)/2 = 8. The median ignores how extreme the highest and lowest values are. Replace 13 with 1300 in the five-value dataset and the median stays 7; the mean jumps to 1327/5 = 265.4.
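The mean and median calculations above can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the variable names are chosen here for illustration only.

```python
from statistics import mean, median

data = [4, 7, 7, 9, 13]
data_mean = mean(data)      # 40 / 5 = 8
data_median = median(data)  # middle (3rd) value of the sorted list = 7

even = [4, 7, 9, 13]
even_median = median(even)  # average of the two middle values: (7 + 9) / 2 = 8

# Swap the largest value for an extreme one: the mean moves, the median does not.
with_outlier = [4, 7, 7, 9, 1300]
outlier_mean = mean(with_outlier)      # 1327 / 5 = 265.4
outlier_median = median(with_outlier)  # still 7

print(data_mean, data_median, even_median, outlier_mean, outlier_median)
```

Note that median sorts its input internally, so the data does not have to be sorted first.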

The mode is the value that appears most often. In [4, 7, 7, 9, 13], the mode is 7 (appears twice). A dataset can have no mode (all values unique), one mode (unimodal), or multiple modes (bimodal or multimodal). Shoe sizes, clothing sizes, and survey response categories are often summarized by the mode because a "most frequent" answer is more actionable than an average.
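A short sketch of the mode in Python, again assuming the standard library's statistics module (Python 3.8 or later for multimode). The second and fourth datasets are made up here to show the bimodal and categorical cases.

```python
from statistics import mode, multimode

single = mode([4, 7, 7, 9, 13])           # 7, the only value that appears twice
both = multimode([4, 7, 7, 9, 9, 13])     # [7, 9]: two values tie, so the data is bimodal

# When every value is unique, multimode returns all of them (a tie at one
# occurrence each). The text above calls this case "no mode".
all_unique = multimode([4, 7, 9, 13])     # [4, 7, 9, 13]

# The mode also works on categories, where a mean would be meaningless.
sizes = mode(["M", "L", "M", "S", "M"])   # "M"

print(single, both, all_unique, sizes)
```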

Standard Deviation: The Math of Spread

Standard deviation measures how far values spread from the mean. A low standard deviation means values cluster tightly around the mean. A high standard deviation means they scatter widely.

Dataset: [4, 7, 7, 9, 13], mean = 8.

Step 1: Find each value's deviation from the mean: (4-8)=-4, (7-8)=-1, (7-8)=-1, (9-8)=1, (13-8)=5.

Step 2: Square each deviation: 16, 1, 1, 1, 25.

Step 3 (population): Average the squared deviations: (16+1+1+1+25)/5 = 44/5 = 8.8. This is the population variance.

Step 3 (sample): Divide by n-1 instead of n: 44/4 = 11. This is the sample variance. Use the sample formula whenever your data is a sample from a larger population, which is almost always the case in research.

Step 4: Take the square root. Population standard deviation: √8.8 ≈ 2.97. Sample standard deviation: √11 ≈ 3.32.
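The four steps can be followed line by line in code. The sketch below mirrors the hand calculation and then cross-checks it against pstdev and stdev from Python's statistics module; the variable names are illustrative.

```python
import math
from statistics import pstdev, stdev

data = [4, 7, 7, 9, 13]
n = len(data)
mu = sum(data) / n                       # mean = 8.0

deviations = [x - mu for x in data]      # Step 1: -4, -1, -1, 1, 5
squared = [d ** 2 for d in deviations]   # Step 2: 16, 1, 1, 1, 25

pop_var = sum(squared) / n               # Step 3 (population): 44 / 5 = 8.8
samp_var = sum(squared) / (n - 1)        # Step 3 (sample):     44 / 4 = 11.0

pop_sd = math.sqrt(pop_var)              # Step 4: about 2.97
samp_sd = math.sqrt(samp_var)            # Step 4: about 3.32

print(f"{pop_sd:.2f} {samp_sd:.2f}")

# The library functions should agree with the manual steps.
assert math.isclose(pop_sd, pstdev(data))
assert math.isclose(samp_sd, stdev(data))
```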

The reason for n-1 (Bessel's correction): the sample mean is computed from the sample itself, so a sample's values cluster closer to the sample mean than to the true population mean. Dividing by n therefore underestimates the true population variance on average. Dividing by n-1 corrects for this bias. The two results differ by a factor of n/(n-1), so at n = 30 the gap is about 3%, and it keeps shrinking as n grows.
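One way to see the bias without any randomness is to enumerate every possible sample. The sketch below treats [4, 7, 7, 9, 13] as the whole population (an assumption made purely for illustration), lists all 25 ordered samples of size 2 drawn with replacement, and averages both variance formulas over them. Exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [4, 7, 7, 9, 13]
N = len(population)
pop_mean = Fraction(sum(population), N)
pop_var = sum((x - pop_mean) ** 2 for x in population) / N   # 44/5 = 8.8

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))   # all 25 ordered samples

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(pop_var, avg_div_n, avg_div_n_minus_1)   # 44/5 22/5 44/5
```

Averaged over all samples, dividing by n gives 4.4, half the true variance of 8.8, while dividing by n-1 recovers 8.8 exactly. The gap is this large only because n = 2; it shrinks as the sample size grows.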

The Normal Distribution and the 68-95-99.7 Rule

Many natural phenomena, such as human heights, IQ scores, and measurement errors, approximately follow a normal distribution: a symmetric bell curve centered at the mean. Once you know the mean (μ) and standard deviation (σ), the 68-95-99.7 rule tells you where data falls:

  • 68% of values fall within 1σ of the mean (between μ-σ and μ+σ).
  • 95% fall within 2σ.
  • 99.7% fall within 3σ.
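The three percentages are rounded values of the normal cumulative distribution function. A minimal sketch, assuming Python's statistics.NormalDist, computes them for a standard normal distribution (μ = 0, σ = 1); the dictionary name within is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying between -k and +k standard deviations.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} sigma: {share:.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The same shares hold for any mean and standard deviation, because the rule is stated in units of σ.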

Adult male height in the US has mean ≈ 70 inches and standard deviation ≈ 3 inches. So 68% of men stand between 67 and 73 inches. 95% stand between 64 and 76 inches. A man at 79 inches (6'7") sits exactly 3σ above the mean, at roughly the top 0.15% by height: the rule leaves 0.3% outside 3σ, split evenly between the two tails.

A z-score converts any value to standard deviations from the mean: z = (x - μ) / σ. For x = 76 inches: z = (76-70)/3 = 2. A z-score of 2 means the value is 2 standard deviations above the mean, in the top 2.28% of the distribution.
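The z-score formula and the tail percentages can be reproduced as follows. This is a sketch under the article's stated assumption that heights are normal with μ = 70 and σ = 3; z_score is a helper defined here, not a library function.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

heights = NormalDist(mu=70, sigma=3)   # assumed model of adult male height, in inches

z_76 = z_score(76, 70, 3)              # (76 - 70) / 3 = 2.0
tail_76 = 1 - heights.cdf(76)          # share taller than 76 inches, about 0.0228

z_79 = z_score(79, 70, 3)              # (79 - 70) / 3 = 3.0
tail_79 = 1 - heights.cdf(79)          # share taller than 79 inches, about 0.0013

print(f"z={z_76}: top {tail_76:.2%}")  # z=2.0: top 2.28%
print(f"z={z_79}: top {tail_79:.2%}")  # z=3.0: top 0.13%
```

The exact tail beyond 3σ (about 0.13%) is slightly smaller than the 0.15% implied by the rounded 99.7% figure.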

Common Misconceptions