Ignorance of how sample size affects statistical variation has created havoc for nearly a millennium
What constitutes a dangerous equation? There are two obvious interpretations: Some equations are dangerous if you know them, and others are dangerous if you do not. The first category may pose danger because the secrets within its bounds open doors behind which lies terrible peril. The obvious winner in this is Einstein's iconic equation $E = mc^2$, for it provides a measure of the enormous energy hidden within ordinary matter. Its destructive capability was recognized by Leo Szilard, who then instigated the sequence of events that culminated in the construction of atomic bombs.
Supporting ignorance is not, however, the direction I wish to pursue; indeed, it is quite the antithesis of my message. Instead I am interested in equations that unleash their danger not when we know about them, but rather when we do not. Kept close at hand, these equations allow us to understand things clearly, but their absence leaves us dangerously ignorant.
There are many plausible candidates, and I have identified three prime examples: Kelley's equation, which indicates that the truth is estimated best when its observed value is regressed toward the mean of the group that it came from; the standard linear regression equation; and the equation that provides us with the standard deviation of the sampling distribution of the mean, which might be called de Moivre's equation:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
where $\sigma_{\bar{x}}$ is the standard error of the mean, $\sigma$ is the standard deviation of the population from which the sample is drawn (in practice estimated by the standard deviation of the sample) and $n$ is the size of the sample. (Note the square root symbol, which will be a key to at least one of the misunderstandings of variation.) De Moivre's equation was derived by the French mathematician Abraham de Moivre, who described it in his 1730 exploration of the binomial distribution, Miscellanea Analytica.
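To make the role of the square root concrete, here is a minimal Python sketch that compares de Moivre's equation with the spread of sample means in a simulation. It is an illustration only: the function names, the normal population, the value of sigma and the number of trials are all assumptions chosen for the example, not anything taken from the sources discussed here.

```python
import math
import random
import statistics


def standard_error(sigma: float, n: int) -> float:
    """De Moivre's equation: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / math.sqrt(n)


def simulated_standard_error(sigma: float, n: int,
                             trials: int = 5000, seed: int = 0) -> float:
    """Standard deviation of many sample means, each from a sample of size n.

    Assumption: the population is normal with mean 0 and standard
    deviation sigma; any population with that standard deviation
    would serve for this illustration.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


if __name__ == "__main__":
    sigma = 10.0  # hypothetical population standard deviation
    for n in (4, 16, 64):
        predicted = standard_error(sigma, n)
        observed = simulated_standard_error(sigma, n)
        print(f"n={n:3d}  predicted={predicted:.3f}  simulated={observed:.3f}")
```

With sigma equal to 10, the equation gives a standard error of 5 for samples of 4 and 2.5 for samples of 16: quadrupling the sample size only halves the variation of the mean. Small samples therefore produce far more extreme averages than large ones, in both directions.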
Ignorance of Kelley's equation has proved to be very dangerous indeed, especially to economists who have interpreted regression toward the mean as having economic causes rather than merely reflecting the uncertainty of prediction. Horace Secrist's The Triumph of Mediocrity in Business is but one example listed in the bibliography. Other examples of failure to understand Kelley's equation exist in the sports world,...