Explained: Sigma

ROAM_REFS: https://news.mit.edu/2012/explained-sigma-0209

How do you know when a new finding is significant? The sigma value can tell you — but watch out for dead fish.

David L. Chandler, MIT News Office

Publication Date: February 9, 2012

It's a question that arises with virtually every major new finding in science or medicine: What makes a result reliable enough to be taken seriously? The answer has to do with statistical significance — but also with judgments about what standards make sense in a given situation.

The unit of measurement usually given when talking about statistical significance is the standard deviation, expressed with the lowercase Greek letter sigma (σ). The term refers to the amount of variability in a given set of data: whether the data points are all clustered together, or very spread out.

In many situations, the results of an experiment follow what is called a “normal distribution.” For example, if you flip a coin 100 times and count how many times it comes up heads, the average result will be 50. If you repeat this test 100 times, most of the results will be close to 50, but not exactly 50. You'll get almost as many 49s or 51s. You'll get quite a few 45s or 55s, but almost no 20s or 80s. If you plot your 100 tests on a graph, you'll get a well-known shape called a bell curve that's highest in the middle and tapers off on either side. That is a normal distribution.
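The coin experiment is easy to try on a computer. The following is a minimal sketch, not part of the original article: the function names, the fixed random seed, and the text histogram are illustrative choices. It repeats the 100-flip test 100 times and prints one row per observed head count, so the rough bell shape is visible.

```python
import random
from collections import Counter


def heads_in_flips(n_flips, rng):
    """Count heads in n_flips tosses of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(n_flips))


def run_experiments(n_experiments=100, n_flips=100, seed=0):
    """Repeat the n_flips test n_experiments times and return the head counts."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [heads_in_flips(n_flips, rng) for _ in range(n_experiments)]


counts = run_experiments()
histogram = Counter(counts)

# One row per observed head count; the row length is how often it occurred.
for heads in sorted(histogram):
    print(f"{heads:3d} {'#' * histogram[heads]}")
```

With only 100 repetitions the histogram is ragged; raising `n_experiments` makes the bell shape smoother.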

The deviation is how far a given data point is from the average. In the coin example, a result of 47 has a deviation of three from the average (or “mean”) value of 50. The standard deviation is just the square root of the average of all the squared deviations. One standard deviation, or one sigma, plotted above and below the average value on that normal distribution curve, would define a region that includes about 68 percent of all the data points. Two sigmas above and below would include about 95 percent of the data, and three sigmas would include about 99.7 percent.
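Both statements in the paragraph above can be checked numerically. This is a small sketch under stated assumptions: `population_std` and `normal_coverage` are hypothetical helper names, the three-value sample is made up for illustration, and the coverage figures come from the standard relation between the normal distribution and the error function, which the article does not spell out.

```python
import math


def population_std(values):
    """Square root of the average of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def normal_coverage(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


# Illustrative sample: deviations from the mean of 50 are -3, 0 and +3.
sample = [47, 50, 53]
print(population_std(sample))

for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4))
```

The loop prints 0.6827, 0.9545 and 0.9973, which round to the 68, 95 and 99.7 percent figures quoted above.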

So, when is a particular data point — or research result — considered significant? The standard deviation can provide a yardstick: If a data point is a few standard deviations away from the model being tested, this is strong evidence that the data point is not consistent with that model. However, how to use this yardstick depends on the situation. John Tsitsiklis, the Clarence J. Lebel Professor of Electrical Engineering at MIT, who teaches the course Fundamentals of Probability, says, “Statistics is an art, with a lot of room for creativity and mistakes.” Part of the art comes down to deciding what measures make sense for a given setting.
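The yardstick can be made concrete with the coin example. The sketch below assumes the standard formula for the spread of a count of heads, sqrt(n·p·(1−p)), which is not stated in the article; the function names are illustrative. It expresses an observed count as a number of sigmas away from what a fair-coin model predicts.

```python
import math


def binomial_sigma(n_flips, p=0.5):
    """Standard deviation of the number of heads in n_flips tosses."""
    return math.sqrt(n_flips * p * (1 - p))


def sigmas_from_expected(observed, expected, sigma):
    """How many standard deviations the observation lies from the model value."""
    return (observed - expected) / sigma


sigma = binomial_sigma(100)  # 5 heads for 100 fair flips
print(sigmas_from_expected(47, 50, sigma))  # well within one sigma
print(sigmas_from_expected(65, 50, sigma))  # three sigma from a fair coin
```

A count of 47 heads is entirely ordinary for a fair coin, while 65 heads sits three sigma out. Whether that is enough to reject the fair-coin model is the kind of judgment the paragraph above describes.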

For example, if you're taking a poll on how people plan to vote in an election, the accepted convention is that two standard deviations above or below the average, which gives a 95 percent confidence level, is reasonable. That two-sigma interval is what pollsters mean when they state the “margin of sampling error,” such as 3 percent, in their findings.

That means if you asked an entire population a survey question and got a certain answer, and then put the same question to a random group of 1,000 people, there is a 95 percent chance that the second group's results would fall within two sigma of the first result. If a poll found that 55 percent of the entire population favors candidate A, then 95 percent of the time, a second poll's result would be somewhere between 52 and 58 percent.
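The 3 percent figure follows from the sample size. As a rough sketch, assuming simple random sampling and the usual formula for the standard deviation of a sample proportion, sqrt(p·(1−p)/n) (neither is spelled out in the article, and `two_sigma_margin` is a hypothetical name), the two-sigma margin for the example above works out as follows.

```python
import math


def two_sigma_margin(p, n):
    """Two standard deviations of a sample proportion for a sample of size n."""
    return 2 * math.sqrt(p * (1 - p) / n)


p, n = 0.55, 1000  # true share favoring candidate A, and poll sample size
margin = two_sigma_margin(p, n)

print(f"margin: {margin:.3f}")
print(f"range:  {p - margin:.3f} to {p + margin:.3f}")
```

This gives a margin of about 0.031, or roughly 3 percentage points, which matches the 52 to 58 percent range in the example.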

Of course, that also means that 5 percent of the time, the result would be outside the two-sigma range. That much uncertainty is fine for an opinion poll, but maybe not for the result of a crucial experiment challenging scientists' understanding of an important phenomenon, such as last fall's announcement of a possible detection of neutrinos moving faster than the speed of light in an experiment at the European Organization for Nuclear Research, known as CERN.
