Peeved About P Values

When a psychology graduate student at the University of Virginia completed a study of whether political moderates were better at perceiving "shades of gray" than right-wing or left-wing extremists, he found that the P value of his results was 0.01. In statistics lingo, a P value of 0.01 is considered "very significant."

A significant P value is widely taken to mean that a finding is solid and should replicate. To make sure it would, the grad student and his advisor repeated the study with additional data. The second study, however, produced a P value of 0.59, nowhere near the original 0.01, which eliminated the significance of the results. According to the article about this grad student's almost-seminal study: "the effect had disappeared--along with Motyl's [the grad student's] dreams of youthful fame."

The Problem with P Values

Scientists have long assumed that the P value of a statistical analysis is an objective and reliable summary of the study. Yet according to Stephen Ziliak, a Roosevelt University economist and critic of conventional statistical methods: "P values are not doing their job because they can't."

Why P Values Have Few Friends

Developed nearly 90 years ago by English statistician and geneticist Sir Ronald Fisher, P values continue to be used begrudgingly and with much criticism. Ironically, Fisher never meant a P value to deliver a definitive verdict. He intended a significant P value to signal only that the data deserved a second look.

What Fisher wanted to accomplish with his P value was to determine whether results were consistent with what random chance alone could produce. First, researchers would set up a "null" hypothesis, the hypothesis they wished to disprove.
Then, they would assume the null hypothesis was true and calculate the probability of obtaining results at least as extreme as those actually observed. Fisher called this probability the P value and held that the smaller it was, the less plausible the "straw-man" null hypothesis became.

When Fisher's P value was popularized by scientists who wanted evidence-based decision-making to be objective and rigorous, his rivals in mathematics and statistics went to work on their own framework for data analysis. It included false negatives, false positives and other concepts now taught in every basic statistics class. However, they deliberately omitted P values.

Ultimately, statistics textbooks, written mostly by non-statisticians, combined the two frameworks into a hybrid that grafted Fisher's relatively easy-to-calculate P value onto his rivals' more rigorous system. Consequently, a P value below 0.05 is now treated as a statistically significant result, which is not exactly what Fisher meant it to represent.

Does a P Value Really Mean Anything?

With a P value of 0.01, Motyl's political extremism study would seem to indicate that there is only a one percent chance that the result is a false alarm. However, a P value cannot tell us this. It can only summarize the data under the assumption that a specific null hypothesis is true. In other words, P values cannot work backward to make statements about the underlying reality. That requires additional information, such as the odds that a real effect existed in the first place.

In 2005, epidemiologist John Ioannidis of Stanford University went even further, suggesting that the majority of published findings are, in fact, false.
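
The point about needing additional information can be made concrete with a little arithmetic. The sketch below is an illustration, not a calculation from the study discussed here: it applies Bayes' rule to a simplified setup in which a result is either "significant" at a threshold `alpha` or not. The function name `false_positive_risk` and the values chosen for the prior probability of a real effect and for statistical power are assumptions made for this example.

```python
def false_positive_risk(prior_real: float, alpha: float, power: float) -> float:
    """Share of 'significant' results that are false alarms.

    prior_real: assumed probability, before the study, that the effect is real
    alpha:      significance threshold (false positive rate when the null is true)
    power:      probability of a significant result when the effect is real
    """
    false_alarms = alpha * (1 - prior_real)  # null is true, test is significant anyway
    true_hits = power * prior_real           # effect is real and the test detects it
    return false_alarms / (false_alarms + true_hits)


if __name__ == "__main__":
    # Hypothetical inputs: alpha = 0.05, power = 0.8, and three illustrative priors.
    for prior in (0.5, 0.1, 0.01):
        risk = false_positive_risk(prior_real=prior, alpha=0.05, power=0.8)
        print(f"prior={prior:.2f} -> false positive risk={risk:.2f}")
```

Under these assumed inputs, the share of significant results that are false alarms grows quickly as a real effect becomes less plausible beforehand, which is why a P value alone cannot give the chance that a finding is false.
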
Since Ioannidis made that claim, a string of high-profile replication failures has compelled research scientists to reconsider how the results of statistical studies are evaluated.

Whether or not your research uses P values, your research framework should always include validation of your findings.
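
One way to see why validation matters is to simulate replications. The following sketch is a toy model, not a reconstruction of any study mentioned above: each simulated study tosses a biased coin `n` times and computes an exact two-sided P value under the null hypothesis that the coin is fair, that is, the probability of a result at least as far from an even split as the one observed. The function names, the true rate of 0.6, the sample size, and the seed are all assumptions chosen for illustration.

```python
import random
from math import comb


def two_sided_binomial_p(successes: int, n: int) -> float:
    """Exact two-sided P value for `successes` out of `n` under a fair-coin null.

    Counts every outcome at least as far from n/2 as the observed one.
    """
    observed_distance = abs(2 * successes - n)
    extreme_outcomes = sum(
        comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= observed_distance
    )
    # Under a fair coin, each of the 2**n toss sequences is equally likely.
    return extreme_outcomes / 2 ** n


def simulate_replications(true_rate: float, n: int, studies: int, seed: int = 0) -> list:
    """Run `studies` identical experiments and return the P value of each."""
    rng = random.Random(seed)  # seeded so the illustration is reproducible
    p_values = []
    for _ in range(studies):
        successes = sum(rng.random() < true_rate for _ in range(n))
        p_values.append(two_sided_binomial_p(successes, n))
    return p_values


if __name__ == "__main__":
    # Hypothetical setup: a real but modest effect (0.6 instead of 0.5), 100 tosses per study.
    p_values = simulate_replications(true_rate=0.6, n=100, studies=10)
    print([round(p, 3) for p in p_values])
```

In runs like this, P values from identical studies of the same real effect tend to scatter widely, so a single significant result is weak grounds for expecting the next one to be significant too.
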
