How Not to Be Fooled by Statistics
- Jeff Hulett

Modern life is surrounded by data and statistics, from scientific breakthroughs and medical diagnoses to policy decisions and everyday purchases on consumer marketplace platforms. Yet, for all their power, statistics can prove profoundly misleading if we do not understand the fundamental assumptions that underpin them. To truly navigate this data-rich landscape, one needs a holistic approach to inference, one that thoughtfully considers the full context of beliefs and the world around us. This approach finds its best expression in Bayesian statistics, which provides a richer framework for interpretation rather than discarding traditional methods.
At its core, statistical inference involves making informed conclusions from data. Two dominant philosophies guide this process: frequentist and Bayesian. Understanding their differences is essential to avoiding common pitfalls and making robust judgments.
The Frequentist Lens: Focusing on the Evidence Itself
Imagine testing a new drug. A frequentist approach typically asks: "How likely is it that one would observe the results just seen (the new evidence or new data), given the drug has no effect (the existing belief, or 'null hypothesis')?" This method centers on the likelihood of the evidence given an assumed belief. One calculates a "p-value," the probability of observing data as extreme as, or more extreme than, the observed findings if the null hypothesis were true. If this p-value is very low (e.g., less than 0.05), one rejects the null hypothesis, concluding the drug likely does have an effect.
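To make the mechanics concrete, here is a minimal sketch of that frequentist drug test in Python. The scenario, sample sizes, effect size, and 0.05 threshold are illustrative assumptions, not figures from any real trial.

```python
# Minimal frequentist sketch: is the drug group different from placebo?
# All numbers below (sample sizes, means, threshold) are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated recovery scores: the null hypothesis says both groups share the same mean.
placebo = rng.normal(loc=50.0, scale=10.0, size=100)
drug    = rng.normal(loc=53.0, scale=10.0, size=100)  # a small true effect baked in

# Two-sample t-test: how surprising is this data if the drug has no effect?
t_stat, p_value = stats.ttest_ind(drug, placebo)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:  # conventional, but arbitrary, significance threshold
    print("Reject the null hypothesis: the data are unlikely if the drug has no effect.")
else:
    print("Fail to reject the null hypothesis.")
```

Notice that nothing in this calculation asks how plausible the drug's effect was before the trial, or how such trials tend to behave in general; it only measures surprise under the null.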
This approach is powerful and drives countless scientific discoveries. It excels at establishing whether observed data deviates significantly from a baseline assumption. It is often precise and forms the backbone of many randomized controlled trials. However, the frequentist method, by its very nature, tends to focus solely on the likelihood of the data under a specific hypothesis, often side-stepping two crucial elements: prior beliefs about the world and the broader context in which the evidence exists. When these elements are ignored, frequentist inferences can provide the illusion of precision while being wildly inaccurate.
The Historical Omission: When Statistics Ignored Context
The omission of prior beliefs and the broader environment is not a historical accident; it aligns directly with troubling motivations in the early development of frequentist methods.
The tools of modern statistics were formalized by figures such as Francis Galton, Karl Pearson, and R.A. Fisher. These individuals were also founders and prominent proponents of the eugenics movement. Their prime motivation was to create "objective" quantitative measures to support pseudoscientific agendas—specifically, to quantify and "prove" racial and social hierarchies, often focusing on establishing the supposed superiority of white European populations.
The absence of the prior in the frequentist calculation served this purpose perfectly. To a strict frequentist and eugenicist, the given null hypothesis is that all people are the same. Using the p-value without priors or environmental context made it relatively easy, and grossly inaccurate, to reject that null hypothesis and infer that white Europeans are the superior race. Say the researcher's family of origin and culture is white European. That culture becomes the default prior and the environment of the test. When comparing against someone not brought up with the same priors and environmental conditions, it would be difficult to construct a test where the white European is NOT seen as superior. The problem is that all the p-value actually determined was that being a white European is the most important effect for being a white European!
By creating a framework explicitly excluding the prior, statistical founders allowed studies to focus only on the data's likelihood, making it easier to publish "statistically significant" (but contextually absurd) conclusions supporting racist and pseudoscientific claims. This historical context underscores the ethical and practical necessity of employing a framework demanding full contextual consideration.
The Bayesian Lens: Updating Our Beliefs with New Evidence
The Bayesian perspective asks the fundamental question: "How should one update one's existing belief about the drug's effect, given this new evidence just gathered?" Notice that the Bayesian drug-effect question is the reverse of the frequentist question asked earlier. In other words, a frequentist broadly asks:
"What is the probability of this new evidence given my existing belief?"
Whereas the Bayesian goal is to determine a posterior probability by asking:
"What is the probability of my existing belief given the new evidence?"
The Bayesian framework explicitly integrates three key components:
The Prior (Existing Belief): This formalizes what one believed before seeing the new data. It acknowledges one rarely starts with a blank slate.
The Likelihood (The New Evidence): This remains the objective engine of inference. It details how probable the observed data would be if various beliefs about the world were true.
The Marginal Likelihood (The Environment/Context): This vital component is the baseline probability of the evidence itself, averaged across all possible beliefs. It acts as a normalizing factor, ensuring the updated belief is coherent within the broader context of known data behavior. It is the check against sensational anecdotes.
By combining these elements, Bayesian inference provides a posterior probability—a revised, updated belief incorporating initial perspective (the priors) and the weight of new evidence, all contextualized by the environment.
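As a minimal sketch of how these three components combine, consider a simple two-hypothesis version of the drug question in Python. The prior and likelihood values are illustrative assumptions chosen for the example, not results from the article.

```python
# Minimal Bayesian sketch with two competing beliefs about the drug.
# All probabilities below are illustrative assumptions.

prior_works = 0.10          # Prior: before the trial, a skeptical 10% chance the drug works
prior_null  = 1 - prior_works

# Likelihood: probability of seeing the observed trial result under each belief.
lik_given_works = 0.80      # the result is quite likely if the drug truly works
lik_given_null  = 0.05      # the result is surprising if the drug does nothing

# Marginal likelihood (the environment/context): probability of the evidence
# averaged over all beliefs, weighted by their priors.
marginal = lik_given_works * prior_works + lik_given_null * prior_null

# Posterior: updated belief that the drug works, given the new evidence.
posterior_works = (lik_given_works * prior_works) / marginal

print(f"Posterior P(drug works | evidence) = {posterior_works:.2f}")  # ~0.64
```

Under these assumed numbers, evidence that looks decisive on its own yields only a moderate posterior once the skeptical prior and the overall environment are included.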
Frequentist statistics vs. Bayesian inference
How frequentist methods are more likely to favor precision, lack accuracy, and lead to replication failure.

How to interpret this diagram: The "X's" represent data points, while the red target center is the goal. The collection of X's on each target represents how the data was transformed by the model or research. Accuracy reflects closeness to the target; precision reflects how close the points are to each other. The darkness of the X's reflects the degree to which the model was validated over time, from T=0 to T=1. In the upper-left example, the points are tightly clustered (precise) but far from the goal (inaccurate), often the result of limited or biased data. The comparison illustrates how frequentist approaches may yield precise but biased (inaccurate) results, whereas Bayesian inference, though less precise initially, converges on greater precision and accuracy over time.
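For readers who prefer code to pictures, here is a small simulation of the precision-versus-accuracy distinction. It is my own illustrative construction, not part of the diagram; the true effect, bias, and noise levels are assumptions.

```python
# Illustrative sketch of precision vs. accuracy around a known "target."
# The true effect, bias, and noise levels are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
true_effect = 10.0

# Precise but inaccurate: a biased sampling frame with little noise.
# Estimates cluster tightly (high precision) around the wrong value (low accuracy).
biased_estimates = rng.normal(loc=13.0, scale=0.5, size=20)

# Less precise but accurate: an unbiased frame with more noise.
unbiased_estimates = rng.normal(loc=true_effect, scale=2.0, size=20)

for name, est in [("biased / precise", biased_estimates),
                  ("unbiased / less precise", unbiased_estimates)]:
    bias = est.mean() - true_effect   # distance from the target (accuracy)
    spread = est.std()                # closeness of the points to each other (precision)
    print(f"{name:>24}: bias = {bias:+.2f}, spread = {spread:.2f}")
```

The tightly clustered but biased estimates mirror the upper-left target in the diagram; the noisier but centered estimates mirror the accurate case that improves as more data arrives.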
A Holistic Inference for Sound Decisions
The Bayesian approach's beauty lies in its explicit integration of all available information and, most importantly, in how it empowers one to update inferences (or beliefs) as one learns. It forces one to articulate initial assumptions and ground conclusions within the overall probability space. This results in a more robust and holistic inference. In my data science experience, Bayesian and frequentist methods often require trade-offs. A wise statistician will sometimes reduce the power of their predictive results in order to achieve a more accurate and long-term stable inference. Examples of reduced power include a p-value, R-squared, or K-S statistic indicating lower significance. Unfortunately, this lower-significance statistical wisdom is actively discouraged by academic research incentives rewarding lower p-values.
Frequentist methods remain incredibly useful for hypothesis testing and calculating the surprise of data under a null hypothesis. They provide the likelihood, a key objective component necessary for any inference.
However, to truly avoid being fooled by statistics, one should strive to understand frequentist results within the broader, more holistic Bayesian framework. When a frequentist p-value suggests a significant finding, a Bayesian thinker asks: "Given this likelihood, but then overlaying our prior belief and the broader environment in which the phenomenon exists, how strongly should I now believe this effect is real?"
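As a concrete illustration of that question, the sketch below overlays a skeptical prior and an assumed statistical power onto a "significant" result. The prior plausibility, power, and alpha values are assumptions chosen for the example, not claims from the article.

```python
# Sketch of the Bayesian overlay on a "significant" frequentist result.
# Prior plausibility and statistical power below are illustrative assumptions.

prior_real = 0.05     # before the study, assume only 5% of such hypotheses are true
power      = 0.80     # P(significant result | effect is real)
alpha      = 0.05     # P(significant result | effect is not real), the false-positive rate

# Marginal probability of seeing a "significant" result at all.
p_significant = power * prior_real + alpha * (1 - prior_real)

# Posterior probability the effect is real, given the significant result.
posterior_real = (power * prior_real) / p_significant

print(f"P(effect is real | p < 0.05) = {posterior_real:.2f}")  # ~0.46
```

Under these assumptions, a finding that clears the p < 0.05 bar is still less likely than not to reflect a real effect, which is one mechanism behind the replication failures referenced in the diagram above.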
By recognizing frequentist statistics as a valuable component within the comprehensive Bayesian paradigm, one gains a deeper understanding of the world. The Bayesian approach empowers one to move beyond simply observing data to genuinely learning from it, ensuring inferences are not just statistically sound but also holistically wise.
Resources for the Curious
I. The Curiosity Vine Articles (Bayesian Inference and Replication Crisis)
These articles by Jeff Hulett directly address the core themes of this piece, advocating for the Bayesian framework as a solution to modern statistical challenges.
Hulett, Jeff. "Embrace the Power of Changing Your Mind: Think Like a Bayesian to Make Better Decisions, Part 1." The Curiosity Vine, 2024.
Hulett, Jeff. "Statistics, Bias, and the Better Way to Learn: Moving Beyond the Streetlamp." The Curiosity Vine, 2025.
Hulett, Jeff. "The Replication Crisis in Academia: A Premise Problem, Not a Research Problem." The Curiosity Vine, 2025.
II. Historical and Ethical Critique of Frequentist Foundations
These resources provide academic support for the historical context connecting key frequentist founders (Galton, Pearson, Fisher) to the eugenics movement and the ethical implications of their work.
Clayton, Aubrey. "How Eugenics Shaped Statistics." Nautilus Magazine, 2020.
Note: Excellent overview of Galton, Pearson, and Fisher's eugenic beliefs and their impact on the field’s development.
Pearson, Karl. National Life From the Standpoint of Science: A Lecture. Adam and Charles Black, 1901.
Note: A primary source documenting Pearson's racist and eugenic philosophical views underlying his statistical work, supporting the historical argument.
III. The Replication Crisis and Statistical Reform
These academic sources ground the discussion of the Replication Crisis and directly contrast the frequentist p-value approach with Bayesian solutions.
Open Science Collaboration. "Estimating the reproducibility of psychological science." Science, Vol. 349, Issue 6251 (2015).
Note: The landmark paper that formally documented the low replication rate in psychology, central to the discussion of the crisis.
Colling, Lincoln J. and Szucs, Denes. "Statistical inference and the replication crisis." Review of Philosophy and Psychology, Vol. 10, Issue 4 (2019).
Note: A paper which directly examines the issues within Frequentist statistics that may have led to the crisis and explores Bayesian statistics as the alternative.
IV. Classic and Modern Bayesian Advocacy
These books and articles champion the Bayesian perspective as a superior, more flexible methodology for scientific inference and data analysis.
McElreath, Richard. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2nd Edition. CRC Press, 2020.
Note: A highly popular, accessible, and intuitive book which advocates for the Bayesian approach and focuses heavily on the use of priors.
Jaynes, Edwin T. Probability Theory: The Logic of Science. Cambridge University Press, 2003.
Note: A foundational text that treats probability as an extension of logic, providing a powerful philosophical defense of the Bayesian view over the frequentist approach.
Kruschke, John. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd Edition. Academic Press, 2015.
Note: A practical and introductory textbook often praised for clearly articulating the differences between the Bayesian and frequentist paradigms.

