Statistics, Bias, and the Better Way to Learn: Moving Beyond the Streetlamp
- Jeff Hulett

- Oct 12
Updated: Oct 13

The foundations of modern statistics—statistical moments, correlation, regression, the chi-squared test, and ANOVA—are indispensable tools powering every field from finance to pharmaceuticals. Yet, their origins are deeply unsettling. Francis Galton, Karl Pearson, and Ronald Fisher, the brilliant minds who established these methods, were devoted proponents of eugenics, a pseudoscientific movement aimed at selective breeding [Kevles, Daniel J. 1995]. Their principal motivation involved using data to quantify differences between social classes and races, explicitly seeking to prove the superiority of certain groups and the inferiority of others [Pearson, Karl. 1900].
In other words, the founders of modern statistics were hardcore racists.
It seems a profound irony: the tools of rigorous, objective analysis emerged from a motivation steeped in prejudice. Unfortunately, that irony hardened into scientific justification for some of the most horrific chapters in human history, including Nazism and the Jewish extermination camps. To grasp this paradox, we begin by examining the foundation of morality, which lies in the neurobiology of learning. From there, we explore the challenge of frequentist statistics and present Bayesian inference as essential to the solution.
About the author: Jeff Hulett leads Personal Finance Reimagined, a decision-making and financial education platform. He teaches personal finance at James Madison University and provides personal finance seminars. Check out his book -- Making Choices, Making Money: Your Guide to Making Confident Financial Decisions.
Jeff is a career banker, data scientist, behavioral economist, and choice architect. Jeff has held banking and consulting leadership roles at Wells Fargo, Citibank, KPMG, and IBM.
How We Learn Is How We Discriminate
The link between a brilliant statistician following a racist premise and an everyday person displaying unconscious bias lies in the fundamental process of learning.
The human brain possesses an extraordinary superpower: the ability to recognize patterns and make rapid inferences from sparse information. This cognitive efficiency is rooted in evolution. For our ancestors, quickly categorizing a moving shadow as "predator" or "prey" was essential for survival. This mechanism requires the brain to process massive amounts of data and filter it instantly, overweighting information relevant to survival and underweighting the irrelevant. In this broader form, discrimination is how we learn. We learn to tell the difference between what can kill us and what can provide life. As such, discrimination as learning is how we grow, succeed, and make a better life.
Neuroscience confirms this learning process is fundamentally a Bayesian one: the brain starts with a prior (a pre-existing expectation or belief), then integrates new data (the likelihood) to form an updated posterior belief [Jaynes, Edwin T. 2003]. This process is efficient, but efficiency has a side effect: the speed that enables life-saving decisions is a feature of our neurobiology, and bias is the bug that comes with it.
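For the curious, the prior-to-posterior update described above can be sketched in a few lines of code. This is an illustrative toy, not neuroscience: the hypotheses, prior weights, and likelihoods for the ancestral "moving shadow" scenario are assumed for the example.

```python
# A minimal sketch of the prior -> likelihood -> posterior update,
# using the ancestral "moving shadow" example. All numbers are
# illustrative assumptions, not empirical values.

def bayes_update(priors, likelihoods):
    """Return the posterior P(hypothesis | data) via Bayes' rule."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

# Prior belief before the shadow moves.
priors = {"predator": 0.1, "harmless": 0.9}
# Likelihood of observing a fast, low movement under each hypothesis.
likelihoods = {"predator": 0.8, "harmless": 0.2}

posterior = bayes_update(priors, likelihoods)
print(posterior)  # belief in "predator" jumps from 0.10 to about 0.31
```

A single ambiguous observation triples the brain's weight on the dangerous hypothesis, which is exactly the survival-oriented overweighting the paragraph describes.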
When applied to social groups, this same ancient mechanism drives discrimination. The brain naturally overweights information confirming tribal familiarity or safety ("people like me are safe") and underweights or ignores information challenging those deeply held assumptions. This tendency leads to confirmation bias, where the brain seeks to confirm what it already "knows" rather than seeking truth [Kahneman, Daniel. 2011]. In other words, the statistical founders' racist beliefs were a deep form of what we understand today as confirmation bias. They wielded statistics like a hammer, always in search of racist nails to drive.
Given this neurobiological reality—a reality unknown to them—it becomes less surprising the statistical founders, steeped in the cultural priors of colonialism and racism, used their new tools to confirm their societal biases. They applied their mathematical genius to measure and validate their own prejudices, searching only under the streetlamp of available, biased data for the "keys" to racial difference.
The Premise Problem in the Age of AI
The legacy of the founders presents a two-fold challenge for modern science, especially given the current Replication Crisis [Open Science Collaboration. 2015]:
1. Fragile Frequentism
The statistical framework they established—Frequentist statistics (focused on p-values and null hypothesis testing)—assumes the systems being measured are stable and fixed. This premise works well for physics or chemistry but breaks down in the social and health sciences because human behavior is dynamic and context-sensitive [Clayton, Aubrey. 2021]. Relying on an arbitrary p<0.05 cutoff produces results that are precise yet often inaccurate, leading to non-replicable findings. As a behavioral economist, I teach this concept in college seminars, emphasizing the need to approach decisions with measured judgment, not absolute certainty [Hulett, Jeff. 2024].
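A short simulation can make the fragility concrete. This is my illustration, not the article's data: the true effect size, sample size, and known-variance z-test are all assumed for the sketch. In an underpowered study, results that clear the p<0.05 bar are both rare and systematically inflated—the "winner's curse" that feeds non-replication.

```python
# Illustrative simulation of the p < 0.05 cutoff in low-powered studies.
# Effect size, sample size, and the known-variance z-test are assumptions.
import random
import statistics

random.seed(42)
TRUE_EFFECT = 0.2   # small true group difference (in SD units)
N = 20              # participants per group -- an underpowered design
STUDIES = 5000

significant_effects = []
for _ in range(STUDIES):
    control = [random.gauss(0.0, 1.0) for _ in range(N)]
    treated = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (1.0 / N + 1.0 / N) ** 0.5     # standard error of the difference
    if abs(diff / se) > 1.96:           # the p < 0.05 cutoff
        significant_effects.append(diff)

power = len(significant_effects) / STUDIES
inflation = statistics.mean(significant_effects) / TRUE_EFFECT
print(f"power: {power:.2f}; "
      f"average 'significant' effect is {inflation:.1f}x the true effect")
```

Most true effects go undetected, and the studies that do reach "significance" report an exaggerated effect—precise-looking numbers that will not replicate.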
2. The AI Risk
In our data-abundant era, Artificial Intelligence is built on frequentist foundations. AI and large language models are extremely skilled at recognizing patterns in the data they are given (the seen). Yet, the critical challenges in human systems—the unknown unknowns—lie in the data we do not have. Training an AI on biased, incomplete, and historically frequentist data simply creates a more efficient engine for validating historical prejudice. We must stress human oversight and judgment when interpreting AI-driven recommendations, framing them as tools for empowerment, not dogma [PFR R&D].
The Solution: A Bayesian Framework for Growth
To create a more robust, honest, and adaptive science, we must shift the fundamental statistical premise toward Bayesian Inference.
The differences between the two statistical schools are rooted not just in math, but in the profoundly different priors and philosophies of their founders. Galton, Pearson, and Fisher were figures of scientific establishment, driven by massive ego and a desire to quantify their conviction of white European superiority. Their statistical framework seeks to ignore or eliminate the prior, reflecting their belief in an objective, singular truth favoring their own race.
In contrast, Reverend Thomas Bayes (an 18th-century Nonconformist Presbyterian minister) approached knowledge with a theological focus on the "happiness of His creatures" and great personal humility, never even publishing his most famous work during his lifetime [Bayes, Thomas 1763]. This service-oriented, non-establishment outlook enabled him to develop a statistical method centered on the prior, which formally requires that our starting assumptions be stated honestly. The Bayesian approach therefore embodies a humility absent in the later Frequentists, enabling a framework that allows for the diverse priors of all people, not just a privileged few.
The Bayesian approach acknowledges what the statistical founders could not: the inherent limitations of knowledge itself. Economist Friedrich Hayek famously criticized the notion of central planners having perfect information [Hayek, Friedrich A. 1974]. Building on this, the challenge for any social or human science is dealing with the "3 Nevers" of knowledge:
Never Complete: There is always missing data, especially the "unknown unknowns." Researchers tend to focus only on the data they can see (the circle of light under the streetlamp, a metaphor for the "seen" data) and often dismiss the unknown data as "random noise" with no impact. This is rarely the case, as the information we lack often matters more than the information we possess.
Never Static: Human behavior, incentives, and environments change constantly. This is exacerbated by feedback loops where a policy intended to solve a problem becomes a new constraint. As measures become targets, people respond in difficult-to-anticipate ways (Goodhart's Law), ensuring today’s accurate model is tomorrow’s obsolete history.
Never Centralized: Critical information is distributed across millions of individual minds and local contexts. This is especially true in market systems, where knowledge is diversified across countless participants (traders, consumers, firms), and policymakers are often blind to much of that decentralized data. No single entity can gather it all.
These "3 Nevers" transform research from a fixed, backward-looking snapshot into a forward-looking, emergent journey. Bayesian Inference is the statistical method best equipped to manage this reality.
Priors Made Explicit: Bayesian inference forces researchers to define their initial beliefs, making hidden biases and assumptions transparent. This moves the conversation away from the false promise of "objective truth" toward honest, structured belief revision.
Continuous Updating: Each new study, whether a successful replication or a surprising null result, is treated as a signal to adjust a posterior belief. Replication failures are not a crisis; they are an opportunity for growth and a necessary correction to the model.
Resilience and Accuracy: By integrating new evidence and adapting to context, Bayesian methods are more resilient to the shifting sands of human behavior. They trade the short-term precision of a Frequentist p-value for the long-term accuracy which emerges from disciplined, iterative belief revision.
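The "continuous updating" principle above can be sketched with a conjugate Beta-Binomial model. The study outcomes below are hypothetical, chosen only to show the mechanics: note how a near-null replication revises the posterior downward rather than triggering a pass/fail crisis.

```python
# Sketch of continuous Bayesian updating across a series of studies.
# The Beta-Binomial model is standard; the study results are hypothetical.

def update(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior on a binomial rate."""
    return alpha + successes, beta + failures

# Explicit prior: Beta(1, 1) is a flat "we do not know yet" starting belief.
alpha, beta = 1.0, 1.0

# Each tuple is one hypothetical study: (successes, failures).
studies = [(18, 2), (11, 9), (14, 6)]  # the second is a near-null replication

for s, f in studies:
    alpha, beta = update(alpha, beta, s, f)
    mean = alpha / (alpha + beta)
    print(f"after study ({s}, {f}): posterior mean = {mean:.3f}")
```

The strong first study pulls the posterior mean high, the near-null second study pulls it back down, and the third settles it in between—each result is a signal for revision, never a verdict to accept or discard.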
Embracing Bayesian inference means adopting a statistical framework which mirrors the adaptive nature of the brain. It is the most appropriate way to ensure AI and human learning serve to illuminate the whole room, rather than confining our search for truth to the small, familiar circle under the streetlamp. This structured approach to belief revision is the core tenet of the Making Choices, Making Money framework.
Conclusion: Humbly Adapting the Legacy
Galton, Pearson, and Fisher were indeed hardcore racists whose foundational work was motivated by a desire to prove their repulsive eugenic theories. We must state this history plainly and acknowledge the profound harm it fueled.
However, the statistical tools they created—statistical moments, correlation, regression, ANOVA—are not inherently evil. These methods are mathematically robust and remain immensely useful when applied appropriately. The frequentist statistical framework is a powerful, elegant machine, but it is one designed for a world of static physics, not dynamic human complexity.
The path forward is not to abandon their technical contribution, but to humbly embed it within a Bayesian framework. This approach recognizes the value of the Frequentist tools (the likelihood and the data) while correcting the fatal flaw of their premise. By adopting Bayesian inference, we formally admit the "3 Nevers" of knowledge, commit to continuous updating, and force the explicit declaration of our individual and societal priors. This structured decision-making process, exemplified by the Definitive Choice tool, helps individuals manage their biases, integrate evidence, and build a lifelong framework for long-term success and wealth-building strategies.
This adaptation allows us to use the analytical brilliance of the founders' tools for ethical, adaptive, and credible science. It transforms statistics from a weapon of confirmation bias into a living system of honest inquiry, moving us beyond the constraints of a rigid, outdated premise and toward a science which truly serves the ongoing, emergent journey of human knowledge.
Resources for the Curious
Bayes, Thomas. "An Essay towards Solving a Problem in the Doctrine of Chances." Communicated by Richard Price. Philosophical Transactions of the Royal Society of London 53 (1763): 370–418.
Clayton, Aubrey. Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. Columbia University Press, 2021. Demonstrates the historical fragility of Frequentism and its contribution to the Replication Crisis.
Hayek, Friedrich A. The Pretence of Knowledge. Nobel Memorial Lecture, 1974. Articulates the limits of centralized knowledge, a core concept behind the "3 Nevers" of data.
Hulett, Jeff. How We Learn is How We Discriminate. The Curiosity Vine, 2023. Connects the neurobiological process of pattern recognition to the cognitive mechanism underlying social prejudice.
Hulett, Jeff. Making Choices, Making Money: Your Guide to Making Confident Financial Decisions. PFR Publishing, 2025 (2nd Ed). Offers a practical, structured decision-making framework emphasizing the integration of personal priors and evidence.
Hulett, Jeff. The Replication Crisis in Academia: A Premise Problem, Not a Research Problem. The Curiosity Vine, 2025. Explores the limits of frequentist assumptions in social science and advocates for Bayesian updating.
Jaynes, Edwin T. Probability Theory: The Logic of Science. Cambridge University Press, 2003. Provides the foundational philosophy and mathematics for the Bayesian viewpoint, framing probability as an extension of logic.
Kahneman, Daniel. Thinking, Fast and Slow. Farrar, Straus and Giroux, 2011. Details cognitive biases, including confirmation bias, which links learning to prejudiced interpretation of evidence.
Kevles, Daniel J. In the Name of Eugenics: Genetics and the Uses of Human Heredity. Harvard University Press, 1995. Provides comprehensive historical context on the eugenic motivations of statistics founders like Galton and Pearson.
Open Science Collaboration. "Estimating the Reproducibility of Psychological Science." Science 349, no. 6251 (2015): aac4716. Presents empirical evidence illustrating the scope of the Replication Crisis in social science.
Pearson, Karl. National Life From the Standpoint of Science. A lecture, 1900. Serves as a primary source documenting Pearson's racist and eugenic philosophical views underlying his statistical work.