History Resembles Itself: Hayek exposed Galton's statistics-based racism... but Galton's work lives on in today's technology companies

Jeff Hulett
Apr 10, 2024
Updated: Apr 10, 2024

The well-intended statistical sciences have a history of divisiveness and racism. At the hands of Francis Galton, the powerful applied mathematical science was used to create eugenics as a foundation for Nazism and other state-based discrimination and genocides. Even the word "statistics" has its etymology in the role of the "state" and government usage. At its most basic, statistics is a form of discrimination. Our brain naturally learns by discriminating - like the difference between hot and cold or sweet and sour. Statistics is an approach to discriminate by interrogating a data sample as a path to that learning.

The Nobel laureate F.A. Hayek was dedicated to clarifying the role and limits of statistics, data, and knowledge, especially in policy applications.

This is especially important today, as Galton's statistical concepts underpin data science and the way data science is used to discriminate among the consumers of platform products. While state-based discrimination targeting skin color and other characteristics is on the decline, economic discrimination by powerful technology companies is on the rise.

By owning your statistics and data knowledge, there is a path to countering the relentless incentives of powerful technology companies. This is the path of the successful data explorer.

About the author: Jeff Hulett is a career banker, data scientist, behavioral economist, and choice architect. Jeff has held banking and consulting leadership roles at Wells Fargo, Citibank, KPMG, and IBM. Today, Jeff is an executive with the Definitive Companies. He teaches personal finance at James Madison University and provides personal finance seminars. Check out his new book -- Making Choices, Making Money: Your Guide to Making Confident Financial Decisions -- at jeffhulett.com.

The sordid history of statistics

The normal distribution as a natural standard. The Galton Board was invented by Sir Francis Galton (1822-1911) [i]. It demonstrates how a normal distribution arises from the combination of a large number of random events. Imagine a vertical board with pegs. A ball is dropped on the top peg. Under gravity, the ball has a 50% chance of falling to the left of the peg and a 50% chance of falling to the right. The same '50% left or right' probability applies at every peg the ball contacts as it falls through the board. Think of each peg as a simple situation with only 2 possible outcomes, left or right. Gravity is the operative natural phenomenon captured by the Galton Board's design. After many balls are dropped, the result approximates a normal distribution: more balls land near the center than in the tails of the distribution.

Feel free to play with the Galton Board simulation next. Below the box, please activate the "high speed" and "histogram" boxes. Then activate the single arrow to the far left to initiate the simulation. Watch as the Galton Board works its magic. The result is a normal distribution!

Thanks to Wolfgang Christian for this wonderful digital rendering of The Galton Board.

This shows what happens when elements of nature, like atoms or molecules, act independently in a calm, natural system. The outcome often resembles a normal distribution. In the Galton Board, the 'elements of nature' are represented by the balls. The 'natural system' is represented by gravity and the pegs. At the time, this was a massive achievement of understanding. While Carl Gauss (1777-1855) and others had theorized central tendencies, Galton created practical frameworks for their application. This understanding is the basis for predictive analytics and data science used today. All data scientists have a foundation in Galton-influenced statistical mechanics.
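For readers who prefer code to pegs, the board's mechanics can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation credited above: the function names, the 12 rows of pegs, the 10,000 balls, and the fixed random seed are all assumptions chosen for the example. Each peg is modeled as a fair coin flip, and a ball's final bin is simply the number of times it bounced right.

```python
import random
from collections import Counter


def drop_ball(rows, rng):
    """Drop one ball through `rows` pegs; return how many times it bounced right."""
    # Each peg is an independent 50/50 left-or-right event.
    return sum(rng.random() < 0.5 for _ in range(rows))


def simulate_galton_board(balls=10_000, rows=12, seed=0):
    """Return the ball count in each of the rows + 1 bins at the bottom of the board."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter(drop_ball(rows, rng) for _ in range(balls))
    return [counts.get(bin_index, 0) for bin_index in range(rows + 1)]


if __name__ == "__main__":
    bins = simulate_galton_board()
    peak = max(bins)
    # Print a sideways text histogram, one line per bin.
    for bin_index, count in enumerate(bins):
        bar = "#" * round(40 * count / peak)
        print(f"{bin_index:2d} {bar} {count}")
```

Running the script prints a text histogram that bulges in the middle bins and thins toward the edges. Strictly speaking, the counts follow a binomial distribution, which looks more and more like the normal bell curve as the number of peg rows grows.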

But there was a problem - Francis Galton was a hardcore racist. In his day, racism was commonly accepted, as was the belief that certain races of people were naturally superior to others. Galton went further and used his brilliant statistical insights to create an infrastructure of human discrimination called eugenics. Among other uses, a) the Nazis used Galton’s ideas to "genetically" support their genocide and b) the United States used eugenics extensively. Many states had laws allowing "state institutions to operate on individuals to prevent the conception of what were believed to be genetically inferior children."

Author, physician, and biologist Siddhartha Mukherjee said:

“Never before in history, and never with such insidiousness, had genes been so effortlessly conflated with identity, identity with defectiveness, and defectiveness with extermination.”

The author abhors and is saddened by Galton’s attitude, the general attitudes of the day, and the uses of eugenics. However, the author does find Galton’s statistical research and development very useful. Also, the Galton Board is a very helpful explanatory tool for the normal distribution. History is replete with people whose motivations were misguided, as judged by history, but who created useful tools or technologies for future generations.

Another interesting historical note: Galton was a blood relative, a younger half-cousin, of Charles Darwin. The story goes that Galton was driven by family rivalry to make his own historical name.

Hayek and scientism. As a counterpoint to Galton's misguided use of statistics, Nobel laureate F.A. Hayek popularized the word “scientism” to mean:

“not only an intellectual mistake but also a moral and political problem, because it assumes that a perfected social science would be able to rationally plan social order.” [ii]

It is that scientistic assumption - “a perfected social science would be able to rationally plan social order” - that is likely to lead to a systemically biased outcome. This was certainly the case in Galton's eugenic applications of statistics. Galton and his contemporaries assumed racial superiority and used the patina of science to justify their beliefs. This was a catastrophically sad example of groupthink and confirmation bias.

F.A. Hayek came of age in pre-World War II Europe. He was born in Austria and moved to London as Nazism grew in post-World War I Europe. The contrast between Galton and Hayek is stark.

Hayek was a classical libertarian who believed individual choice and responsibility were essential for the efficient functioning of sovereign economies. Hayek believed there was an important place for the rule of ex-ante law.

The author's Hayekian interpretation is that law should be considered like guard rails. Society should carefully consider and only implement necessary guard rails. Then, individuals should freely choose within those guard rails. Hayek would have liked the ex-ante provision in the Tenth Amendment to the U.S. Constitution, where "The powers not delegated to the United States by the Constitution" are reserved to the states or to the people.

Hayek's classical libertarian approach recognizes people will certainly make mistakes and poor choices. The Hayekian perspective is that individual choice and responsibility, while imperfect, are better than other approaches to resource allocation. Individual mistakes and choices are superior to those of a government bureaucrat making choices on the individual’s behalf. This is for three reasons:

Information power: The individual has more local information about the decision and how it will likely impact them. The bureaucrat will use summary information to make a policy decision on the average for a diverse population. Unless the individual happens to be at the average and tracks the average over time, the bureaucrat WILL make less accurate decisions than those available to the individual.
Error-correction power: The feedback loop effectiveness for the individual in the case of a poor decision is much greater. The power of the error-correcting incentives is far greater for the individual who must live with the decision than that of the bureaucrat deciding on behalf of the individual. While the individual wishes to avoid the pain and loss associated with a poor decision, the bureaucrat wishes to do their job at least well enough to earn their paycheck. The bureaucrat has comparatively little skin in the game.
Ownership power: An individual is likely to be highly motivated when they have a sense of ownership over their choice. Conversely, an individual will be highly de-motivated when that choice is made for them. This power - like an ‘unalienable right’ - is a foundational motivation for the American Declaration of Independence: "We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness."
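The information power argument can be sketched numerically. The following Python example is a toy model under loudly stated assumptions, not evidence for the claim: each person has a hypothetical "best" value (centered on 50 with a spread of 10), the bureaucrat applies the population average to everyone, and each individual estimates their own best value with a smaller error (2). All function names, parameters, and numbers are invented for illustration.

```python
import random
import statistics


def mean_absolute_error(choices, needs):
    """Average distance between what was chosen and what each person needed."""
    return statistics.fmean(abs(c - n) for c, n in zip(choices, needs))


def compare_policies(people=10_000, spread=10.0, local_noise=2.0, seed=0):
    """Return (central_error, local_error) for a hypothetical population."""
    rng = random.Random(seed)
    # Assumption: each person's "best" value varies around 50 with the given spread.
    needs = [rng.gauss(50.0, spread) for _ in range(people)]
    # Bureaucrat: one decision for everyone, set at the population average.
    average_policy = statistics.fmean(needs)
    central_choices = [average_policy] * people
    # Individual: knows their own need imperfectly (assumed error = local_noise).
    local_choices = [need + rng.gauss(0.0, local_noise) for need in needs]
    return (
        mean_absolute_error(central_choices, needs),
        mean_absolute_error(local_choices, needs),
    )


if __name__ == "__main__":
    central_error, local_error = compare_policies()
    print(f"average-based policy error: {central_error:.2f}")
    print(f"individual choice error:    {local_error:.2f}")
```

In this sketch, the average-based policy misses by more than the individuals do because the population is more diverse than the individuals are mistaken about themselves. The result flips if `local_noise` is set larger than `spread`, which is a reminder that the conclusion rests on the assumption that people know their own situation better than a population summary does.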

For a deeper F.A. Hayek perspective in the context of a modern example, please see:

A Question Of Choice: Optimizing resource allocation and an HOA example

The new discrimination and the reemergence of Galton

Data and algorithms are related, but modern technology firms use them in very different ways. Data represents our past reality; it has already happened. An algorithm is a tool that transforms data in order to predict and impact the future. Sometimes that data-transforming algorithm is helpful to you. More often today, it is even more helpful to an organization trying to sell you something - like goods, services, or political candidates. An organization's algorithm often serves other purposes, including maximizing shareholder profit or filling political party coffers. But why is this?

Generally, public companies have 4 major stakeholders or "bosses to please" and you - the customer - are only one of the bosses. Those stakeholders are:

The shareholders,
The customers (YOU),
The employees, and
The communities in which they work and serve.

Company management makes trade-off decisions to please the unique needs of these stakeholder groups. In general, available capital for these stakeholders is a zero-sum game. For example, if you give an employee a raise, these are funds that could have gone to shareholder profit or one of the other stakeholders.

This means that, if all four stakeholders are weighted equally, the organizational investment and attention devoted to your benefit as a customer is one in four, or 25%. The customer weight could certainly be below 25%, especially during earnings season. Objectively, given the competing interests and tradeoffs, this means a commercial organization's algorithms are not explicitly aligned with customer welfare. Often, the organization's misaligned algorithm behavior is obscured from view. This obscuring is often facilitated by the organization's marketing department. Why do you think Amazon's brand image is a happy smiley face? :) The brand mask obscures the view of the less-aligned remaining stakeholders. [iii]

Today, computing power is both ubiquitous and inexpensive. Data stores now sit on easy-to-access cloud networks. Also, many consumers are willing to trade personal data for some financial gain or entertainment. While this attitude is subject to change, this trade seems to be working for both the consumers and those companies providing the incentives. This is the 'dopamine trade' as shown in the next graphic and why people are willing to give up their data to social media platforms. The ‘dopamine trade’ relates to the reward neurotransmitter dopamine and the impact it has on building addiction via reinforced synaptic activation.

The habit-forming nature of dopamine is related to many human activities including gambling, drugs, alcohol, smoking, and social media. Of this list, only social media is generally not legally controlled for children in the United States. This is why Hayekian personal responsibility is so important. While stricter laws protecting children from social media will likely occur someday, we may lose a generation to higher suicide rates and related psychological challenges because of the delay. Personal responsibility and choice are your greatest assets to counter the effects of the dopamine trade.

The challenge of regulating smoking, and the deaths from lung cancer that accumulated in the meantime, offer a sobering reminder. Today's technology companies and yesteryear's tobacco companies have something in common. Author and social activist Upton Sinclair's message describes Big Tobacco's challenging past and what can be expected from Big Tech:

"It is difficult to get a man to understand something when his salary depends upon his not understanding it."

See: A Content Creator Investment Thesis - How Disruption, AI, and Growth Create Opportunity. This article provides background for why people are willing to give up their data to the social media platforms.

Success in the data abundance age

Data abundance is the defining characteristic of today's information era. Success comes from understanding your essential data and leveraging that data with available computing technology.

Today's challenge concerns using abundant data and leveraging technology to serve human-centered decisions. Our formal math education systems have been slow to change and tend to favor former industrial era-based computation needs over information era-based data usage. [iv] This is unfortunate but only emphasizes the need to build and practice your statistical understanding even if you did not learn it in your formal education.

There is no question that technology and data have created tremendous economic value throughout society. However, that value is not evenly distributed and may cause individual harm. This is the clarion call for being an active data explorer. Let me leave you with the three habits of a successful data explorer:

Learn from our past reality,
Update our beliefs, and
Make confidence-inspired decisions for our future.

For more context, an approach, and supporting tools for being a successful data explorer and using data as a countermeasure to the big technology companies, please see:

Nurture Your Numbers: Learning the language of data is your Information Age superpower

Notes

[i] Editors, Eugenics and Scientific Racism, National Human Genome Research Institute, last updated, 2022

[ii] Editors, Hayek and the Problem of Scientific Knowledge, Liberty Fund, accessed 2024

[iii] For more context on large consumer brands and their use of algorithms please see the next article's section 5 called "Big consumer brands provide choice architecture designed for their own self-interests."

Top 6 reasons why Personal Finance success starts with choice architecture

The focus on data will help you make algorithms useful to you and identify those algorithms and organizations that are not as helpful. Understanding your data in the service of an effective decision process is the starting point for making data and algorithms useful.

While the focus is on the data, please see the next article links for more context on algorithms:

An approach to determine algorithm and organizational alignment in the Information Age:

Platform Life: How investors and consumers can survive and thrive in the platform economy

How credit and lending use color-blind algorithms but accelerate systemic bias found in the data:

Resolving Lending Bias - a proposal to improve credit decisions with more accurate credit data

[iv] The challenge of how high school math is taught in the information age is well known. The good news is that it is recognized that the traditional, industrial age-based high school "math sandwich" of algebra, geometry, trigonometry, and calculus is not as relevant as it used to be, whereas information age-based data science and statistics have dramatically increased in relevance and necessity. The curriculum debate comes down to purpose and weight.

Purpose: If the purpose of high school is to a) prepare students for entrance to prestigious colleges requiring the math sandwich, then the math sandwich may be more relevant. If the purpose of high school is to b) provide general mathematical intuition to be successful in the information age, then the math sandwich is much less relevant. I argue the purpose of high school for students should be b, with perhaps an option to add a for a small minority of students. Also, it is not clear whether going beyond a should be taught in high school or be part of the general college education curriculum or other post-secondary curriculum. Today, the math sandwich curriculum alone lacks relevance for most high schoolers. As many educators appreciate, students are unlikely to learn material that lacks relevance to them.

Weight: Certainly, the basics of math are necessary to be successful in statistics or data science. To be successful in b) one must have a grounding in a). The reality is that high school has a fixed 8-semester time limit. (Education entrepreneurs like Sal Khan of Khan Academy, by the way, argue against tying mastery to a fixed time period.) But, for now, let's assume the 'tyranny of the semester' must be obeyed. As such, the courses that are taught must be weighed within the fixed time budget. Then, the practical question is this: "If statistics and data science become required in high school, which course comes out?" I suggest the math sandwich curriculum get condensed to 4 to 5 semesters, with the information age curriculum being emphasized in 3 to 4 semesters.

The tyranny of the semester can be overcome with education platforms like Khan Academy. Since the high school math curriculum increasingly lacks relevance, an enterprising learner or their family can take matters into their own hands. Use Khan Academy outside of regular class to learn the data science and statistics-related classes you actually need to be successful in the information era.

Stay Curious.
