
Good decision-making and financial services: The surprising impact of bias and noise

Updated: Jan 24, 2022

This article helps you make better decisions. We define the difference between accuracy and precision and how they relate to bias and noise. We provide financial services industry examples and a compliance case study. Finally, we offer three actionable solutions to help you decrease noise, increase precision, and dramatically improve your decision quality.

This article is presented with the following sections:

  1. A Compliance Case Study - How to improve your CRA

    1. CRA background

    2. CRA infrastructure foundation

    3. How a quantitative CRA reduces noise

  2. High-quality decision-making solutions

    1. Decision solution 1 - The Noise Audit

    2. Decision solution 2 - Quantification of the CRA decision

    3. Decision solution 3 - Cognitive Risk and Controls

 

Background - the nuances of accuracy and precision + bias and noise

Scientists aim for the highest degree of precision and accuracy. This is another way of saying that scientists maximize explanatory power by minimizing error. At times they make rough estimates to support a scientific decision, such as an inference. Their study may carry uncertainty or lack critical information, and minimizing that uncertainty is a key to success. To express uncertainty, they use two measures: accuracy and precision.


Business people face similar challenges, especially as they relate to potential noise and bias in their operating processes. This is particularly true as Artificial Intelligence and Robotics become more commonplace. For example, algorithms may have embedded biases that are hard to detect. It is widely recognized that the use of Artificial Intelligence, if not governed correctly, can institutionalize bias at high volume and high impact. Please see our Risk Management Association Journal article Statistics and AI-Enhanced Automation in Banking Transaction Testing. The article explores analytically focused organizations that perform statistical and automation-based testing and analytics.


So, whether in science or business, our understanding of how we make decisions is critical. To proceed down a good decision-making path, let's start with the nuanced definitions of accuracy and precision.


Accuracy is defined as how close a measurement is to a standard or accepted value. This is impacted by the amount of bias in the process.

Precision is how close measurements or results are to each other. This is impacted by the amount of noise in the process.


It is possible to be very accurate without being precise, and to be very precise but not very accurate. It is also possible to suffer both accuracy and precision problems at once.


To further explain, we can use a dartboard example to help distinguish between precision and accuracy. Let's use the bulls-eye of the dartboard as the standard (true value). The closer the darts land to the bulls-eye, the more accurate the throws. If all the darts land in one area, anywhere on the dartboard, and very close to each other, that is precision. We also distinguish between process and outcome. Generally:

  • The amount of noise in a process maps to the precision of the outcome (e.g., a high-noise process leads to a low-precision outcome).

  • The amount of bias in a process maps to the accuracy of the outcome (e.g., a high-bias process leads to a low-accuracy outcome).

Also, in our dartboard example: 1) each team has five unique dart throwers, each throwing one dart, and 2) the "X" represents where a dart lands.

For Team A, all the darts landed close to the bulls-eye, and they are very close to each other. Therefore, the outcome is both precise and accurate. The outcome was the result of an unbiased, low-noise process.


For Team B, the darts all land very close to each other, but far away from the bulls-eye. Therefore, there is precision but not much accuracy. Team B demonstrates a biased process because it is systematically off target.


For Team C, the darts are not very close to the bulls-eye but are evenly spaced around it, so their average position is near the bulls-eye. This shows that there is accuracy but not much precision. Team C demonstrates a noisy process because of the low-precision outcome.


For Team D, all the darts are widely scattered but land to the left of the bulls-eye. Like Team B, they lack accuracy; like Team C, they lack precision. This results from a process that is both noisy and biased.
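
To make the distinction concrete, here is a minimal Python sketch that scores each team. The throw coordinates are invented for illustration (they are not read off the graphic); bias is measured as the distance of the cluster's center from the bulls-eye, and noise as the spread of the throws around their own center:

```python
import math

# Illustrative throw coordinates for each team's five darts (x, y), with the
# bulls-eye (the standard) at the origin. These numbers are made up for
# demonstration; they are not taken from the graphic.
teams = {
    "A": [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1), (0.1, 0.1), (-0.1, -0.1)],    # accurate & precise
    "B": [(2.0, 2.1), (2.1, 2.0), (1.9, 2.0), (2.0, 1.9), (2.1, 2.1)],        # precise, not accurate
    "C": [(1.5, 0.0), (-1.5, 0.0), (0.0, 1.5), (0.0, -1.5), (0.1, 0.1)],      # accurate on average, not precise
    "D": [(-3.0, 1.0), (-1.0, -2.0), (-4.0, -1.0), (-2.0, 2.5), (-3.5, 0.5)], # neither
}

for name, throws in teams.items():
    n = len(throws)
    # Bias: distance between the centroid of the throws and the standard (origin).
    cx = sum(x for x, _ in throws) / n
    cy = sum(y for _, y in throws) / n
    bias = math.hypot(cx, cy)
    # Noise: RMS spread of the throws around their own centroid. Note this
    # needs no knowledge of where the bulls-eye is.
    noise = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in throws) / n)
    print(f"Team {name}: bias={bias:.2f} (accuracy), noise={noise:.2f} (precision)")
```

Notice that the noise calculation never uses the bulls-eye location, which previews the "flipping the target" idea discussed below.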


Please note, accuracy is based on distance from a standard. Standards are generally a human construct and can change; what is accurate today may not be accurate tomorrow. Accuracy and bias are also related. Since some standard is necessary to determine accuracy, it follows that the same standard could be related to a biased outcome. In that sense, bias is a function of the standard setter: the standard itself will drive the perception of bias. A related logical fallacy is "moving the goalposts," which happens when the parameters of an argument are changed after a conclusion is reached. As such, it is important to inspect both the standard and the measure when determining accuracy.

Daniel Kahneman's book Noise uses the metaphor of "flipping the target." (See the graphic; it is the same as the prior dartboard graphic, except the bulls-eyes have been removed.) The metaphor suggests that bias (accuracy) requires an understanding of the standard (the location of the bulls-eye), whereas noise (precision) does not. Precision only requires understanding the relative distance between system outcomes (the dart cluster). When you flip the target, you can see the relative spread of the outcomes, but you cannot see the distance from the standard. In other words, you know that Teams D and C are not very precise as a result of a noisy process, and that Teams B and A are very precise as a result of a low-noise process. However, there is no way to determine bias, or the resulting accuracy, without the bulls-eye standard. This difference becomes important in the Compliance Risk Assessment case study later in the article.

 

Noise and Bias: Financial Services examples

Bias tends to be easier to see. In other words, you can see potential bias in individual examples; you do not need to take a statistical view to perceive it. If a loan applicant who is a member of a protected class (e.g., race, gender, age, sexual orientation) is declined for a loan, that is potential bias. It would actually take statistical analysis to demonstrate that the decision was not a biased judgment.


Also, bias in loan decision-making is generally legally discouraged. For example, bias in mortgage lending is already regulated by a variety of laws and regulations, including Fair Lending, the Home Mortgage Disclosure Act (HMDA), the Equal Credit Opportunity Act (ECOA), and the Fair Credit Reporting Act (FCRA). These laws regulate loan-related bias via an outcome-based approach. That is, HMDA testing is required to ensure certain lending biases against defined protected classes are monitored and, if detected, corrected. For example, if a loan underwriter makes a biased judgmental decision involving a protected class, the outcome should be detected via HMDA testing and remediated. Also, under FCRA and ECOA, a lender is required to give a declined loan applicant an explanation of why the loan was declined (the "adverse action" notice). Because lending bias is already regulated, monitored, and remediated in these ways, decision-making noise may be the more significant problem in the financial services context.


Noise tends to be harder to see. In other words, noise can generally only be confirmed with an "outside-in" statistical analysis. It is very hard to perceive noise in individual “inside-out” examples. (1)


As a real-world example, in a recent paper for Royal Society Open Science, "Quantifying the cost of decision fatigue: suboptimal risk decisions in finance," Tobias Baer and Simone Schnall examine the credit decisions of loan officers at a leading bank over the course of their working day. The academics write that decision fatigue "typically involves a tendency to revert to the 'default' option, namely whatever choice involves relatively little mental effort". In other words, as decision-makers tire, they are less likely to stray from the mentally less challenging decision. So the time of day matters. This is a prime example of noise that may ultimately make the decision less precise. In another example of time-of-day noise, a much-cited study of Israeli judges found they were less likely to grant parole as lunch approached, but more lenient once their stomachs were full.


To put a finer point on noise, there are distinct kinds of it. The prior example is known as occasion noise, where current external circumstances create noise. But what about two underwriters with different life experiences and training? That variability is internal to the underwriter and stable over time. This is known as stable pattern noise. In the underwriting context, stable pattern noise is managed by having more experienced underwriters handle more complex cases. But even within case segments, we can expect underwriters to exhibit stable pattern noise, because life experiences and personality traits may produce different judgments even among those with similar training and professional experience. This may be checked with a quality control challenge process, with independent "checkers" challenging the "makers." But even with QC, professionals are generally afforded decision-making latitude, and that latitude is the breeding ground for noise. (2)

 

CRA Case Study

To dig deeper into bias and noise, the following is a more in-depth Compliance Risk Assessment (CRA) case study. CRA, in the financial services context, is a process that provides a risk rating along multiple risk dimensions. This may include Inherent Risk, Control Effectiveness, and Residual Risk.

  • Inherent Risk - the risk that stems from the nature of the business transaction or operation, before internal controls are implemented to mitigate it.

  • Control Effectiveness - the quality of the internal controls in mitigating inherent risk, including the impact of an existing internal control that malfunctions.

  • Residual Risk - is the remaining risk after Inherent Risk and Control Effectiveness are taken into account.
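
As a toy illustration of how the three dimensions fit together, the following sketch uses a common linear convention: residual risk equals inherent risk discounted by control effectiveness. The scales and the formula are assumptions for illustration, not a prescribed methodology:

```python
def residual_risk(inherent: float, control_effectiveness: float) -> float:
    """Discount inherent risk by the share of it the controls mitigate.

    inherent: 0-100 scale; control_effectiveness: 0.0-1.0.
    This linear discount is one common convention, not the only one.
    """
    return inherent * (1.0 - control_effectiveness)

# A lending entity with high inherent risk but strong controls:
print(residual_risk(inherent=80, control_effectiveness=0.75))  # -> 20.0
```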

In my experience, CRAs are performed judgmentally by professional compliance officers specializing in particular financial services products and related regulations (e.g., Commercial vs. Consumer, or Lending vs. Deposits vs. Investments). Today, there is a trend toward a structured, quantitative CRA approach that allows for qualitative overlays where necessary. This is reminiscent of the long-standing Allowance for Credit Losses (ACL) approach used for credit risk. I am very supportive of quantifying the CRA. Consistent with the theme of this article, it will increase both accuracy and precision by reducing potential noise and bias in the CRA process. An added business benefit is that it should reduce costs by increasing the number of assessments a compliance professional may carry.


While bias is certainly possible in CRAs, most of the error will likely be captured within the noise. Keep in mind, noise and bias are independent but may have common causes. For example, the assessment posture that creates an assessor's level noise may also cause bias. Significantly, though, noise will always manifest first. This is because judgmental outcomes susceptible to bias are often governed by oversight entities (such as regulators or internal auditors), whose assessments are infrequent and not always comprehensive, and whose standards may change. Noise, by contrast, can be detected almost immediately. As such, managing CRA success should be focused primarily on noise reduction.


CRA infrastructure foundation

Building your foundation is critical to creating a high-precision, low-noise, quantitatively enabled CRA environment. Four foundational items include:

  1. Assessment Entity Taxonomy - a capability to identify and organize your system-wide assessment entities;

  2. Obligation Inventory Management - includes a process to regularly rate and update obligations, specific to the obligation sources. These are generally related to Inherent Risk;

  3. Assessment Entity Mapping - this is the mapping that builds on the taxonomy and enables obligations and issues to be mapped to the individual entities; and,

  4. Issues and Control Inventory - a process to rate and maintain issues, specific to their sources, and to inventory controls and their key descriptive elements. These are generally related to Control Effectiveness.
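
To show how these four items might hang together as structured data, here is a minimal sketch of a data model. The class and field names are assumptions for illustration, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Obligation:              # item 2: obligation inventory
    obligation_id: str
    source: str                # e.g., the regulation or statute of origin
    inherent_risk_rating: int  # refreshed on a regular cycle

@dataclass
class Control:                 # item 4: control inventory
    control_id: str
    description: str
    effectiveness_rating: float

@dataclass
class Issue:                   # item 4: issue inventory
    issue_id: str
    source: str
    severity: int

@dataclass
class AssessmentEntity:        # item 1: taxonomy node; item 3: mappings
    entity_id: str
    taxonomy_path: str         # e.g., "Consumer/Lending/Mortgage"
    obligations: list[Obligation] = field(default_factory=list)
    controls: list[Control] = field(default_factory=list)
    issues: list[Issue] = field(default_factory=list)
```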

Naturally, the technology environment will be important. There are a few key tools for your technology stack.

  • CRA database to house structured data. This will be used for data to support both the quantitative ratings and the descriptive information to inform the qualitative assessment.

  • CRA workflow to manage the CRA process. This will be integrated with key data sources to provide structured workflow, queuing, and reminders.

  • CRA cognitive tools - tools to help evaluate control descriptions and assess control quality. This may also include unstructured-data ingestion capabilities to structure data from documents or other unstructured sources required for the CRA.

How a quantitative CRA reduces noise

First, let's define CRA-related noise. Noise is decision variability. A variable decision is not necessarily right or wrong, since the performance standard is not available; it is the variability itself that creates the total error.

  1. Level noise - Some CRA assessors are tougher than others; some may be more business-friendly. The point is, an assessor's overall assessment posture is a form of noise and will drive variability error across assessors and across the assessment entities.

  2. Stable pattern noise - This relates to an assessor's unique personality, country of origin, industry experience, training, and other stable traits. It may drive contrasts in how different assessment entities are considered. For example, if assessment entities are assigned at the country level, the assessor's country of origin will make a difference in country understanding. The amount of training and experience will shape the perception of risks and control effectiveness. An assessor's sensitivity to challenge from the assessment entity may create noise. An assessor's stable pattern noise may impact certain assessment entities more than others.

  3. Occasion noise - This relates to the particular occasion on which an assessor performs the assessment. For example, assessments made late in the day, when energy levels are lower, may differ from those made earlier in the day (as introduced earlier in the "decision fatigue" example). Assessments made when the assessor is hungry may differ from those made when they are full. Assessments made at the end of a busy reporting period may differ from those made at a less busy time.

Noise is cumulatively additive; it does not "average out." By reducing the amount of human judgment in a decision, by definition, potential noise (3) will be reduced. This chart demonstrates the reduction in potential noise. You may read the chart by interpreting any variance from the 0 horizontal axis as potential noise. For example, for the yellow decision segment (75% Judgmental, 25% Quantitative), the first 25% of information value utilized for the assessment has zero noise because it is based on quantitative measures. The remaining 75% of information value has increasing noise as judgmental information is derived and evaluated. A 100% judgmental decision process will have 4.5x as much potential noise as a 75% judgmental process, and 24x as much potential noise as a 50% judgmental process. The two takeaways are:

  1. Potential noise reduction is exponentially related to reduced human judgment.

  2. Significant noise reduction can be made by small swaps of quantitative judgment for human judgment.

The moral of the story: do not get too hung up on capturing 100% of the information value in quantitative measures. Start down the path of a quantitative CRA. Even a small quantitative implementation can significantly increase CRA precision by decreasing potential noise. The measures may not be as accurate as desired at first, and the data feeding the quantitative CRA ratings may require cleansing; the data will improve over time. All the while, precision is enhanced via the quantitative CRA.
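
For readers who want to experiment, here is a rough sketch of the compounding idea behind the chart. The chart's exact model is not specified, so the per-step growth rate below is an assumption; the printed ratios are indicative of the pattern (judgmental share down, potential noise down exponentially), not a reproduction of the chart's exact figures:

```python
# Sketch of the potential-noise model: quantitative steps contribute zero
# noise, and each human judgment's noise compounds on the judgmental steps
# before it. GROWTH is an assumed rate chosen for illustration only.
GROWTH = 1.06   # assumed compounding factor per judgmental step
STEPS = 100     # the underlying model assumes 100 separate judgments

def potential_noise(judgmental_share: float) -> float:
    """Total potential noise for a given mix of judgmental vs. quantitative steps."""
    judgmental_steps = round(STEPS * judgmental_share)
    return sum(GROWTH ** k for k in range(judgmental_steps))

baseline = potential_noise(1.00)   # 100% judgmental
for share in (0.75, 0.50, 0.25):
    ratio = baseline / potential_noise(share)
    print(f"{int(share * 100)}% judgmental: {ratio:.1f}x less potential noise")
```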

Another way to look at it - Decision Anchoring (4)

A predefined quantitative CRA will reduce judgmental noise by giving assessors a consistent anchoring starting point. Many studies have shown people are more comfortable making a decision as a change from a credible anchor. If a quantitative anchor is not provided, people will naturally anchor on noisy reference points, such as their own stable patterns or the last CRA.

 

High-quality decision-making solutions

At this point, hopefully, we have convinced you of two things:

  1. The importance of understanding the nuanced differences between accuracy and precision; and,

  2. The need to focus on noise as a practical and well-supported starting point for improving decision quality.

Next, we will jump into the solutions. We will continue using the CRA as our implementation example. As I am sure you appreciate, this example could be applied to many decision contexts, especially those with minimal quantitative decision support. We present three solutions:

Decision solution 1 - The Noise Audit

Decision solution 2 - Quantification of the CRA decision

Decision solution 3 - Cognitive Risk and Controls


Decision solution 1 - The Noise Audit

If you have decision processes involving human judgment, you very likely have noise that reduces decision precision. To start, a Noise Audit endeavors to answer the following questions:

  1. How much noise do you have?

  2. What is the noise impact on the decisions?

  3. What should you do about the noise?

In a Noise Audit, multiple individuals judge the same problems. Noise is the variability of those judgments. The Noise Audit outcome is to determine the amount and implications of the noise components (level, stable pattern, and occasion noise). A Noise Audit is applicable to any decision process involving human judgment, even if the judgments already have quantitative support. Unlike bias, noise 1) is hard to detect without a statistically based audit and 2) is immediately available to be audited.
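
To make the audit mechanics concrete, here is a minimal sketch that decomposes audit results into level noise and pattern noise. The ratings matrix is fabricated for illustration, and occasion noise is not separated out because measuring it requires the same assessor to judge the same case on multiple occasions:

```python
import numpy as np

ratings = np.array([          # rows = assessors, columns = cases (1-5 scale)
    [3, 4, 2, 5, 3],
    [4, 5, 3, 5, 4],          # a tougher assessor: consistently higher
    [2, 4, 2, 4, 3],
    [3, 3, 1, 5, 2],
])

case_means = ratings.mean(axis=0)   # consensus judgment per case
deviations = ratings - case_means   # each assessor's departure from consensus

# System noise: average variance across assessors for the same case.
system_noise_var = deviations.var(axis=0).mean()

# Level noise: variance of the assessors' overall postures (their mean departures).
level_noise_var = deviations.mean(axis=1).var()

# Pattern noise: what remains after removing each assessor's level (in a
# single-occasion audit, this bundles stable pattern and occasion noise).
pattern_noise_var = system_noise_var - level_noise_var

print(f"system noise^2 = {system_noise_var:.2f} "
      f"= level^2 {level_noise_var:.2f} + pattern^2 {pattern_noise_var:.2f}")
```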


The following Noise Audit framework (5) is appropriate for evaluating the Compliance Risk Assessment process and may be generalized to audit most processes involving judgmental decisions. Promontory Financial Group offers Noise Audit and compliance expertise, providing expert talent and third-party independence.

Based on the results, CRA operating changes may be made to reduce noise creation, and/or increased quantification may be implemented to reduce noise impact. Longer term, it is important to perform regular Noise Audits. These will help you understand actual noise performance as the CRA changes. The effectiveness of the next two solutions may be measured, in part, by comparing Noise Audits performed before and after implementation. An investment in Noise Audits will help you measure the effectiveness of your CRA transformation journey.


Decision solution 2 - Quantification of the CRA decision

CRA Quantification is an ongoing transformational process. Consider implementing an initial, simple version as a starting point to your longer-term transformation roadmap. Based on the prior "How a quantitative CRA reduces noise" section, even a 25% quantification of total Information Value will significantly reduce potential noise. The following are guiding principles to keep in mind for your CRA Quantification project:

  • Standard data set with baseline ratings calculated based on quantitative measures

  • Implemented for all Assessment Entities

  • Qualitative overlay based on clearly articulated guidelines

  • Outputs must be fit for purpose

  • Incorporate obligation and control data mentioned in the prior "CRA infrastructure foundation" section

  • Keep it simple; resist getting bogged down in unnecessary complexity. A 25% information value solution is an initial win!

The following is a suggested starting point. It is a simple approach that leverages your foundation and your technology stack. If you lack some infrastructure components, it is recommended that you start the transformation and build your infrastructure as you build your CRA Quantification capability.
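
To illustrate, here is a minimal sketch of what a baseline-plus-overlay computation might look like. The weights, scales, cap, and field names are assumptions for illustration only, not a prescribed methodology:

```python
def baseline_rating(obligation_risks: list[float],
                    control_scores: list[float]) -> float:
    """Quantitative baseline from standard data: average inherent risk of the
    mapped obligations, discounted by average control effectiveness."""
    inherent = sum(obligation_risks) / len(obligation_risks)   # 0-100 scale
    effectiveness = sum(control_scores) / len(control_scores)  # 0.0-1.0
    return inherent * (1.0 - effectiveness)

def final_rating(baseline: float, overlay: float, cap: float = 10.0) -> float:
    """Apply the assessor's qualitative overlay, bounded by a documented cap
    so judgmental adjustments stay anchored to the quantitative baseline."""
    overlay = max(-cap, min(cap, overlay))
    return max(0.0, min(100.0, baseline + overlay))

base = baseline_rating(obligation_risks=[70, 85, 60], control_scores=[0.6, 0.7])
print(final_rating(base, overlay=+6.0))  # overlay should cite guideline rationale
```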



Decision solution 3 - Cognitive Risk and Controls

Many financial institutions have large populations of risk and control data with quality issues that inhibit effective management and compromise the efficiency of testing activities. Common challenges include:

  • Incomplete identification of risks within processes and controls in place to mitigate risks

  • Inconsistent ratings of risk severity and control strength

  • Insufficient front-line ownership of the process, risk, control, and testing data

  • Uncertainty regarding the accuracy of taxonomy assignments

  • Poor quality control descriptions, which impede effective testing

  • Inaccurate indicators of whether controls are manual or automated, preventive or detective

  • No distinction between key and non-key controls

  • Unclear descriptions of testing activities, which often resemble the descriptions of controls

Financial Services firms may enhance their risk and control datasets through the combined application of risk and compliance expertise and cognitive technologies. Promontory Financial Group, along with IBM, provides subject matter expertise and a scalable cognitive platform for solution implementation.
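
To give a flavor of what such cognitive screening does, here is a toy sketch that flags weak control descriptions and mismatched manual/automated indicators. Real platforms use trained NLP models; these keyword heuristics and thresholds are assumptions for illustration only:

```python
import re

AUTOMATED_HINTS = re.compile(r"\b(system|automat|batch|workflow|script)", re.I)
MANUAL_HINTS = re.compile(r"\b(review|approv|sign|spreadsheet|email)", re.I)

def screen_control(description: str, flagged_as_automated: bool) -> list[str]:
    """Return findings for a single control record."""
    findings = []
    if len(description.split()) < 10:
        findings.append("description too short to test against")
    looks_automated = bool(AUTOMATED_HINTS.search(description))
    looks_manual = bool(MANUAL_HINTS.search(description))
    if flagged_as_automated and looks_manual and not looks_automated:
        findings.append("flagged automated but reads as a manual control")
    return findings

print(screen_control("Manager reviews and approves the exception spreadsheet "
                     "each month and signs off by email.",
                     flagged_as_automated=True))
```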

 

Conclusion

Precision and accuracy are vital to the quality of scientific research, which depends on collected data, and to the business activities that monitor quality across a company's products and processes. There are subtle differences between bias and noise in judgmental decision-making, and noise is likely the bigger concern for financial services decision-making leadership. We presented financial services examples and a compliance case study. Finally, we provided three actionable solutions to help you decrease noise, increase precision, and dramatically improve your decision quality.


Notes

(1) In the book Noise, referenced earlier, the authors offer a helpful explanation of the difference between bias and noise, and of why noise is harder to detect:

"Bias has a kind of explanatory charisma, which noise lacks. If we try to explain, in hindsight, why a particular decision was wrong, we will easily find bias and never find noise. Only a statistical view of the world enables us to see noise, but that view does not come naturally—we prefer causal stories. The absence of statistical thinking from our intuitions is one reason that noise receives so much less attention than bias does."

(2) Another limitation of QC: rarely is it an outcome-independent process, meaning the checker is usually aware of the maker's original decision. This lack of independence may lead to confirmation bias and an error known as a bias cascade.


(3) Potential noise is not the same as actual noise. Potential noise is the total noise (decision variability) that could happen. Naturally, your current judgmental CRA process may have controls to reduce noise. The point is, why expose yourself to potential noise when small increases in CRA quantification will significantly reduce it?

Conceptually, per the underlying model represented in the graphic, noise increases with every human judgment made to increase information value. The model assumes there are 100 separate judgments; even small judgments count. While your process may have more or fewer separate judgments, the point is to demonstrate the relative difference when more quantitative judgments are added. The relative differences are consistent regardless of the number of separate judgments. Also, noise is subject to sensitivity to initial conditions; that is, it is better to use quantitative judgment earlier in the decision process to reduce noise leverage.


(4) Decision anchoring is related to anchoring bias, one of humanity's many cognitive biases. By the way, individual cognitive bias should not be confused with the system-level bias and accuracy that are the subject of this article; they are two subtle but different applications of the word "bias." Individual cognitive bias, as a system input, may very well cause system noise.


(5) Noise Audit sources:

Noise: A Flaw in Human Judgment by Daniel Kahneman, Olivier Sibony, Cass R. Sunstein


Afterthoughts

10/24/20

Silver mentions in The Signal and the Noise:


"Financial crises--and most other failures of prediction--stem from this false sense of confidence. Precise forecasts masquerade as accurate ones, and some of us get fooled and double-down our bets."


Discussed in Buonomano's Your Brain Is a Time Machine:


Related to human circadian rhythms and our natural wake time: circadian rhythms are precise in that they run consistently at about 23.5 hours. But their lack of accuracy relative to the 24-hour clock is what causes us (at least me) to need a "catch-up" sleep every week or so. Perhaps this is why we have weekends?!


11/5/20



I think of Accuracy as being related to the population specification itself. A well-specified population should lead to an accurate inference from the sample. Thus the sample result will be closer to the bull's eye.


I think of Precision as being related to the tolerance rate and confidence level. The tighter the tolerance rate and the higher the confidence level, the more precise the sample inference will be; the sample will have smaller potential inference volatility.
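
To make this concrete, the standard sample-size formula for a proportion ties precision (tolerance and confidence) to the required sample size. The inputs below are illustrative:

```python
import math

Z = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}  # two-sided normal critical values

def sample_size(confidence: float, tolerance: float, p: float = 0.5) -> int:
    """n = z^2 * p(1-p) / tolerance^2; p = 0.5 is the conservative worst case."""
    z = Z[confidence]
    return math.ceil(z ** 2 * p * (1 - p) / tolerance ** 2)

print(sample_size(confidence=0.95, tolerance=0.05))  # 385
print(sample_size(confidence=0.99, tolerance=0.02))  # 4148: higher precision costs more
```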

