Money, Math, and Mistakes: Training Sophisticated Neural Networks for the Realities of Banking
- Jeff Hulett

Introduction: Bridging the Gap at JMU
This narrative is based on a session designed for undergraduate students in the James Madison University Business Analytics program. As you begin your journey into the world of data science, it is easy to focus entirely on the "how"—the code, the math, and the algorithms.
However, in the world of banking, the "why" and the "where" are just as important. This session is intended to provide you with the professional context for how neural networks and related predictive modeling are actually deployed in a highly regulated industry. We will explore the "translation gap" between a laboratory model and a real-world lending decision, where practical constraints like law, ethics, and governance often dictate the math we are allowed to use.
The Setup: The "Time of Decision"
Imagine you are a bank executive deciding whether to approve a loan. At the Time of Decision, you don’t know the future, but you have a folder full of historical data—these are your Independent Variables or Features (e.g., credit utilization, payment history, and income).
The outcome you want to predict (Did they default or not?) is the Dependent Variable. In our historical data, this is recorded as a 1 (Default) or 0 (Paid in full). Our goal is to tune a model that looks at those features and gets as close to that 1 or 0 as possible.
Phase 1: The Mechanics of Learning
1. The Forward Pass (The Weighting Game)
When we start training a neural network, the model begins with a "best guess." It assigns a numerical Weight to every feature. Think of a weight as the Importance Score or Sensitivity of a specific piece of data.
In the Forward Pass, the model takes a customer’s data and multiplies each piece of information by its assigned weight.
For example, if the model thinks "Past Delinquencies" is a huge red flag, it will assign that feature a very high weight.
If it thinks "Favorite Color" is irrelevant, it assigns a weight of zero.
The model sums up all these weighted inputs to calculate a single prediction. The goal of this calculation is to see how close the current weights can get to correctly predicting the Dependent Variable (the 1 or the 0). On the first try, the weights are often random, so the model might predict a 0.50 (a coin flip) for someone who actually defaulted.
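Here is a minimal sketch of that weighted-sum idea in Python. The feature values, weights, and the use of a sigmoid "squashing" function are illustrative assumptions, not the bank's actual model:

```python
import numpy as np

# Illustrative applicant features, scaled 0-1 (all values invented):
# [credit_utilization, past_delinquencies, income_score]
features = np.array([0.85, 0.60, 0.40])

# Starting weights -- essentially random guesses at each feature's importance
weights = np.array([0.10, -0.05, 0.20])
bias = 0.0

def forward_pass(x, w, b):
    """Multiply each feature by its weight, sum them, and squash to a 0-1 probability."""
    raw_score = np.dot(x, w) + b           # the weighted sum
    return 1 / (1 + np.exp(-raw_score))    # sigmoid: maps any number into (0, 1)

prediction = forward_pass(features, weights, bias)
print(f"Predicted probability of default: {prediction:.2f}")  # ~0.53 -- close to a coin flip
```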
2. The Loss Function (The Scorecard)
We compare that prediction to reality. If the model predicted a 20% chance of default, but the customer actually defaulted (a 1), our model was wrong. We use a Loss Function, such as Mean Squared Error (MSE), to quantify exactly how far off we were. Think of the "Loss" as the financial penalty the bank pays for being wrong.
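In code, that scorecard is a one-liner. A sketch using the 20%-prediction example above (this is the squared error for a single loan; averaging it across the portfolio gives the MSE):

```python
def squared_error(y_true, y_pred):
    """Squared distance between reality (0 or 1) and the model's prediction.
    Averaging this over every loan in the portfolio gives the MSE."""
    return (y_true - y_pred) ** 2

actual = 1.0       # the customer defaulted
predicted = 0.20   # the model said "20% chance of default"
print(squared_error(actual, predicted))  # 0.64 -- a large penalty for a bad miss
```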
3. Backpropagation (The Post-Mortem)
Once we know the error, we have to figure out which weights are to blame.
Backpropagation is like a bank manager conducting a post-mortem on a bad loan. We work backward from the error at the end of the process, through the layers of the network, to the beginning.
Using a calculus tool called the Chain Rule, we calculate exactly how much each weight contributed to the final error.
If the "Income" weight was too low, causing us to miss a default risk, Backpropagation calculates exactly how much to nudge that "dial" to improve the next guess.
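A sketch of that chain-rule bookkeeping for the single-neuron model from the forward-pass example. The numbers are invented; the point is that the gradient produces one "blame" number per weight, telling us which direction to nudge each dial:

```python
import numpy as np

features = np.array([0.85, 0.60, 0.40])   # same illustrative applicant as before
weights = np.array([0.10, -0.05, 0.20])
y_true = 1.0                               # this customer defaulted

# Forward pass, as before
raw = np.dot(features, weights)
y_pred = 1 / (1 + np.exp(-raw))

# Chain Rule, link by link:
dloss_dpred = 2 * (y_pred - y_true)    # how the squared error moves with the prediction
dpred_draw = y_pred * (1 - y_pred)     # the sigmoid's slope at the current raw score
draw_dweight = features                # the raw score moves with each weight by that feature's value

gradient = dloss_dpred * dpred_draw * draw_dweight
print(gradient)  # one "blame" number per weight; a negative gradient means "turn this dial up"
```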
4. Iteration (The Learning Loop)
This is an iterative process. The model performs a forward pass, measures the error, and uses backpropagation to adjust the weights. It repeats this thousands of times across your entire historical portfolio until the weights are tuned so precisely that the error can get no lower.
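Assembled into a loop over an invented toy portfolio, the whole learning process looks something like this (a sketch, not production training code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "historical portfolio": 200 invented loans, 3 features, known 0/1 outcomes.
X = rng.random((200, 3))
y = (X[:, 0] > 0.7).astype(float)   # pretend high utilization (column 0) drives default

w = rng.normal(scale=0.1, size=3)   # random starting weights
b = 0.0
learning_rate = 0.5                 # how big each "nudge" is

for epoch in range(2000):
    pred = 1 / (1 + np.exp(-(X @ w + b)))        # forward pass on every loan
    blame = (pred - y) * pred * (1 - pred)       # backprop through the loss and the sigmoid
    w -= learning_rate * (X.T @ blame) / len(y)  # nudge each weight against its gradient
    b -= learning_rate * blame.mean()

print("Learned weights:", w.round(2))  # the utilization weight should end up largest
```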
Phase 2: Practical Constraints (When Math Meets Reality)
In a textbook, the goal is the lowest error possible. In a bank, we must balance that error with four critical real-world constraints:
The Overfitting Trap: Because loan defaults happen far into the future and data can be highly correlated, a model might "memorize" the past rather than "predict" the future. We often use Model Intuition to simplify a model or "dampen" certain weights, even if it means giving up some precision on paper. We would rather have a model that fits long-term reality than one that perfectly mimics a specific historical dataset. (A minimal "dampening" sketch follows this list.)

Consumer Protection & Fair Lending: Federal law prohibits making credit decisions based on "Protected Classes" (e.g., race, gender, sexual orientation, or age). We must strictly eliminate any variables—or "proxies" for those variables—that are impermissible under US law, regardless of how much "predictive power" they might have.
Explainability (Regulation B): If a bank declines a loan, we are legally required to provide an Adverse Action Notice. This means the model cannot be a "black box." The weights must be clear and explainable so we can tell a customer exactly why they were declined (e.g., "Your credit utilization was too high"). (A minimal reason-code sketch also follows this list.)
Model Governance: Banking law requires rigorous validation to ensure accuracy and stability. If a model is "unstable" (meaning its weights swing wildly with small data changes), it will fail governance. Unstable models create massive regulatory costs, so we prioritize steady, reliable models over complex, temperamental ones.
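One standard way to "dampen" weights is regularization: charge the model a penalty for large weights so that only strong, persistent signals survive. A minimal sketch; the penalty strength lam is an invented dial the modeler tunes:

```python
import numpy as np

def regularized_loss(y_true, y_pred, weights, lam=0.1):
    """MSE plus an L2 penalty: big weights now cost something, which
    discourages the model from memorizing historical quirks."""
    mse = np.mean((y_true - y_pred) ** 2)
    penalty = lam * np.sum(weights ** 2)   # higher lam -> simpler, more "dampened" model
    return mse + penalty

y_true, y_pred = np.array([1.0, 0.0]), np.array([0.6, 0.2])
print(regularized_loss(y_true, y_pred, np.array([0.5, 0.1])))  # modest weights: cheap
print(regularized_loss(y_true, y_pred, np.array([3.0, 2.5])))  # same fit, heavy penalty
```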
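And because the weights in a linear scorecard are transparent, a first-pass adverse-action reason can be generated by ranking each feature's weighted contribution. A hypothetical sketch; the feature names, weights, and message are invented, not actual Regulation B language:

```python
# Invented model weights and applicant data (a larger contribution = more risk)
weights = {"credit_utilization": 2.1, "past_delinquencies": 1.4, "income_score": -0.8}
applicant = {"credit_utilization": 0.92, "past_delinquencies": 0.50, "income_score": 0.30}

# Each feature's contribution to the risk score is simply weight * value
contributions = {name: weights[name] * applicant[name] for name in weights}

# The largest positive contributor becomes the leading reason for decline
top_reason = max(contributions, key=contributions.get)
print(f"Adverse Action reason: {top_reason}")   # -> credit_utilization (2.1 * 0.92 = 1.93)
# i.e., "Your credit utilization was too high."
```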
Phase 3: The FICO® Standard & Human "Backprop"
Before neural networks, there was the FICO Score.
The Tech: It is primarily an "old school" Logistic Regression model.
The Human Element: For 30+ years, human analysts have been "feature engineering"—manually tuning the weights of data from the three bureaus (TransUnion, Equifax, Experian).
Systemic Discrimination: Historically, FICO was criticized for favoring "banked" individuals (traditionally white populations). Today, newer score versions incorporate non-bank data (e.g., rent and utility payments) to reach minority and immigrant borrowers who were previously invisible to the system.
Phase 4: The Evolution of Agency (Human vs. AI)
As modeling matures, we must decide how much Agency (decision-making power) to delegate to the machine. This is about having "Skin in the Game."
1st Gen: Manual Learning (Human High): The bedrock of banking. Analysts use Logistic Regression or Survival Analysis to define the "map." High human agency ensures high attention to detail and "skin in the game."
2nd Gen: Supervised Learning (Balanced): Uses "Ensemble" methods like XGBoost. The machine finds the path, but the human enforces constraints to ensure conceptual soundness (see the monotone-constraint sketch after this list).
3rd Gen: Deep Learning (AI High): Neural Networks identify complex, non-linear patterns in unstructured data—the gold standard for Fraud Detection. Here, the human acts as an Auditor and investigator for false negatives.
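As a concrete example of the human enforcing constraints in that 2nd-generation world: XGBoost lets the modeler declare that predicted risk may only move in one direction as a feature moves, via its monotone_constraints parameter. A sketch with invented data (assumes the xgboost package is installed; not a production credit model):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
X = rng.random((500, 2))            # columns: [credit_utilization, years_at_job]
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0.7).astype(int)

# The human's "conceptual soundness" rule, encoded as a constraint:
#   1 -> predicted default risk may only rise as utilization rises
#   0 -> no constraint on years_at_job
model = xgb.XGBClassifier(monotone_constraints="(1,0)", n_estimators=50)
model.fit(X, y)
```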
The Banker’s Caveat: AI is excellent at predicting based on the recent past, but poor at predicting the long-term future. Over time, prediction quality decays and volatility increases. Because the distant future is fundamentally unknowable (think 2008 or COVID-19), we cannot rely on algorithms alone. In banking, Automation without Intuition is a recipe for systemic risk.
Summary Table for Students
| Data Science Term | Banking Analogy |
| --- | --- |
| Weights | The Importance assigned to each piece of customer data (e.g., how much the bank "weights" a late payment). |
| Forward Pass | Making the Loan Decision: Multiplying data by weights to predict, "Will they default?" |
| Loss Function | The Financial Scorecard: The "Penalty" or dollar-cost of making a wrong guess. |
| Backpropagation | The "Blame Game": Working backward to find which weights to adjust to lower future error. |
| Explainability | Regulation B: The legal requirement to tell a customer "Why" they were declined. |
| Overfitting | Memorizing the Past: When a model mistakes a historical coincidence (like "Tuesdays") for a real trend. |
| Agency | "Skin in the Game": The level of delegation and accountability between the Human and the Machine. |
The "Junior Credit Officer" Thought Experiment
The Scenario: You have just been hired as a Junior Credit Officer. Your boss gives you a pile of 1,000 past loan applications. Some are marked with a Red Stamp (Defaulted) and some with a Green Stamp (Paid Back).
Your job is to create a "Secret Formula" (a Model) to predict future applicants. You have two main pieces of information (Features) for every person:
Credit Utilization: How much of their credit card limit they currently use.
Years at Current Job: How long they’ve been with their employer.
Step 1: The First Guess (Forward Pass)
Without looking at the folders yet, you decide both factors are equally important.
Your Weights: You give a weight of 0.5 to Credit Utilization and 0.5 to Job Stability.
The Prediction: You pick up a folder. The applicant has high credit use but has been at their job for 20 years. Your formula averages these out and predicts: "50% chance of default."
Step 2: The Reality Check (Loss)
You open the folder. It has a Red Stamp. They defaulted.
The Error: Your 50% guess was far away from the reality of 100% default. The bank just lost $10,000. This is your Loss.
Step 3: The Adjustment (Backpropagation)
You look at the next 100 folders. You notice a pattern: People with 20 years on the job are defaulting just as often as people with 1 year on the job, but everyone with high credit utilization is defaulting.
The Logic: You realize your "Job Stability" weight is useless, and your "Credit Utilization" weight is too low.
The Action: You "Backpropagate" this error. You turn the dial down on the Job weight and turn the dial up on the Utilization weight.
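The entire thought experiment fits in a few lines of code. Every number below comes from the story above, with the loss measured as squared error rather than dollars:

```python
# Step 1: The First Guess (Forward Pass)
# Encode each feature as a 0-1 risk signal: high credit use = 1.0,
# while 20 years on the job "looks safe" = 0.0
weights = {"utilization": 0.5, "job_stability": 0.5}
risk_signals = {"utilization": 1.0, "job_stability": 0.0}

prediction = sum(weights[f] * risk_signals[f] for f in weights)
print(prediction)   # 0.5 -- the "50% chance of default" coin flip

# Step 2: The Reality Check (Loss)
actual = 1.0        # Red Stamp: they defaulted
print((actual - prediction) ** 2)   # 0.25 -- the penalty for the miss

# Step 3: The Adjustment (Backpropagation, done by hand)
# The folders show job stability carries no signal, so turn that dial
# down and turn the utilization dial up
weights = {"utilization": 0.9, "job_stability": 0.1}
prediction = sum(weights[f] * risk_signals[f] for f in weights)
print(prediction)   # 0.9 -- much closer to the reality of 1.0
```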
The "Class Discussion" Questions
Q1: The Mathematical Nudge
If you see that your model is consistently underestimating the risk of default, in which direction does Backpropagation move the weights? Does it make them larger or smaller?
Q2: The "Overfit" Trap
Suppose you notice that every person who defaulted also happened to apply for the loan on a Tuesday. Your neural network discovers this and gives "Tuesday" a very high weight.
As a banker, do you keep that weight in your model? Why might this be "accurate" for your 1,000 folders but "wrong" for the future?
Q3: The Ethics Dilemma (Fair Lending)
You find a variable that is a near-perfect (99%) predictor of default, but it is a "proxy" for a protected class (like the neighborhood someone lives in).
If you remove this highly accurate variable to comply with the law, what happens to your "Loss" (error) in the short term? Why is this a trade-off the bank is willing to make?
Q4: The FICO Strategy (Arbitrage)
How do banks actually use the FICO score? Do they treat it as the final word, or is it just one tool in the box? As a banker, do you consider FICO the yes/no score that makes the loan decision, or do you treat it as a competitive indicator and rely on your internal model to identify where the competition is mispricing risk?
Summary for the Students
By the end of this exercise, the students should realize that:
Backprop is just the model "learning from its mistakes" by adjusting the importance of variables.
Weights are the knobs and dials that define the bank’s strategy.
Banking Reality means sometimes we turn a dial to "Zero" (like the Tuesday or the Neighborhood example) even if the math says it's predictive, because we value stability and fairness over raw accuracy.
Resources For The Curious
1. Visualizing the Math (The "Backprop" Engine)
If you want to see the Chain Rule in action without getting lost in a textbook, these are the gold standard:
3Blue1Brown – Neural Networks Playlist: Grant Sanderson provides the most intuitive visual explanation of how backpropagation actually "nudges" weights. Watch the video “But what is a neural network?” and its sequels on backprop.
StatQuest with Josh Starmer: For a clear breakdown of Logistic Regression (1st Gen) and XGBoost (2nd Gen), Josh’s "triple-bam" explanations are unbeatable for undergraduates.
2. The Banking Industry & FICO History
To understand the "Human Backprop" and the history of credit:
FICO’s Official Blog: Search for their posts on "Explainable AI" (XAI). They discuss exactly how they maintain transparency while trying to incorporate modern machine learning.
"The Information-Based Strategy" (Capital One's Origin): Research the history of Capital One’s founding. They were the first to treat credit cards as a big-data problem rather than just a banking problem, effectively inventing the "Arbitrage" strategy mentioned in Q4.
3. Ethics, Bias, and Systemic Risk
Since banking is as much about society as it is about math:
"Weapons of Math Destruction" by Cathy O’Neil: A must-read for any data scientist. She explores how algorithms—especially in lending—can reinforce systemic discrimination if the data scientists aren't careful.
The CFPB (Consumer Financial Protection Bureau): Look up their reports on "Algorithmic Bias." It will show you the real-world regulatory hurdles you will face when your model goes live.
4. Philosophy of Risk & Agency
To understand the "Skin in the Game" concept from Phase 4:
"Skin in the Game" by Nassim Nicholas Taleb: This book explores why the delegation of agency to systems (like AI) without human accountability can lead to "Black Swan" events like the 2008 financial crisis.
"The Signal and the Noise" by Nate Silver: Specifically, the chapter on the 2008 housing bubble, which explains why models based on the "knowable past" failed to account for a changing future.
A Final Note to Students
As you move forward in the Business Analytics program, remember: The model is a tool, not a decision-maker. Your value as a future analyst isn't just in your ability to write the code—it’s in your ability to look at a model’s output and ask: "Is this fair? Is this stable? And do I have the intuition to know when the machine is wrong?"

