The author

Nick Sime

Director of Fraud & Credit Risk Modelling

07 November 2024

14 lessons in predictive modelling to strengthen credit risk assessments

Modern credit risk management relies heavily on predictive modelling, which has come a long way from older methods. As lending gets more complex, companies using advanced AI and machine learning can better understand and manage their risks. 

Here, Nick Sime, Director of Fraud & Credit Risk Modelling, has put together 14 key tips from his experience to help risk managers use predictive modelling to make smarter, safer lending decisions.

1. Machine Learning models consistently outperform

Machine learning (ML) models consistently outperform traditional linear models on independent test samples. While the degree of improvement varies, we typically see a 10-15% uplift in Gini compared to redeveloped logistic regression models. For credit risk, this translates to a potential 20% reduction in the bad rate at the same cut-off.
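As a rough illustration of the metric, Gini can be computed as 2 × AUC − 1, where AUC is the probability that a randomly chosen bad account receives a higher risk score than a randomly chosen good one. The sketch below uses made-up scores and a hypothetical `gini` helper, and it reads "uplift" as a relative change, which the figures above do not specify. It is not Jaywing's implementation.

```python
from itertools import product


def gini(scores, is_bad):
    """Gini = 2 * AUC - 1, with AUC computed by pairwise comparison.

    scores: predicted risk (higher = riskier).
    is_bad: 1 for a bad account, 0 for a good account.
    """
    bads = [s for s, b in zip(scores, is_bad) if b == 1]
    goods = [s for s, b in zip(scores, is_bad) if b == 0]
    if not bads or not goods:
        raise ValueError("need at least one good and one bad account")
    # Count bad/good pairs that are ranked correctly; ties count as half.
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0
               for b, g in product(bads, goods))
    auc = wins / (len(bads) * len(goods))
    return 2 * auc - 1


# Illustrative data only: the same eight accounts scored by two models.
is_bad = [1, 1, 1, 0, 0, 0, 0, 0]
linear_scores = [0.9, 0.4, 0.2, 0.5, 0.3, 0.25, 0.1, 0.05]
ml_scores = [0.9, 0.6, 0.2, 0.5, 0.3, 0.25, 0.1, 0.05]

gini_linear = gini(linear_scores, is_bad)
gini_ml = gini(ml_scores, is_bad)
relative_uplift = (gini_ml - gini_linear) / gini_linear
```

The pairwise version is easy to read but slow on large samples; a rank-based AUC calculation gives the same answer far more cheaply.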

2. Sample size matters

The larger the sample, the more ML models can identify complex, non-linear patterns, resulting in a performance boost. However, material improvements are still achievable even with smaller, low-default portfolios.

3. The optimal number of features: 40-60

Bureau data is becoming more complex as Credit Reference Agencies use additional data sources and derive trended variables. This presents a data reduction challenge to modellers. On top of this, models with an excessive number of variables add overhead for deployment and monitoring. Our experience shows that near-optimal performance in credit score developments can be obtained with 40-60 variables.

4. Some overfitting is necessary

Overfitting is often seen as negative, but ML models thrive on capturing nuanced patterns, and overly strict overfitting control can reduce a model’s predictive power. Our research also shows, however, that models with excessive overfitting degrade faster. A balanced approach is therefore needed to optimise performance and long-term stability in a live environment.

5. Explainability constraints are not a barrier 

To support model explainability, monotonicity and ranking constraints are applied ‘up front’ in the design of our models. This ensures that the marginal impact of input variables is consistent with business expectations. While some fear this may reduce performance, we find that it has negligible, if any, adverse impact. In fact, it can even benefit model stability over time.
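One simple way to sanity-check that a fitted model respects a monotonic expectation is to sweep a single input while holding the others fixed and confirm the prediction never moves the wrong way. The sketch below does this for a hypothetical `toy_pd` function standing in for a real model. The feature names, coefficients and the `is_monotonic` helper are all illustrative assumptions. Note that this checks a constraint after the fact, whereas the approach described above imposes it up front in the model design.

```python
import math


def toy_pd(missed_payments, utilisation):
    """Hypothetical stand-in for a fitted model: probability of default."""
    z = -3.0 + 0.8 * missed_payments + 1.5 * utilisation
    return 1.0 / (1.0 + math.exp(-z))


def is_monotonic(predict, grid, fixed, feature, increasing=True, tol=1e-12):
    """Sweep `feature` over `grid` with other inputs held at `fixed`
    and check the prediction never moves in the wrong direction."""
    preds = [predict(**{**fixed, feature: value}) for value in sorted(grid)]
    steps = [b - a for a, b in zip(preds, preds[1:])]
    if increasing:
        return all(step >= -tol for step in steps)
    return all(step <= tol for step in steps)


# Business expectation: more missed payments should never lower predicted risk.
ok = is_monotonic(toy_pd, grid=range(0, 7),
                  fixed={"utilisation": 0.4}, feature="missed_payments")
```

A fuller check would repeat the sweep at many different settings of the other inputs, since a model with interactions can be monotonic at one point and not at another.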

6. Stability over time

Despite their complexity, ML models can be remarkably stable. Long-term analysis reveals that Jaywing’s Deep Learning models, developed using Archetype, degrade more slowly over time than traditional logistic regression models.

📕Further guidance: Read The Performance Stability of Advanced (AI/ML) Models VS Linear Models


7. One & done (Goodbye to segmented models)

In traditional modelling, developers often create segmented models to capture non-linear relationships. However, ML models naturally capture these trends, rendering segmented models unnecessary in most cases.

8. Reject inference needs special care

Scorecard developers typically build a known good/bad (KGB) model and an accept/reject (AR) model, then apply negative assumptions to rejects to create a dataset for a final model that removes selection bias. ML models are flexible enough to reverse engineer the inference applied to the declined cases in the sample. As a result, the final model's predictions for known cases are very similar to those of the KGB model, which negates the benefit of the inference process.

At Jaywing, we’ve developed methods to prevent this, so that reject inference remains effective when applied to ML models.
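For readers less familiar with the traditional inference step, one common textbook form is fuzzy augmentation: each reject is scored with the KGB model, its bad odds are inflated by a "negative assumption" factor, and it enters the final development sample as one weighted bad record and one weighted good record. The sketch below shows that step only. The function name, the multiplier of 2.0 and the probabilities are illustrative assumptions, and this is not Jaywing's method for preventing the problem described above.

```python
def augment_rejects(reject_pds, odds_multiplier=2.0):
    """Turn each reject into a weighted bad record and a weighted good record.

    reject_pds: KGB-model probability of bad for each rejected applicant.
    odds_multiplier: assumed factor by which rejects' bad odds exceed what
    the KGB model implies (the 'negative assumption').
    """
    records = []
    for i, pd_kgb in enumerate(reject_pds):
        if not 0.0 < pd_kgb < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
        # Inflate the bad:good odds, then convert back to a probability.
        odds = pd_kgb / (1.0 - pd_kgb) * odds_multiplier
        pd_inferred = odds / (1.0 + odds)
        records.append({"reject_id": i, "bad": 1, "weight": pd_inferred})
        records.append({"reject_id": i, "bad": 0, "weight": 1.0 - pd_inferred})
    return records


# Illustrative KGB scores for three declined applicants.
inferred = augment_rejects([0.10, 0.25, 0.50])
```

Because the inferred outcomes are a deterministic function of the KGB score, a sufficiently flexible final model can learn that function back, which is the effect the section warns about.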

9. Cross-learning (More is more)

Traditional scorecard developers have placed great emphasis on ensuring that development samples reflect future expectations. Jaywing analysis shows that this is not always the best approach for advanced models: ML models can effectively cross-learn from adjacent data sources, resulting in more powerful models.

10. Hyperparameter tuning (Avoid complication)

Hyperparameters control the structure of the ML model and the learning process. Tuning them is often tackled with a grid search, which requires a model to be estimated for every hyperparameter combination. This presents a significant processing overhead, and the results of the various iterations are very similar. We recommend a Bayesian approach, which helps streamline the process and achieve optimal settings more efficiently.
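To make the processing overhead concrete, the sketch below counts the model fits that an exhaustive grid would require. The hyperparameter names, candidate values and fold count are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical search space for a neural network.
grid = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03],
    "hidden_layers": [1, 2, 3],
    "units_per_layer": [16, 32, 64, 128],
    "dropout": [0.0, 0.1, 0.2, 0.3],
    "l2_penalty": [0.0, 1e-4, 1e-3],
}

# Every combination of candidate values is one model to estimate.
combinations = list(product(*grid.values()))
n_fits_grid = len(combinations)

# Assumed cross-validation folds per combination.
cv_folds = 5
n_fits_grid_cv = n_fits_grid * cv_folds
```

Even this modest grid needs 576 fits, or 2,880 with five-fold cross-validation, and adding one more hyperparameter multiplies the total again. A Bayesian search instead uses the results of earlier trials to choose the next combination, so it is normally run with a fixed trial budget far smaller than the full grid.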

11. Continue to monitor

Monitoring is essential to detect any stability issues and to protect performance. With the greater number of inputs in ML models, dashboards can be invaluable for pinpointing areas that may need adjustment. Whilst monitoring will give you a strong indication that your model is sub-optimal, it will not tell you whether it is optimal.
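One widely used monitoring statistic is the Population Stability Index (PSI), which compares the distribution of scores (or of a single input) at development with the distribution seen recently. The article does not name specific metrics, so PSI, the band shares and the `psi` helper below are illustrative assumptions rather than a description of Jaywing's dashboards.

```python
import math


def psi(expected_shares, actual_shares, floor=1e-6):
    """Population Stability Index across matching bands.

    Each argument is a list of band proportions summing to 1.
    `floor` avoids log(0) when a band is empty.
    """
    if len(expected_shares) != len(actual_shares):
        raise ValueError("band counts must match")
    total = 0.0
    for e, a in zip(expected_shares, actual_shares):
        e, a = max(e, floor), max(a, floor)
        total += (a - e) * math.log(a / e)
    return total


# Illustrative share of accounts in each of five score bands.
development = [0.10, 0.20, 0.40, 0.20, 0.10]
recent = [0.05, 0.15, 0.40, 0.25, 0.15]

drift = psi(development, recent)
no_drift = psi(development, development)
```

A statistic like this flags that the population has moved. It says nothing about whether a re-optimised model would perform better, which is the limitation noted above.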

12. Always optimal

To maximise commercial benefits, Jaywing follows a philosophy of “always optimal” modelling. Periodically re-optimising models with fresh data allows us to capture any significant shifts while maintaining the existing model framework. Testing new data sources should be part of this process.

13. Deployment can be simple

While deployment can be challenging with legacy systems, Jaywing offers flexible deployment options, including APIs and local deployment solutions. We’ve developed streamlined methods to rapidly deploy within most major systems.

14. Domain knowledge is essential

While automation in model development is possible, domain expertise remains crucial. Involving experienced credit practitioners ensures that the model inputs are sensible and aligned with business needs, avoiding features that may be counter-intuitive or problematic.

By leveraging Jaywing’s award-winning ML modelling platform, Archetype, businesses can access cutting-edge neural network models without needing to code. Through a user-friendly interface, Archetype simplifies advanced ML for both fintechs and traditional lenders—giving those with the best models a clear competitive advantage.

Our approach has won eight industry awards over the last six years.


Whilst many of the early adopters of ML models were agile Fintechs, traditional banks and lenders are now showing increased interest. In a market where aggregators and brokers play such a key role, the alignment of risk and price is essential. Lenders with the most powerful models have a clear competitive advantage.

Taking action: Better risk management through prediction

Traditional methods can't keep up with today's lending challenges – but AI-powered models can spot hidden risks and opportunities that others miss. Smart lenders are already using data to make sharper decisions that protect and grow their businesses. And that’s where Archetype comes in. Our award-winning tool puts this power in your hands, with models that are both sophisticated and transparent.

Get in touch today to learn how Archetype could help you. 
