Nick Sime, Director of Fraud & Credit Risk Modelling

28 April 2026

Combined vs bespoke models in credit risk: Does segmentation still add value?

In credit risk modelling, segmentation has long been a standard approach. Segmenting customers into groups (new vs existing, prime vs sub-prime) has often been seen as the best way to improve model performance. You can understand the logic. Different populations behave differently, so modelling them separately should produce better results. 

This approach made sense when linear models dominated. But with the increased use of machine learning techniques like GBMs and DNNs, it’s worth revisiting that assumption.

We’ve tested this extensively and we can confidently say that, in the world of machine learning, segmentation is not the right starting point. In most cases, a single ML model will outperform a set of bespoke models built on individual segments.

In this article, we look at why that is, and why the process for the development of credit risk models needs to change.

Why segmentation was used in traditional credit risk models

When you think about it, segmentation largely came from the limitations of linear models. A standard logistic regression applies a single set of relationships across the entire population. It doesn’t naturally capture more complex interactions, such as how the impact of one variable might change depending on another.

Segmentation was a way around that (though cumbersome).

By splitting the population into groups, you allow those relationships to differ between segments. In effect, you’re enabling linear models to capture some non-linearity, though this is limited to patterns associated with the chosen segmentation.

In many cases, this led to a modest improvement at development.
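As a deliberately extreme sketch of that mechanism, the following toy example (synthetic data, scikit-learn assumed available; all variable names are invented for illustration) builds one logistic regression on a combined population where the effect of a predictor flips between two segments, and then one per segment. The single model cannot represent the flipped relationship; the segmented models can.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 4000
seg = rng.integers(0, 2, n)  # e.g. 0 = new customers, 1 = existing customers
x = rng.normal(size=n)
# the effect of x on the outcome flips sign between the two segments
y = np.where(seg == 0, x > 0, x < 0).astype(int)

X = np.column_stack([x, seg])

# a single logistic regression across the whole population:
# the flipped effects cancel out, leaving almost no signal
combined = LogisticRegression().fit(X, y)
auc_combined = roc_auc_score(y, combined.predict_proba(X)[:, 1])

# one logistic regression per segment recovers the relationship in each group
auc_seg = []
for s in (0, 1):
    m = seg == s
    lr = LogisticRegression().fit(x[m].reshape(-1, 1), y[m])
    auc_seg.append(roc_auc_score(y[m], lr.predict_proba(x[m].reshape(-1, 1))[:, 1]))

print(f"combined AUC: {auc_combined:.2f}, segmented AUCs: {auc_seg}")
```

Real portfolios are nowhere near this clean, but the toy makes the point: segmentation is a workaround that lets a linear model express one specific interaction.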

But the big challenge was identifying the right segmentation in the first place. A common approach was to test multiple segmentation options, each requiring the development and evaluation of two or more models. This quickly became a time-consuming process.

With hundreds of potential variables available, there was also no certainty that the best segmentation had even been found.

For a long time, that was simply part of the process.

How machine learning models reduce the need for segmentation

With machine learning models, segmentation is generally not required, unless there are structural differences in the data across sub-populations. For instance, when we build GBMs or DNNs on a combined population, we often see performance that is at least as strong as, and usually notably better than, a set of separate models built on individual segments.

This comes down to how these models learn.

Rather than applying a single global relationship, ML models can capture interactions directly from the data. The model can learn that a variable behaves differently for different parts of the population without needing those groups to be defined in advance.

This means a combined model is already doing what segmentation was designed to achieve, except that it can capture all interactions, not just the non-linearity associated with the segmentation variable.

It is effectively learning across sub-populations at the same time, rather than treating them in isolation. We refer to this as cross-learning, which is one of the key strengths of machine learning.
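To sketch that idea, the hypothetical example below (synthetic data; scikit-learn's GradientBoostingClassifier stands in for any GBM implementation) trains a single GBM on a combined population where a predictor's effect flips between segments. The segment indicator is simply included as a feature, and the trees learn the interaction on their own, with no segmentation defined in advance.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

def make_data(n, rng):
    seg = rng.integers(0, 2, n)  # segment indicator, passed in as a feature
    x = rng.normal(size=n)
    # the effect of x flips sign between the two segments
    y = np.where(seg == 0, x > 0, x < 0).astype(int)
    return np.column_stack([x, seg]), y

X_train, y_train = make_data(4000, rng)
X_test, y_test = make_data(2000, rng)

# one combined model; the trees split on the segment feature and then on x,
# capturing the interaction that defeated the single linear model
gbm = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
auc = roc_auc_score(y_test, gbm.predict_proba(X_test)[:, 1])
print(f"combined GBM AUC: {auc:.2f}")
```

The design choice to pass the would-be segmentation variable in as an ordinary feature is what lets one model learn across sub-populations at once.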

ML models are much better at capturing non-linear relationships, and will outperform linear models regardless of whether segmentation is used. But while segmentation typically provides uplift in linear modelling, the reverse is true in ML: thanks to the power of cross-learning, a single ML model is enough, and attempts to use segmentation will likely be counterproductive.

This has been evident in our analysis over almost a decade. Some typical results are shown below:

[Figure: Combined vs bespoke models in credit risk, example uplift from segmented models compared to a combined model across multiple years.]

Why combined machine learning models can outperform segmented models

Essentially, the difference comes down to how the models use the data. Machine learning models are designed to capture interactions between variables as part of the training process. They don’t need those interactions to be defined upfront. If a relationship behaves differently across parts of the population, the model can learn that directly.

When you train a single model on a combined dataset, it has access to a much broader range of patterns. Behaviour observed in one part of the population can help inform predictions in another. That additional context can improve how well the model generalises.

With segmented models, that information is split across separate datasets. Each model only sees a subset of the data, which can limit what it learns, particularly where sample sizes are smaller.

This is why combined models perform so well. They benefit from both the scale and diversity of the full dataset, while still capturing the differences between sub-groups.
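A minimal sketch of the sample-size side of this argument, using scikit-learn on invented data: both segments below share the same underlying relationship, but one segment is small. A bespoke model sees only that segment's 150 rows, while the combined model borrows strength from the rest of the population when scoring it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_sample(n, rng):
    # three predictors with the same true relationship in every segment
    x = rng.normal(size=(n, 3))
    logits = 1.5 * x[:, 0] - 1.0 * x[:, 1]
    y = (logits + rng.normal(size=n) > 0).astype(int)
    return x, y

x_large, y_large = make_sample(5000, rng)  # large segment
x_small, y_small = make_sample(150, rng)   # small segment, same behaviour
x_test, y_test = make_sample(2000, rng)    # held-out sample

# bespoke model: trained only on the small segment's 150 rows
bespoke = GradientBoostingClassifier(random_state=0).fit(x_small, y_small)

# combined model: learns the shared relationship from the full dataset
combined = GradientBoostingClassifier(random_state=0).fit(
    np.vstack([x_large, x_small]), np.concatenate([y_large, y_small]))

auc_bespoke = roc_auc_score(y_test, bespoke.predict_proba(x_test)[:, 1])
auc_combined = roc_auc_score(y_test, combined.predict_proba(x_test)[:, 1])
print(f"bespoke AUC: {auc_bespoke:.3f}, combined AUC: {auc_combined:.3f}")
```

The gap will vary with the data, but the small-sample model has far less evidence to learn from, which is exactly the limitation described above.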

📚Related reading: Machine learning model stability: Do Gradient Boosting Machines (GBMs) and Deep Neural Networks (DNNs) really degrade faster?

More in the series: Do Gradient Boosting Machines (GBMs) and Deep Neural Networks (DNNs) really degrade over time? Sample size and model choice

Combined vs segmented models: what this means for credit risk teams

For many credit risk teams, segmentation has been the standard starting point. The assumption is that splitting the population will improve performance, so it’s built into the modelling process from the outset.

But what we’re seeing is that this isn’t necessary.

A combined model is the more effective approach for ML models, as it allows the model to learn from the full dataset and capture interactions directly.

Fewer models also mean less to maintain, monitor, and govern. Validation becomes simpler, implementation is more straightforward, and ongoing performance tracking is easier to manage.

📚Related reading: Sample size and model choice: When GBMs outperform DNNs in credit risk

Sample design in machine learning: does population diversity improve model performance?

Traditional linear models rely heavily on stable, predefined relationships between variables, not to mention assumptions about the independence of model inputs. When the underlying population drifts or the correlation structure shifts, performance can degrade quickly.

This is why there has historically been such a strong focus on population stability, with metrics like PSI used to track how closely live data matches the development sample. In turn, this has driven an emphasis on building models using historical data that is expected to reflect future business as closely as possible.
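For readers less familiar with the metric, a common way to compute PSI for a single variable is sketched below in plain numpy (the binning choice and the small floor on empty bins are conventions, not a fixed standard; the data is synthetic).

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a development (expected)
    and live (actual) sample of one variable."""
    # bin edges from the development sample's quantiles
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, edges)[0] / len(expected)
    a_pct = np.histogram(actual, edges)[0] / len(actual)
    # floor empty bins to avoid division by zero in the log
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
dev = rng.normal(0, 1, 10_000)
psi_stable = psi(dev, rng.normal(0, 1, 10_000))    # same distribution
psi_shift = psi(dev, rng.normal(0.5, 1, 10_000))   # shifted population
print(f"stable: {psi_stable:.3f}, shifted: {psi_shift:.3f}")
```

A PSI near zero indicates the live sample closely matches development, while larger values flag the kind of population drift the text describes.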

With machine learning models, the picture is quite different.

Because ML models can learn more flexible relationships, they are often able to handle a wider range of variation in the data. In some cases, including a broader mix of customer types, products, or time periods can actually improve how well the model generalises.

Rather than learning from a narrow, tightly defined sample, the model is exposed to more varied behaviour. That can make it more resilient when conditions or the live population change.

This doesn’t remove the need for monitoring. Population stability and performance tracking still matter. But it does suggest that a perfectly matched sample is not always the most useful objective when developing machine learning models.

Final thought: why segmentation is no longer needed

Segmentation has been a useful approach for a long time when working with linear models. But with machine learning, a combined model should give optimal performance. Segmentation is unlikely to improve results and will often make them worse, unless there are clear structural inconsistencies within the development data. 

This is a fundamental change.

With linear models, achieving acceptable performance often meant building and maintaining multiple segmented models, with all the associated effort and complexity, and still a weaker result than a single ML model.

With machine learning, that work is no longer needed. A single combined model is consistently stronger and much simpler to manage.

Archetype makes this easy to implement in practice. Models can be built and compared side by side using a consistent framework, without the need for complex coding, so teams can focus on outcomes rather than process. 

The bottom line is that segmentation is time-consuming to design, build, and maintain, and often delivers only limited incremental benefit. Machine learning models remove much of that complexity by learning these patterns directly from the data, allowing a single combined model to deliver stronger and more efficient outcomes.

If you’re exploring how to simplify your modelling approach, the easiest way to see the difference is to test it.

Archetype allows teams to build and compare combined and segmented models within the same framework, so you can quickly see whether segmentation is adding value or just adding complexity.

If you’d like to see how that works in practice, we’d be happy to talk.
