Insights from ADM Models¶
The predictor binning from ADM models can deliver valuable insights to marketing and the business, and provides transparency into the models.
Both the AGB and NB types of ADM models already provide overall predictor importance views like these:
[2]:
from pdstools import ADMDatamart, datasets
import polars as pl

dm = datasets.cdh_sample()
# replace this with your own datamart data, see the PDS Tools documentation for examples
# dm = ADMDatamart(
#     model_filename="...",
#     predictor_filename="...",
# )
fig = dm.plot.predictor_performance(top_n=10)
fig.update_layout(
    height=400, width=800, title="Feature Importance", xaxis_title="", yaxis_title=""
)
fig.show()
While helpful for understanding how a model works, a global perspective on the model features is of limited use in understanding who accepts what. For this type of insight we need to dive deeper into how the models partition the customers. The Pega ADM models can provide exactly this information.
Predictor Binning for Individual Models¶
The predictor binning for Bayesian ADM models is available in Pega Prediction Studio and also in the downloadable, offline model reports you can create with PDS Tools. These reports are available for both the legacy R and the newer Python versions of the toolkit.
You can create these reports directly from the PDS Tools app or work in an IDE; for more information see the PDS Tools documentation.
You can easily recreate all or parts of this in code. For example, to recreate the binning reports, use code like that shown below. You will need the model ID, which you can pick up from Prediction Studio or from a quick analysis with PDS Tools functions.
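For instance, one quick way to find a model ID is to look at the model data in the datamart directly. This is a minimal sketch, assuming the standard ADMDatamart model_data lazy frame with ModelID, Name and Channel columns:

# minimal sketch: list the available models (column names assume the standard datamart layout)
dm.model_data.select(["ModelID", "Name", "Channel"]).unique().collect()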
[3]:
fig = dm.plot.predictor_binning(
    model_id="08ca1302-9fc0-57bf-9031-d4179d400493",
    predictor_name="Customer.AnnualIncome",
)
fig.update_layout(height=400, width=700, xaxis_title="")
fig.show()
You can also view these reports in an alternative way, focusing on how each bin pushes the propensity above or below the average. This perspective on the “lift” of each bin is similar to the yellow line in the plot above but emphasizes it more.
[4]:
fig = dm.plot.binning_lift(
    model_id="08ca1302-9fc0-57bf-9031-d4179d400493",
    predictor_name="Customer.AnnualIncome",
)
fig.update_layout(height=400, width=700, xaxis_title="")
fig.show()
Rolling up individual binning¶
While this already gives more insight than global predictor importance, it is specific to one individual ADM instance, which means you may have to browse through hundreds of individual model reports to get a feel for how, for example, Income relates to Cards acceptance.
With the “BinAggregator” class in PDS Tools you can now “roll up” these binning views across actions and even across channels. For example, this could show that “people with income > 60000 and age > 30” are more likely to respond positively to Cards offers.
The rolling up is based on the predictor binning from ADM NB models. For a numeric predictor, we first create equi-distant or log-distance target bins, for example 10 bins from 20 to 85 for “Age”, or 10 bins on a log scale from 10k to 10m for “Income”.
Then the predictor binning of all models (in a certain group or issue, perhaps) is mapped onto those target bins, with the values attributed proportionally to the overlap between the model bin and the target bin. We map the lift values, not the bin counts. The bin counts are heavily dependent on the channel, both in absolute numbers and in the ratio between positives and negatives (success rate), so aggregating those does not give meaningful results. The lift is an indication of how a certain predictor value range pushes the likelihood to accept up or down.
To illustrate this, see below. We have an equi-distant target binning for Age, shown in red, from 20 to 80. From one of the models we have a binning of Age with a slightly different range (10 - 75), shown in blue. The first bin of the target gets a weighted lift from source bins 1 and 2. The second target bin falls completely within the range of bin 2 of the source, so it gets the exact same lift value. The same holds for target bins 3 and 4: they are both sourced from just source bin 3. The 4th one is not fully covered however, as you see reflected in the “BinCoverage” column in the table.
[5]:
# For the PDS Tools example, keep dm as above, but note that the subset argument is important
myAggregator = dm.bin_aggregator
target = myAggregator.create_empty_numbinning("Customer.Age", 4, minimum=20, maximum=80)
source = pl.DataFrame(
    {
        "ModelID": [1] * 3,
        "PredictorName": ["Customer.Age"] * 3,
        "BinIndex": [1, 2, 3],
        "BinLowerBound": [10.0, 25.0, 50.0],
        "BinUpperBound": [25.0, 50.0, 75.0],
        # "BinSymbol": ["20-25", "25-50", "50-75"],
        "Lift": [0.4, -0.1, 2.0],
        "BinResponses": [100, 1000, 400],
    }
)
myAggregator.plot_binning_attribution(source, target).update_layout(width=700)
The result of mapping the source binning onto this target is shown below. The resulting Lift is the average lift of the overlapping segments, weighted by the overlap. It is not weighted by response count, as we usually do for model performance etc., because that would skew the numbers heavily towards the positive bins (generally the actions will be selected where the bins score higher). The BinResponses is an indication of the number of responses (positive plus negative) for the bin; it is not used in the aggregation, only provided for additional insight. BinCoverage is the sum of the coverage by all the models for this new bin. It cannot be higher than the number of models (Models): some models are empty and not taken into account at all, or they have a value range smaller than the combined binning.
[6]:
myAggregator.combine_two_numbinnings(source, target)
[6]:
PredictorName | BinIndex | BinLowerBound | BinUpperBound | BinSymbol | Lift | BinResponses | BinCoverage | Models |
---|---|---|---|---|---|---|---|---|
str | i64 | f64 | f64 | str | f64 | f64 | f64 | i32 |
"Customer.Age" | 1 | 20.0 | 35.0 | "<35.0" | 0.066667 | 433.333333 | 1.0 | 1 |
"Customer.Age" | 2 | 35.0 | 50.0 | "<50.0" | -0.1 | 600.0 | 1.0 | 1 |
"Customer.Age" | 3 | 50.0 | 65.0 | "<65.0" | 2.0 | 240.0 | 1.0 | 1 |
"Customer.Age" | 4 | 65.0 | 80.0 | "<80.0" | 2.0 | 160.0 | 0.666667 | 1 |
You can now roll up Age over all the models in this sample data set and visualize how age affects propensity across them:
[7]:
fig = myAggregator.roll_up("Customer.Age")
fig.update_layout(height=300, width=600)
fig.show()
The above is a view across all models in all channels, which may not be that meaningful. It roughly says that both young and elderly people are more likely to respond than middle-aged people, which in itself is not very insightful.
It is generally much more useful to compare this distribution across different issues, groups or other dimensions of interest. For example, you can show how Age relates differently to different groups of actions:
[8]:
fig = myAggregator.roll_up(
    "Customer.Age", minimum=20, maximum=80, n=5, aggregation="Group"
)
fig.update_layout(height=500)
fig.show()
If you’re interested in the underlying data rather than just the plot, use the ‘return_df’ argument, as in many of the PDS Tools plot functions.
[9]:
myAggregator.roll_up(
    "Customer.Age", minimum=20, maximum=80, n=5, aggregation="Group", return_df=True
).to_pandas().style.hide()
[9]:
PredictorName | BinIndex | BinLowerBound | BinUpperBound | BinSymbol | Lift | BinResponses | BinCoverage | Models | Group |
---|---|---|---|---|---|---|---|---|---|
Customer.Age | 1 | 20.000000 | 32.000000 | <32.0 | 0.229643 | 4398.940698 | 10.000000 | 10 | AutoLoans |
Customer.Age | 2 | 32.000000 | 44.000000 | <44.0 | -0.261505 | 8280.450496 | 10.000000 | 10 | AutoLoans |
Customer.Age | 3 | 44.000000 | 56.000000 | <56.0 | 0.281045 | 4540.748856 | 10.000000 | 10 | AutoLoans |
Customer.Age | 4 | 56.000000 | 68.000000 | <68.0 | 0.501178 | 1272.362768 | 10.000000 | 10 | AutoLoans |
Customer.Age | 5 | 68.000000 | 80.000000 | <80.0 | 0.510309 | 965.237926 | 8.166667 | 10 | AutoLoans |
Customer.Age | 1 | 20.000000 | 32.000000 | <32.0 | 0.110590 | 995.724951 | 7.000000 | 7 | HomeLoans |
Customer.Age | 2 | 32.000000 | 44.000000 | <44.0 | -0.078623 | 1975.071325 | 7.000000 | 7 | HomeLoans |
Customer.Age | 3 | 44.000000 | 56.000000 | <56.0 | 0.047313 | 1350.110244 | 7.000000 | 7 | HomeLoans |
Customer.Age | 4 | 56.000000 | 68.000000 | <68.0 | 0.145987 | 251.370112 | 7.000000 | 7 | HomeLoans |
Customer.Age | 5 | 68.000000 | 80.000000 | <80.0 | 0.167526 | 209.475093 | 5.083333 | 7 | HomeLoans |
Customer.Age | 1 | 20.000000 | 32.000000 | <32.0 | 0.038873 | 247.676726 | 1.000000 | 1 | CreditCards |
Customer.Age | 2 | 32.000000 | 44.000000 | <44.0 | -0.598058 | 324.579816 | 1.000000 | 1 | CreditCards |
Customer.Age | 3 | 44.000000 | 56.000000 | <56.0 | 0.996987 | 82.989379 | 1.000000 | 1 | CreditCards |
Customer.Age | 4 | 56.000000 | 68.000000 | <68.0 | 1.007692 | 81.367929 | 1.000000 | 1 | CreditCards |
Customer.Age | 5 | 68.000000 | 80.000000 | <80.0 | 1.007692 | 67.806607 | 0.833333 | 1 | CreditCards |
Customer.Age | 1 | 20.000000 | 32.000000 | <32.0 | 0.155355 | 1749.921628 | 2.000000 | 2 | Bundles |
Customer.Age | 2 | 32.000000 | 44.000000 | <44.0 | -0.124786 | 3979.820578 | 2.000000 | 2 | Bundles |
Customer.Age | 3 | 44.000000 | 56.000000 | <56.0 | 0.244397 | 2843.339271 | 2.000000 | 2 | Bundles |
Customer.Age | 4 | 56.000000 | 68.000000 | <68.0 | -0.139278 | 502.650345 | 2.000000 | 2 | Bundles |
Customer.Age | 5 | 68.000000 | 80.000000 | <80.0 | -0.209373 | 331.583984 | 1.666667 | 2 | Bundles |
The boundaries of the bin intervals are created automatically by default, but they can also be given explicitly. Income, wealth etc. are typically distributed very unevenly (with a long right tail), so you can tell the system to use a logarithmic scale, which means each boundary is a multiple of the previous one, unlike the even spacing you get when not specifying the distribution (or when using “lin”). Below, after a quick illustration of the log spacing, we split by Channel and define a few explicit income boundaries:
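Here is a minimal standalone sketch (plain Python, not the BinAggregator internals) of 10 log-distance bin edges between 10k and 10m; the ratio between consecutive boundaries is constant:

# illustrative only: 10 log-distance bins between 10,000 and 10,000,000
n, lo, hi = 10, 10_000, 10_000_000
ratio = (hi / lo) ** (1 / n)  # constant multiplier between boundaries, about 1.995
edges = [lo * ratio**i for i in range(n + 1)]
print([round(e) for e in edges])  # 10000, 19953, 39811, ..., 10000000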
[10]:
fig = myAggregator.roll_up(
    "Customer.AnnualIncome",
    boundaries=[10000, 20000, 30000],
    n=8,
    distribution="log",
    aggregation="Channel",
)
fig.update_layout(height=300)
fig.show()
Symbolic Predictors¶
For symbolic (‘categorical’) predictors the process is conceptually simpler. We first extract the symbols from the bin labels, then aggregate the lift per symbol. For both predictor types, the aggregated lift is weighted proportionally to the responses of the models.
While the way symbolic predictors are aggregated is very different, the process to roll them up is similar to that of numeric predictors. You can pass in explicit symbols to consider, set the maximum, and aggregate over some dimension like Issue or Group, exactly as you can for numeric predictors.
[11]:
fig = myAggregator.roll_up("Customer.MaritalStatus")
fig.update_layout(height=300, width=600)
fig.show()
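As a sketch of what passing explicit symbols could look like, see below. The symbols argument name and the example values are assumptions here, not verified against the BinAggregator API; check the PDS Tools documentation for the exact signature.

# hypothetical illustration: restrict the roll-up to a few explicit symbols and split by Issue
# (argument name and symbol values are assumed, see the BinAggregator documentation)
fig = myAggregator.roll_up(
    "Customer.MaritalStatus",
    symbols=["Married", "Single"],
    aggregation="Issue",
)
fig.update_layout(height=300, width=600)
fig.show()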
Multiple predictors at once¶
If you want to look at the aggregated lift of the top 5 predictors, you can do that too. Instead of one predictor name, you can pass a list. You can even do this in combination with splitting on e.g. Group or Issue, although this may be a little overwhelming.
Remember that you can always restrict to a subset of the models when creating the BinAggregator.
[12]:
# Determine the top 5 predictors, keeping only the ones prefixed with "Customer."
# (i.e. skipping the "IH" predictors) by filtering on PredictorCategory
top_predictors = sorted(
    set(
        dm.plot.predictor_performance(top_n=5, return_df=True)
        .filter(pl.col("PredictorCategory") == "Customer")
        .collect()["PredictorName"]
        .to_list()
    )
)
fig = myAggregator.roll_up(top_predictors, n=6, aggregation="Group")
fig.update_layout(height=600)
fig.for_each_annotation(
    lambda a: a.update(text=a.text.split(".")[-1])
)  # trick to show just the part of the predictor names after the dot
fig.show()