ADM Explained

Pega

2023-03-15

This notebook shows exactly how all the values in an ADM model report are calculated. It also shows how the propensity is calculated for a particular customer.

We use one of the shipped datamart exports for the example. This is a model very similar to one used in some of the ADM PowerPoint/Excel deep dive examples. You can change this notebook to apply to your own data.

[2]:
import polars as pl
import numpy as np

import plotly.express as px
from math import log
from great_tables import GT, loc, style
from pdstools import datasets
from pdstools.utils import cdh_utils
[3]:
model_name = "AutoNew84Months"
predictor_name = "Customer.NetWealth"
channel = "Web"

For the example we pick one particular model in one channel. To explain the ADM model report, we use one of its active predictors. Swap in any other predictor when using different data.

[4]:
dm = datasets.cdh_sample()

model = dm.combined_data.filter(
    (pl.col("Name") == model_name) & (pl.col("Channel") == channel)
)

modelpredictors = (
    dm.combined_data.join(
        model.select(pl.col("ModelID").unique()), on="ModelID", how="inner"
    )
    .filter(pl.col("EntryType") != "Inactive")
    .with_columns(
        Action=pl.concat_str(["Issue", "Group"], separator="/"),
        PredictorName=pl.col("PredictorName").cast(pl.Utf8),
    )
    .collect()
)

predictorbinning = modelpredictors.filter(
    pl.col("PredictorName") == predictor_name
).sort("BinIndex")

Model Overview

The selected model is shown below. Only the currently active predictors are used in the propensity calculation, so only those are shown.

[6]:
Overview
Action Sales/AutoLoans
Channel Web
Name AutoNew84Months
Active Predictors Classifier, Customer.Age, Customer.AnnualIncome, Customer.BusinessSegment, Customer.CLV, Customer.CLV_VALUE, Customer.CreditScore, Customer.Date_of_Birth, Customer.Gender, Customer.MaritalStatus, Customer.NetWealth, Customer.NoOfDependents, Customer.Prefix, Customer.RelationshipStartDate, Customer.RiskCode, Customer.WinScore, Customer.pyCountry, IH.Email.Outbound.Accepted.pxLastGroupID, IH.Email.Outbound.Accepted.pxLastOutcomeTime.DaysSince, IH.Email.Outbound.Accepted.pyHistoricalOutcomeCount, IH.Email.Outbound.Churned.pyHistoricalOutcomeCount, IH.Email.Outbound.Loyal.pxLastOutcomeTime.DaysSince, IH.Email.Outbound.Rejected.pyHistoricalOutcomeCount, IH.SMS.Outbound.Accepted.pxLastGroupID, IH.SMS.Outbound.Accepted.pyHistoricalOutcomeCount, IH.SMS.Outbound.Churned.pxLastOutcomeTime.DaysSince, IH.SMS.Outbound.Loyal.pxLastOutcomeTime.DaysSince, IH.SMS.Outbound.Loyal.pyHistoricalOutcomeCount, IH.SMS.Outbound.Rejected.pxLastGroupID, IH.SMS.Outbound.Rejected.pyHistoricalOutcomeCount, IH.Web.Inbound.Accepted.pxLastGroupID, IH.Web.Inbound.Accepted.pyHistoricalOutcomeCount, IH.Web.Inbound.Loyal.pxLastGroupID, IH.Web.Inbound.Loyal.pyHistoricalOutcomeCount, IH.Web.Inbound.Rejected.pxLastGroupID, IH.Web.Inbound.Rejected.pyHistoricalOutcomeCount, Param.ExtGroupCreditcards
Model Performance (AUC) 77.4901

Binning of the selected Predictor

The Model Report in Prediction Studio for this model shows a predictor binning plot like the one below.

All numbers can be derived from just the number of positives and negatives in each bin that are stored in the ADM Data Mart. The next sections will show exactly how that is done.

Predictor information
Predictor Name Customer.NetWealth
# Responses 1636.0
# Bins 8
Predictor Performance (AUC) 72.2077
[8]:
Binning statistics
Range/Symbol Responses (%) Positives Positives (%) Negatives Negatives (%) Propensity (%) ZRatio Lift
<11684.56 26.65% 13 6.31% 423 29.58% 2.98% −11.19 0.24
[11684.56, 13732.56> 12.35% 24 11.65% 178 12.45% 11.88% −0.33 0.94
[13732.56, 16845.52> 16.32% 17 8.25% 250 17.48% 6.37% −4.26 0.51
[16845.52, 19139.28> 14.06% 51 24.76% 179 12.52% 22.17% 3.91 1.76
[19139.28, 20286.16> 5.50% 7 3.40% 83 5.80% 7.78% −1.71 0.62
[20286.16, 22743.76> 13.57% 53 25.73% 169 11.82% 23.87% 4.40 1.90
[22743.76, 23890.64> 5.50% 13 6.31% 77 5.38% 14.44% 0.52 1.15
>=23890.64 6.05% 28 13.59% 71 4.97% 28.28% 3.51 2.25
Total 100.00% 206 100.00% 1430 100.00% 12.59% 0.00 1.00

Bin Statistics

Positive and Negative ratios

Internally, ADM only keeps track of the total counts of positive and negative responses in each bin. Everything else is derived from those numbers. The percentages and totals are trivially derived, and the propensity is just the number of positives divided by the total. The numbers calculated here match the numbers from the datamart table exactly.

[9]:
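# Expression shorthands for the per-bin counts and their totals in the datamart
# binning data (assumed definitions; they are not shown elsewhere in this notebook)
BinPositives = pl.col("BinPositives")
BinNegatives = pl.col("BinNegatives")
sumPositives = pl.sum("BinPositives")
sumNegatives = pl.sum("BinNegatives")

binning_derived = predictorbinning.select(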
    pl.col("BinSymbol").alias("Range/Symbol"),
    BinPositives.alias("Positives"),
    BinNegatives.alias("Negatives"),
    ((BinPositives + BinNegatives) / (sumPositives + sumNegatives)).alias(
        "Responses %"
    ),
    (BinPositives / sumPositives).alias("Positives %"),
    (BinNegatives / sumNegatives).alias("Negatives %"),
    (BinPositives / (BinPositives + BinNegatives)).round(4).alias("Propensity"),
)

pcts = ["Responses %", "Positives %", "Negatives %", "Propensity"]
GT(binning_derived).tab_header("Derived binning statistics").tab_style(
    style=style.text(weight="bold"), locations=loc.body(columns="Range/Symbol")
).tab_style(
    style=style.text(color="blue"),
    locations=loc.body(columns=pcts),
).fmt_percent(pcts).tab_options(table_margin_left=0)

[9]:
Derived binning statistics
Range/Symbol Positives Negatives Responses % Positives % Negatives % Propensity
<11684.56 13.0 423.0 26.65% 6.31% 29.58% 2.98%
[11684.56, 13732.56> 24.0 178.0 12.35% 11.65% 12.45% 11.88%
[13732.56, 16845.52> 17.0 250.0 16.32% 8.25% 17.48% 6.37%
[16845.52, 19139.28> 51.0 179.0 14.06% 24.76% 12.52% 22.17%
[19139.28, 20286.16> 7.0 83.0 5.50% 3.40% 5.80% 7.78%
[20286.16, 22743.76> 53.0 169.0 13.57% 25.73% 11.82% 23.87%
[22743.76, 23890.64> 13.0 77.0 5.50% 6.31% 5.38% 14.44%
>=23890.64 28.0 71.0 6.05% 13.59% 4.97% 28.28%
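
As a quick cross-check, the first row of the table can be reproduced with plain Python arithmetic. This is a minimal sketch that hard-codes the counts from the binning table above; the variable names are illustrative only and are not part of pdstools.

```python
# Counts for the first bin (<11684.56) and the predictor totals,
# copied from the binning table above.
bin_pos, bin_neg = 13, 423
total_pos, total_neg = 206, 1430

responses_pct = (bin_pos + bin_neg) / (total_pos + total_neg)
positives_pct = bin_pos / total_pos
negatives_pct = bin_neg / total_neg
propensity = bin_pos / (bin_pos + bin_neg)

print(f"Responses %: {responses_pct:.2%}")
print(f"Positives %: {positives_pct:.2%}")
print(f"Negatives %: {negatives_pct:.2%}")
print(f"Propensity:  {propensity:.2%}")
```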

Lift

Lift is the ratio of the propensity in a particular bin over the average propensity. So a value of 1 is the average, larger than 1 means higher propensity, smaller means lower propensity:

[10]:
positives = pl.col("Positives")
negatives = pl.col("Negatives")
sumPositives = pl.sum("Positives")
sumNegatives = pl.sum("Negatives")
GT(
    binning_derived.select(
        "Range/Symbol",
        "Positives",
        "Negatives",
        (
            (positives / (positives + negatives))
            / (sumPositives / (positives + negatives).sum())
        )
        .round(4)
        .alias("Lift"),
    )
).tab_style(
    style=style.text(weight="bold"), locations=loc.body(columns="Range/Symbol")
).tab_style(
    style=style.text(color="blue"), locations=loc.body(columns=["Lift"])
).tab_options(table_margin_left=0)
[10]:
Range/Symbol Positives Negatives Lift
<11684.56 13.0 423.0 0.2368
[11684.56, 13732.56> 24.0 178.0 0.9436
[13732.56, 16845.52> 17.0 250.0 0.5057
[16845.52, 19139.28> 51.0 179.0 1.761
[19139.28, 20286.16> 7.0 83.0 0.6177
[20286.16, 22743.76> 53.0 169.0 1.896
[22743.76, 23890.64> 13.0 77.0 1.1471
>=23890.64 28.0 71.0 2.2462
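
The same lift values can be checked without Polars. The sketch below assumes only the positive and negative counts from the table above; `bins` and `base_rate` are illustrative names.

```python
# (positives, negatives) per bin, copied from the binning table above.
bins = [(13, 423), (24, 178), (17, 250), (51, 179),
        (7, 83), (53, 169), (13, 77), (28, 71)]

total_pos = sum(p for p, _ in bins)
total_neg = sum(n for _, n in bins)

# Average propensity over all responses of this predictor
base_rate = total_pos / (total_pos + total_neg)

# Lift = propensity in the bin / average propensity
lifts = [(p / (p + n)) / base_rate for p, n in bins]
print([round(x, 4) for x in lifts])
```

A lift below 1 (the first bin) marks a range where customers accept less often than average; a lift above 1 (the last bin) marks the opposite.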

Z-Ratio

The Z-Ratio also measures how the propensity in a bin differs from the average, but it takes the size of the bin into account and is therefore statistically more meaningful. It represents the number of standard deviations from the average, so it centers around 0. The wider the spread, the better the predictor.

\[\frac{posFraction-negFraction}{\sqrt{\frac{posFraction*(1-posFraction)}{\sum positives}+\frac{negFraction*(1-negFraction)}{\sum negatives}}}\]

See the calculation below, which is also available as z_ratio() in cdh_utils.

[11]:
def z_ratio(
    pos_col: pl.Expr = pl.col("BinPositives"), neg_col: pl.Expr = pl.col("BinNegatives")
) -> pl.Expr:
    def get_fracs(pos_col=pl.col("BinPositives"), neg_col=pl.col("BinNegatives")):
        return pos_col / pos_col.sum(), neg_col / neg_col.sum()

    def z_ratio_impl(
        pos_fraction_col=pl.col("posFraction"),
        neg_fraction_col=pl.col("negFraction"),
        positives_col=pl.sum("BinPositives"),
        negatives_col=pl.sum("BinNegatives"),
    ):
        return (
            (pos_fraction_col - neg_fraction_col)
            / (
                (pos_fraction_col * (1 - pos_fraction_col) / positives_col)
                + (neg_fraction_col * (1 - neg_fraction_col) / negatives_col)
            ).sqrt()
        ).alias("ZRatio")

    return z_ratio_impl(*get_fracs(pos_col, neg_col), pos_col.sum(), neg_col.sum())


GT(
    binning_derived.select(
        "Range/Symbol", "Positives", "Negatives", "Positives %", "Negatives %"
    ).with_columns(z_ratio(positives, negatives).round(4))
).tab_style(
    style=style.text(weight="bold"), locations=loc.body(columns="Range/Symbol")
).tab_style(
    style=style.text(color="blue"), locations=loc.body(columns=["ZRatio"])
).fmt_percent(pl.selectors.ends_with("%")).tab_options(table_margin_left=0)
[11]:
Range/Symbol Positives Negatives Positives % Negatives % ZRatio
<11684.56 13.0 423.0 6.31% 29.58% -11.1869
[11684.56, 13732.56> 24.0 178.0 11.65% 12.45% -0.3321
[13732.56, 16845.52> 17.0 250.0 8.25% 17.48% -4.2647
[16845.52, 19139.28> 51.0 179.0 24.76% 12.52% 3.9082
[19139.28, 20286.16> 7.0 83.0 3.40% 5.80% -1.7118
[20286.16, 22743.76> 53.0 169.0 25.73% 11.82% 4.3976
[22743.76, 23890.64> 13.0 77.0 6.31% 5.38% 0.5156
>=23890.64 28.0 71.0 13.59% 4.97% 3.5129
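
To make the formula concrete, here is a minimal plain-Python sketch of the same Z-Ratio calculation, using the counts from the table above. The helper `z_ratio_for_bin` is a hypothetical name for illustration and is not a pdstools function.

```python
from math import sqrt

# (positives, negatives) per bin, copied from the table above.
bins = [(13, 423), (24, 178), (17, 250), (51, 179),
        (7, 83), (53, 169), (13, 77), (28, 71)]
total_pos = sum(p for p, _ in bins)
total_neg = sum(n for _, n in bins)


def z_ratio_for_bin(pos, neg):
    """Z-Ratio of one bin: difference of the fractions over its standard error."""
    pos_frac = pos / total_pos  # share of all positives that fall in this bin
    neg_frac = neg / total_neg  # share of all negatives that fall in this bin
    variance = (
        pos_frac * (1 - pos_frac) / total_pos
        + neg_frac * (1 - neg_frac) / total_neg
    )
    return (pos_frac - neg_frac) / sqrt(variance)


z_ratios = [z_ratio_for_bin(p, n) for p, n in bins]
print([round(z, 4) for z in z_ratios])
```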

Predictor AUC

The predictor AUC is the univariate performance of this predictor against the outcome. It too can be derived from the positives and negatives per bin.

pdstools provides a convenience function that calculates it directly from those counts: cdh_utils.auc_from_bincounts().

[12]:
pos = binning_derived.get_column("Positives")
neg = binning_derived.get_column("Negatives")
probs = binning_derived.get_column("Propensity")
order = probs.arg_sort()
FPR = pl.Series([0.0], dtype=pl.Float32).extend(neg[order].cum_sum() / neg[order].sum())
TPR = pl.Series([0.0], dtype=pl.Float32).extend(pos[order].cum_sum() / pos[order].sum())
if TPR[1] < 1 - FPR[1]:
    FPR, TPR = TPR, FPR

[13]:
pos = binning_derived.get_column("Positives").to_numpy()
neg = binning_derived.get_column("Negatives").to_numpy()
probs = binning_derived.get_column("Propensity").to_numpy()
order = np.argsort(probs)

FPR = np.cumsum(neg[order]) / np.sum(neg[order])
TPR = np.cumsum(pos[order]) / np.sum(pos[order])
TPR = np.insert(TPR, 0, 0, axis=0)
FPR = np.insert(FPR, 0, 0, axis=0)
# Check whether the classifier labels are correct; if not, swap the two axes
if TPR[1] < 1 - FPR[1]:
    FPR, TPR = TPR, FPR
auc = cdh_utils.auc_from_bincounts(pos=pos, neg=neg, probs=probs)

fig = px.line(
    x=[1 - x for x in FPR],
    y=TPR,
    labels=dict(x="Specificity", y="Sensitivity"),
    title=f"AUC = {auc.round(3)}",
    width=700,
    height=700,
    range_x=[1, 0],
    template="none",
)
fig.add_shape(type="line", line=dict(dash="dash"), x0=1, x1=0, y0=0, y1=1)
fig.show()
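
For readers who want to see the area calculation itself, the following is a minimal NumPy sketch that applies the trapezoidal rule to the bin counts of the example predictor. It is an illustration under the assumption that bins are ranked by their propensity; it is not necessarily how cdh_utils.auc_from_bincounts() is implemented internally.

```python
import numpy as np

# Positives and negatives per bin, copied from the binning table.
pos = np.array([13, 24, 17, 51, 7, 53, 13, 28], dtype=float)
neg = np.array([423, 178, 250, 179, 83, 169, 77, 71], dtype=float)
propensity = pos / (pos + neg)

# Walk the bins from highest to lowest propensity and accumulate the share
# of positives (TPR) and negatives (FPR) seen so far.
order = np.argsort(-propensity)
tpr = np.concatenate([[0.0], np.cumsum(pos[order]) / pos.sum()])
fpr = np.concatenate([[0.0], np.cumsum(neg[order]) / neg.sum()])

# Trapezoidal rule written out: width of each FPR step times the mean TPR.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
print(round(auc, 4))
```

The result should agree with the predictor performance reported in the predictor information table (about 0.722 on a 0 to 1 scale).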

Naive Bayes and Log Odds

The basis for the Naive Bayes algorithm is Bayes’ Theorem:

\[p(C_k|x) = \frac{p(x|C_k)*p(C_k)}{p(x)}\]

with \(C_k\) the outcome and \(x\) the customer. Bayes’ theorem turns the question “what is the probability of accepting this action given this customer” around into “what is the probability of seeing this customer given that the action is accepted”. With the independence assumption, and after applying a log odds transformation, we get a log odds score that can be calculated efficiently and in a numerically stable manner:

\[log\ odds\ score = \sum_{p\ \in\ active\ predictors}log(p(x_p|Positive)) + log(p_{positive}) - \sum_plog(p(x_p|Negative)) - log(p_{negative})\]

Note that the prior can be written as:

\[log(p_{positive}) - log(p_{negative}) = log(\frac{TotalPositives}{Total})-log(\frac{TotalNegatives}{Total}) = log(TotalPositives) - log(TotalNegatives)\]
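
A small numeric check of this identity, using the totals of the example predictor (206 positives, 1430 negatives). The sketch also converts the prior log odds back to a probability with the logistic function, which recovers the overall base rate. All variable names are illustrative.

```python
from math import exp, isclose, log

total_pos, total_neg = 206, 1430
total = total_pos + total_neg

# Both ways of writing the prior give the same value: the total cancels out.
prior_via_fractions = log(total_pos / total) - log(total_neg / total)
prior_via_counts = log(total_pos) - log(total_neg)

# Log odds -> probability via the logistic function
prior_probability = 1 / (1 + exp(-prior_via_counts))

print(round(prior_via_counts, 4), round(prior_probability, 4))
```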

Predictor Contribution

The contribution (conditional log odds) of an active predictor \(p\) for bin \(i\) with the number of positive and negative responses in \(Positives_i\) and \(Negatives_i\) is calculated as (note the “Laplace smoothing” to avoid log 0 issues):

\[contribution_p = \log(Positives_i+\frac{1}{nBins}) - \log(Negatives_i+\frac{1}{nBins}) - \log(1+\sum_{i\ = 1..nBins}{Positives_i}) + \log(1+\sum_i{Negatives_i})\]
[14]:
N = binning_derived.shape[0]
GT(
    binning_derived.with_columns(
        LogOdds=(pl.col("Positives %") / pl.col("Negatives %")).log().round(5),
        ModifiedLogOdds=(
            ((positives + 1 / N).log() - (positives.sum() + 1).log())
            - ((negatives + 1 / N).log() - (negatives.sum() + 1).log())
        ).round(5),
    ).drop("Responses %", "Propensity")
).tab_style(
    style=style.text(weight="bold"), locations=loc.body(columns="Range/Symbol")
).tab_style(
    style=style.text(color="blue"),
    locations=loc.body(columns=["LogOdds", "ModifiedLogOdds"]),
).fmt_percent(pl.selectors.ends_with("%")).tab_options(table_margin_left=0)
[14]:
Range/Symbol Positives Negatives Positives % Negatives % LogOdds ModifiedLogOdds
<11684.56 13.0 423.0 6.31% 29.58% -1.54487 -1.53974
[11684.56, 13732.56> 24.0 178.0 11.65% 12.45% -0.06618 -0.06583
[13732.56, 16845.52> 17.0 250.0 8.25% 17.48% -0.75069 -0.74801
[16845.52, 19139.28> 51.0 179.0 24.76% 12.52% 0.68199 0.6796
[19139.28, 20286.16> 7.0 83.0 3.40% 5.80% -0.53538 -0.52333
[20286.16, 22743.76> 53.0 169.0 25.73% 11.82% 0.77795 0.77542
[22743.76, 23890.64> 13.0 77.0 6.31% 5.38% 0.1587 0.1625
>=23890.64 28.0 71.0 13.59% 4.97% 1.00708 1.00563
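
The smoothed contribution can also be verified by hand. The sketch below is a plain-Python version of the formula above, applied to the counts in the table; `contribution` is a hypothetical helper name, not a pdstools function.

```python
from math import log

# (positives, negatives) per bin, copied from the table above.
bins = [(13, 423), (24, 178), (17, 250), (51, 179),
        (7, 83), (53, 169), (13, 77), (28, 71)]
n_bins = len(bins)
total_pos = sum(p for p, _ in bins)
total_neg = sum(n for _, n in bins)


def contribution(pos, neg):
    """Laplace-smoothed conditional log odds of a single bin."""
    return (
        log(pos + 1 / n_bins)
        - log(neg + 1 / n_bins)
        - log(1 + total_pos)
        + log(1 + total_neg)
    )


contributions = [contribution(p, n) for p, n in bins]
print([round(c, 5) for c in contributions])
```

The smoothing term 1/nBins barely changes bins with many responses, but it keeps the logarithm finite for a bin with zero positives or zero negatives.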

Propensity mapping

Log odds contribution for all the predictors

The final score is loosely referred to as “the average contribution” but in fact is a little more nuanced. The final score is calculated as:

\[score = \frac{\log(1 + TotalPositives) - \log(1 + TotalNegatives) + \sum_p contribution_p}{1 + nActivePredictors}\]

Here, \(TotalPositives\) and \(TotalNegatives\) are the total number of positive and negative responses to the model.

Below is an example. For each active predictor of the model we pick a value (one in the middle for numerics, the first symbol for symbolics) and show its (modified) log odds. The final score is calculated per the above formula, and this is the value that is mapped to a propensity value by the classifier (which is constructed using the PAV(A) algorithm).

[15]:
PredictorName Value Bin Positives Negatives LogOdds
Customer.Age 34.56 4 9.0 198.0 -1.1459
Customer.AnnualIncome -24043.049 1 74.0 1166.0 -0.8197
Customer.BusinessSegment middleSegmentPlus 1 96.0 970.0 -0.3764
Customer.CLV NON-MISSING 1 111.0 570.0 0.3009
Customer.CLV_VALUE 1345.52 4 31.0 297.0 -0.3227
Customer.CreditScore 518.92 3 33.0 205.0 0.1105
Customer.Date_of_Birth 18773.504 5 28.0 152.0 0.2446
Customer.Gender U 1 52.0 481.0 -0.2855
Customer.MaritalStatus No Resp+ 1 67.0 745.0 -0.4708
Customer.NetWealth 17992.398 4 51.0 179.0 0.6796
Customer.NoOfDependents 0.0 1 111.0 850.0 -0.0997
Customer.Prefix Mrs. 1 64.0 552.0 -0.2167
Customer.RelationshipStartDate 1426.4596 4 16.0 117.0 -0.0502
Customer.RiskCode R4 1 36.0 329.0 -0.2709
Customer.WinScore 66.600006 4 39.0 102.0 0.9738
Customer.pyCountry USA 1 99.0 776.0 -0.1227
IH.Email.Outbound.Accepted.pxLastGroupID HomeLoans 3 25.0 218.0 -0.2272
IH.Email.Outbound.Accepted.pxLastOutcomeTime.DaysSince -55.88436 2 145.0 881.0 0.1305
IH.Email.Outbound.Accepted.pyHistoricalOutcomeCount 1.5 2 30.0 351.0 -0.5201
IH.Email.Outbound.Churned.pyHistoricalOutcomeCount None 1 143.0 898.0 0.1015
IH.Email.Outbound.Loyal.pxLastOutcomeTime.DaysSince None 1 129.0 1071.0 -0.1788
IH.Email.Outbound.Rejected.pyHistoricalOutcomeCount 83.16 3 24.0 218.0 -0.2678
IH.SMS.Outbound.Accepted.pxLastGroupID Account 4 45.0 316.0 -0.0133
IH.SMS.Outbound.Accepted.pyHistoricalOutcomeCount 9.02 4 6.0 96.0 -0.822
IH.SMS.Outbound.Churned.pxLastOutcomeTime.DaysSince -20.5537 2 9.0 27.0 0.8492
IH.SMS.Outbound.Loyal.pxLastOutcomeTime.DaysSince None 1 165.0 1240.0 -0.0797
IH.SMS.Outbound.Loyal.pyHistoricalOutcomeCount None 1 165.0 1240.0 -0.0797
IH.SMS.Outbound.Rejected.pxLastGroupID Account 2 47.0 357.0 -0.0905
IH.SMS.Outbound.Rejected.pyHistoricalOutcomeCount 102.72 4 12.0 117.0 -0.3356
IH.Web.Inbound.Accepted.pxLastGroupID DepositAccounts 3 53.0 397.0 -0.0779
IH.Web.Inbound.Accepted.pyHistoricalOutcomeCount 11.04 5 25.0 164.0 0.0558
IH.Web.Inbound.Loyal.pxLastGroupID MISSING 1 100.0 857.0 -0.2119
IH.Web.Inbound.Loyal.pyHistoricalOutcomeCount 4.52 3 30.0 212.0 -0.0172
IH.Web.Inbound.Rejected.pxLastGroupID Account 2 81.0 546.0 0.0279
IH.Web.Inbound.Rejected.pyHistoricalOutcomeCount 111.08 4 35.0 306.0 -0.2317
Param.ExtGroupCreditcards NON-MISSING 1 136.0 721.0 0.2684
Final Score None None None None -0.14933270233219137
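
The final score in the last row can be reproduced from the LogOdds column with the formula above. This sketch hard-codes the 36 rounded contributions from the table and assumes model totals of 206 positives and 1430 negatives (the column sums of the classifier bins); because the inputs are rounded to four decimals, the result agrees with the table only to about that precision.

```python
from math import log

# LogOdds column of the table above, one value per active predictor.
contributions = [
    -1.1459, -0.8197, -0.3764, 0.3009, -0.3227, 0.1105, 0.2446, -0.2855,
    -0.4708, 0.6796, -0.0997, -0.2167, -0.0502, -0.2709, 0.9738, -0.1227,
    -0.2272, 0.1305, -0.5201, 0.1015, -0.1788, -0.2678, -0.0133, -0.822,
    0.8492, -0.0797, -0.0797, -0.0905, -0.3356, -0.0779, 0.0558, -0.2119,
    -0.0172, 0.0279, -0.2317, 0.2684,
]

# Assumed model totals (sum of the classifier bin counts).
total_pos, total_neg = 206, 1430

prior = log(1 + total_pos) - log(1 + total_neg)
score = (prior + sum(contributions)) / (1 + len(contributions))
print(round(score, 5))
```

Dividing by 1 + nActivePredictors treats the prior as one extra term in the average, which is why the score is only loosely "the average contribution".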

Classifier

The success rate is defined as \(\frac{positives}{positives+negatives}\) per bin.

The adjusted propensity that is returned is a small modification (Laplace smoothing) of this, calculated as \(\frac{0.5+positives}{1+positives+negatives}\), so that empty models return a propensity of 0.5.

[16]:
Index Bin Positives Negatives Cum. Total (%) Propensity (%) Adjusted Propensity (%) Cum Positives (%) ZRatio Lift(%)
1 <-0.21 17.0 443.0 28.12% 3.70% 3.80% 8.25% -9.9945 1.96%
2 [-0.21, -0.185> 8.0 133.0 36.74% 5.67% 5.99% 12.14% -3.4954 3.00%
3 [-0.185, -0.175> 3.0 48.0 39.85% 5.88% 6.73% 13.59% -1.9775 3.11%
4 [-0.175, -0.105> 28.0 370.0 64.18% 7.04% 7.14% 27.18% -4.6281 3.72%
5 [-0.105, -0.095> 4.0 51.0 67.54% 7.27% 8.04% 29.13% -1.5054 3.85%
6 [-0.095, -0.09> 2.0 19.0 68.83% 9.52% 11.36% 30.10% -0.4788 5.04%
7 [-0.09, -0.065> 9.0 77.0 74.08% 10.47% 10.92% 34.47% -0.6578 5.54%
8 [-0.065, -0.02> 30.0 154.0 85.33% 16.30% 16.49% 49.03% 1.4644 8.63%
9 [-0.02, 0.03> 37.0 65.0 91.56% 36.27% 36.41% 66.99% 4.913 19.21%
10 [0.03, 0.06> 20.0 29.0 94.56% 40.82% 41.00% 76.70% 3.664 21.61%
11 [0.06, 0.12> 30.0 33.0 98.41% 47.62% 47.66% 91.26% 4.9229 25.21%
12 [0.12, 0.125> 2.0 2.0 98.66% 50.00% 50.00% 92.23% 1.2039 26.47%
13 [0.125, 0.13> 4.0 2.0 99.02% 66.67% 64.29% 94.17% 1.8644 35.30%
14 [0.13, 0.995> 8.0 3.0 99.69% 72.73% 70.83% 98.06% 2.7182 38.51%
15 >=0.995 4.0 1.0 100.00% 80.00% 75.00% 100.00% 1.9418 42.36%
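
The difference between the two propensity columns is easy to see in a few lines of Python. This is a minimal sketch of the two formulas above; the function names are illustrative only.

```python
def success_rate(positives, negatives):
    """Raw propensity of a classifier bin."""
    return positives / (positives + negatives)


def adjusted_propensity(positives, negatives):
    """Laplace-smoothed propensity as described above."""
    return (0.5 + positives) / (1 + positives + negatives)


# First classifier bin from the table: 17 positives, 443 negatives
print(f"{success_rate(17, 443):.2%} -> {adjusted_propensity(17, 443):.2%}")

# Last classifier bin: few responses, so the smoothing has a visible effect
print(f"{success_rate(4, 1):.2%} -> {adjusted_propensity(4, 1):.2%}")

# A bin without any responses falls back to 0.5 instead of dividing by zero
print(adjusted_propensity(0, 0))
```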

Final Propensity

The classifier mapping is shown below, with the binned scores (log odds values) on the x-axis and the propensity on the y-axis. Note that the returned propensities follow a slightly adjusted formula; see the table above. The bin that contains the calculated final score is highlighted.

[17]:
score = propensity_mapping.filter(PredictorName="Final Score")["LogOdds"][0]
score_bin = (
    modelpredictors.filter(pl.col("EntryType") == "Classifier")
    .select(
        pl.col("BinSymbol").filter(
            pl.lit(score).is_between(pl.col("BinLowerBound"), pl.col("BinUpperBound"))
        )
    )
    .item()
)
score_responses = modelpredictors.filter(
    (pl.col("EntryType") == "Classifier") & (pl.col("BinSymbol") == score_bin)
)["BinResponseCount"][0]

score_bin_index = (
    modelpredictors.filter(pl.col("EntryType") == "Classifier")["BinSymbol"]
    .to_list()
    .index(score_bin)
)

score_propensity = classifier.row(score_bin_index, named=True)[
    "Adjusted Propensity (%)"
]

adjusted_propensity = (
    modelpredictors.filter(pl.col("EntryType") == "Classifier")
    .with_columns(
        AdjustedPropensity=(
            (0.5 + BinPositives) / (1 + BinPositives + BinNegatives)
        ).round(5),
    )
    .select(
        pl.col("AdjustedPropensity").filter(
            (pl.col("BinLowerBound") < score) & (pl.col("BinUpperBound") > score)
        )
    )["AdjustedPropensity"][0]
)
fig = dm.plot.score_distribution(
    model_id=modelpredictors.get_column("ModelID").unique().to_list()[0]
).add_annotation(
    x=score_bin,
    y=score_propensity / 100,
    text=f"Returned propensity: {score_propensity*100:.2f}%",
    bgcolor="#FFFFFF",
    bordercolor="#000000",
    showarrow=False,
    yref="y2",
    opacity=0.7,
)
bin_index = list(fig.data[0]["x"]).index(score_bin)
fig.data[0]["marker_color"] = (
    ["grey"] * bin_index
    + ["#1f77b4"]
    + ["grey"] * (classifier.shape[0] - bin_index - 1)
)
fig
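
The lookup from score to propensity can also be sketched without the datamart, using only the classifier table shown earlier. The example below assumes the bin boundaries and counts from that table and treats each interval as closed on the left and open on the right; `propensity_for_score` is a hypothetical helper for illustration.

```python
from bisect import bisect_right

# Boundaries between consecutive classifier bins, from the classifier table.
boundaries = [-0.21, -0.185, -0.175, -0.105, -0.095, -0.09, -0.065,
              -0.02, 0.03, 0.06, 0.12, 0.125, 0.13, 0.995]

# (positives, negatives) of the 15 classifier bins, in the same order.
counts = [(17, 443), (8, 133), (3, 48), (28, 370), (4, 51), (2, 19),
          (9, 77), (30, 154), (37, 65), (20, 29), (30, 33), (2, 2),
          (4, 2), (8, 3), (4, 1)]


def propensity_for_score(score):
    """Return the 1-based bin number and adjusted propensity for a score."""
    # bisect_right puts a score equal to a boundary into the bin that starts there
    idx = bisect_right(boundaries, score)
    pos, neg = counts[idx]
    return idx + 1, (0.5 + pos) / (1 + pos + neg)


bin_number, propensity = propensity_for_score(-0.14933)
print(bin_number, f"{propensity:.2%}")
```

For the final score of about -0.149 this selects the bin [-0.175, -0.105>, whose adjusted propensity is listed in the classifier table.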

Feature Importance

Feature importance in Naive Bayes models represents how strongly a predictor differentiates between positive and negative outcomes. It measures the average magnitude of log odds contributions across the predictor’s bins.

A predictor with high importance has bins with strong log odds values (far from zero), indicating strong predictive power. A predictor with low importance has bins with weak log odds values (close to zero), indicating weak predictive power.

Formula

For a predictor with bins \(i = 1..n\):

Step 1: Log odds per bin (with Laplace smoothing \(\frac{1}{n}\)):

\[\text{LogOdds}_i = \log\left(Pos_i + \frac{1}{n}\right) - \log\left(\sum Pos + 1\right) - \left[\log\left(Neg_i + \frac{1}{n}\right) - \log\left(\sum Neg + 1\right)\right]\]

Step 2: Absolute log odds per bin:

\[|\text{LogOdds}_i|\]

Step 3: Feature importance (weighted average of absolute log odds):

\[\text{Importance} = \frac{\sum_i |\text{LogOdds}_i| \times Responses_i}{\sum_i Responses_i}\]

Step 4: Optional scaling (default):

\[\text{Scaled Importance} = \frac{\text{Importance}}{\max(\text{Importance})} \times 100\]

Interpretation

  • Higher values: Predictor has strong log odds values (strong predictive power)

  • Lower values: Predictor has weak log odds values (weak predictive power)

  • Zero: All bins have zero log odds (predictor adds no information)

The absolute log odds capture the magnitude of each bin’s contribution to the model score, weighted by how many responses are in that bin. This matches the Pega platform implementation.
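
The four steps can be traced for a single predictor in plain Python. The sketch below uses the bin counts of Customer.NetWealth from the earlier sections; the maximum importance of 0.9043 used for scaling is copied from the results table below (Customer.AnnualIncome) and is therefore a rounded input. It is an illustration of the formula, not the cdh_utils.feature_importance() implementation.

```python
from math import log

# (positives, negatives) per bin for Customer.NetWealth.
bins = [(13, 423), (24, 178), (17, 250), (51, 179),
        (7, 83), (53, 169), (13, 77), (28, 71)]
n_bins = len(bins)
total_pos = sum(p for p, _ in bins)
total_neg = sum(n for _, n in bins)


def log_odds(pos, neg):
    """Step 1: Laplace-smoothed log odds of one bin."""
    return (log(pos + 1 / n_bins) - log(total_pos + 1)) - (
        log(neg + 1 / n_bins) - log(total_neg + 1)
    )


# Steps 2 and 3: absolute log odds, weighted by the responses in each bin
weighted_sum = sum(abs(log_odds(p, n)) * (p + n) for p, n in bins)
importance = weighted_sum / (total_pos + total_neg)

# Step 4: scale against the largest importance in the model
max_importance = 0.9043  # rounded value taken from the results table
scaled_importance = importance / max_importance * 100

print(round(importance, 4), round(scaled_importance, 2))
```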

[18]:
# Calculate feature importance for all active predictors
predictor_importance = (
    modelpredictors.filter(pl.col("PredictorName") != "Classifier")
    .with_columns(
        # Calculate both unscaled and scaled importance
        cdh_utils.feature_importance(scaled=False).alias("Importance"),
        cdh_utils.feature_importance(scaled=True).alias("ScaledImportance"),
    )
    .group_by("PredictorName", "ModelID")
    .agg(
        pl.first("Importance"),
        pl.first("ScaledImportance")
    )
    .sort("ScaledImportance", descending=True)
)

GT(predictor_importance.drop("ModelID")).tab_header("Feature Importance by Predictor").fmt_number(
    ["Importance", "ScaledImportance"], decimals=4
)
[18]:
Feature Importance by Predictor
PredictorName Importance ScaledImportance
Customer.AnnualIncome 0.9043 100.0000
Customer.NetWealth 0.8399 92.8802
Customer.WinScore 0.6859 75.8537
Customer.CreditScore 0.6494 71.8165
Customer.RelationshipStartDate 0.5316 58.7896
Customer.Age 0.5054 55.8905
Customer.CLV_VALUE 0.4982 55.0939
IH.Web.Inbound.Accepted.pyHistoricalOutcomeCount 0.4781 52.8755
Customer.MaritalStatus 0.4281 47.3463
IH.SMS.Outbound.Accepted.pyHistoricalOutcomeCount 0.4277 47.2921
Customer.BusinessSegment 0.4219 46.6575
IH.Email.Outbound.Accepted.pyHistoricalOutcomeCount 0.3411 37.7255
Param.ExtGroupCreditcards 0.3194 35.3226
IH.Web.Inbound.Rejected.pyHistoricalOutcomeCount 0.3132 34.6362
IH.Email.Outbound.Rejected.pyHistoricalOutcomeCount 0.3114 34.4356
Customer.CLV 0.2799 30.9571
IH.Email.Outbound.Accepted.pxLastGroupID 0.2794 30.9015
IH.SMS.Outbound.Rejected.pyHistoricalOutcomeCount 0.2398 26.5162
Customer.Date_of_Birth 0.2379 26.3076
IH.Email.Outbound.Loyal.pxLastOutcomeTime.DaysSince 0.2332 25.7930
Customer.Gender 0.2291 25.3329
IH.Web.Inbound.Loyal.pxLastGroupID 0.2273 25.1357
IH.Web.Inbound.Loyal.pyHistoricalOutcomeCount 0.2249 24.8665
IH.SMS.Outbound.Accepted.pxLastGroupID 0.2133 23.5931
Customer.RiskCode 0.2130 23.5587
IH.Email.Outbound.Accepted.pxLastOutcomeTime.DaysSince 0.2045 22.6120
IH.Web.Inbound.Accepted.pxLastGroupID 0.1761 19.4700
Customer.Prefix 0.1520 16.8109
IH.SMS.Outbound.Loyal.pxLastOutcomeTime.DaysSince 0.1466 16.2150
IH.Email.Outbound.Churned.pyHistoricalOutcomeCount 0.1393 15.4097
IH.SMS.Outbound.Loyal.pyHistoricalOutcomeCount 0.1256 13.8861
Customer.pyCountry 0.1250 13.8187
IH.Web.Inbound.Rejected.pxLastGroupID 0.1154 12.7667
Customer.NoOfDependents 0.1134 12.5364
IH.SMS.Outbound.Churned.pxLastOutcomeTime.DaysSince 0.1052 11.6307
IH.SMS.Outbound.Rejected.pxLastGroupID 0.1007 11.1397
[19]:
# Visualize feature importance
fig = px.bar(
    predictor_importance,
    x="PredictorName",
    y="ScaledImportance",
    title="Predictor Feature Importance (Scaled 0-100)",
    labels={"ScaledImportance": "Importance", "PredictorName": ""},
    template="pega",
)
fig.update_layout(
    xaxis_tickangle=-45,
    height=500,
    margin=dict(b=180)  # Increase bottom margin for longer rotated labels
)
fig.update_xaxes(tickfont=dict(size=10))  # Smaller font for x-axis labels
fig.show()