Example IH Analysis

Interaction History (IH) is a rich source of data at the level of individual interactions from Pega DSM applications such as CDH. Each record contains the time of the interaction, the channel, the actions/treatments and the customer ID, and IH is used to track different types of outcomes (decisions, sends, opens, clicks, etc.). It does not contain details of individual customers, only their IDs.

Interaction History is typically used to analyze customer behavior and optimize decision strategies. The following sections provide various example analyses that can be performed on IH data, including distribution analysis, response analysis, success rates, model performance, propensity distribution, and response time analysis.

Like most of PDSTools, this notebook uses plotly for visualization and polars for dataframes. Its purpose is to show example analyses rather than to provide re-usable code, although PDSTools already offers some generic, re-usable functions. All of the analyses should be easy to replicate in other analytical or BI environments, except perhaps the analysis of model performance (AUC).

This notebook uses mock data generated by PDSTools. Replace it with your own IH data and modify the analyses as appropriate.

[2]:
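from pdstools import IH
import polars as pl

# To analyze real data, load an IH dataset export instead of the mock data:
# ih = IH.from_ds_export(
#     "../../data/Data-pxStrategyResult_pxInteractionHistory_20210101T010000_GMT.zip"
# )
# Generate mock IH data for the examples in this notebook
ih = IH.from_mock_data(n=1e5)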

Preview of the raw IH data

[3]:
ih.data.head().collect()
[3]:
shape: (5, 13)
InteractionID | Channel | Issue | Group | Name | Treatment | ExperimentGroup | ModelTechnique | OutcomeTime | Direction | Outcome | BasePropensity | Propensity
str | str | str | str | str | str | str | str | datetime[μs] | str | str | f64 | f64
"1000000000" | "Web" | "Risk" | "Savings" | "Savings_3" | "Savings_3_WebTreatment1" | "Conversion-Test" | "NaiveBayes" | 2025-07-16 14:31:42.865343 | "Inbound" | "Impression" | 0.028997 | 0.031488
"1000000001" | "Web" | "Risk" | "Mortgages" | "Mortgages_6" | "Mortgages_6_WebTreatment2" | "Conversion-Test" | "GradientBoost" | 2025-07-16 14:30:25.105343 | "Inbound" | "Impression" | 0.009354 | 0.008834
"1000000002" | "Email" | "Risk" | "Savings" | "Savings_4" | "Savings_4_EmailTreatment2" | "Conversion-Control" | "NaiveBayes" | 2025-07-16 14:29:07.345343 | "Outbound" | "Pending" | 0.013134 | 0.011579
"1000000003" | "Email" | "Retention" | "Savings" | "Savings_6" | "Savings_6_EmailTreatment1" | "Conversion-Control" | "GradientBoost" | 2025-07-16 14:27:49.585343 | "Outbound" | "Pending" | 0.00688 | 0.00664
"1000000004" | "Email" | "Risk" | "Investments" | "Investments_7" | "Investments_7_EmailTreatment2" | "Conversion-Test" | "NaiveBayes" | 2025-07-16 14:26:31.825343 | "Outbound" | "Pending" | 0.003905 | 0.003202

The same interaction can occur multiple times: once when the initial decision is made, and again later when responses are captured (accepted, sent, clicked, etc.). For some of the analyses we need to group the positive outcomes by interaction. This is what that data looks like:

[4]:
ih.aggregates.summarize_by_interaction(by=["Channel"]).head().collect()
[4]:
shape: (5, 6)
Channel | InteractionID | Interaction_Outcome_Engagement | Interaction_Outcome_Conversion | Interaction_Outcome_OpenRate | Propensity
str | str | bool | bool | bool | f64
"Email" | "1000017223" | false | false | false | 0.020036
"Email" | "1000069313" | false | false | false | 0.006585
"Email" | "1000014088" | false | false | false | 0.003665
"Web" | "1000079534" | true | false | false | 0.020034
"Email" | "1000082437" | false | false | false | 0.003951

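The grouping of outcomes per interaction can also be sketched in plain Python, independent of PDSTools. The sample records and the set of labels treated as positive are illustrative assumptions, not the library's actual configuration.

```python
from collections import defaultdict

# Hypothetical raw IH records: (InteractionID, Channel, Outcome).
records = [
    ("1", "Web", "Impression"),
    ("1", "Web", "Clicked"),
    ("2", "Email", "Pending"),
    ("3", "Web", "Impression"),
]

# Assumption: these outcome labels count as a positive engagement.
POSITIVE_ENGAGEMENT = {"Clicked", "Accepted"}


def summarize_by_interaction(rows):
    """Collapse the rows of each interaction into a single engagement flag."""
    summary = defaultdict(bool)
    for interaction_id, channel, outcome in rows:
        key = (channel, interaction_id)
        # An interaction is engaged if any of its rows has a positive outcome.
        summary[key] = summary[key] or outcome in POSITIVE_ENGAGEMENT
    return dict(summary)


engaged = summarize_by_interaction(records)
```

Each interaction ends up with exactly one row, which is the shape the success-rate style analyses need.
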
Distribution Analysis

A distribution of the offers (actions/treatments) is often the most obvious type of analysis. You can do an action distribution for specific outcomes (what is offered, what is accepted), view it conditionally (what got offered last month vs this month) - possibly with a delta view, or over time.

[5]:
ih.plot.response_count_tree_map()
[Figure: treemap of response counts, broken down by outcome (Pending, Impression, Clicked, Conversion, Accepted), direction, channel, issue, group and action, with the count and percentage shown per action]
[6]:
fig = ih.plot.action_distribution(
    query=pl.col.Outcome.is_in(["Clicked", "Accepted"]),
    title="Distribution of Actions",
    color="Outcome",
)
# fig.update_layout(yaxis=dict(tickmode="linear")) # to show all names
fig.show()
[Figure: bar chart "Distribution of Actions" showing the count per action, colored by Outcome (Accepted, Clicked)]
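
A delta view between two periods, as mentioned above, could look like the following minimal sketch. It uses plain Python rather than PDSTools, and the two lists of offered actions are made-up sample data.

```python
from collections import Counter

# Hypothetical actions offered in two periods.
last_month = ["Savings_1", "Savings_1", "Lending_2", "Pension_3"]
this_month = ["Savings_1", "Lending_2", "Lending_2", "Lending_2"]


def share(actions):
    """Relative frequency of each action."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {action: count / total for action, count in counts.items()}


def delta(before, after):
    """Change in share per action; actions missing in a period count as 0."""
    b, a = share(before), share(after)
    return {k: a.get(k, 0.0) - b.get(k, 0.0) for k in sorted(set(a) | set(b))}


change = delta(last_month, this_month)
```

Positive values mark actions that gained share in the later period, negative values mark actions that lost share.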

Response Analysis

A simple view of the responses over time shows how many responses are received per day (or any other period).

[7]:
ih.plot.response_count(every="1d")
[Figure: "Responses" showing the daily response count over time (April to July 2025), colored by Outcome (Conversion, Accepted, Clicked, Impression, Pending)]

The same view can be split per channel:

[8]:
ih.plot.response_count(
    facet="Channel",
    query=pl.col.Channel != "",
)
[Figure: "Responses" showing the response count over time, colored by Outcome, with one panel per channel (Web, Email)]

Success Rates

Success rates (accept rate, open rate, conversion rate) are interesting to track over time. In addition, you may want to split by e.g. Channel, or contrast the rates of different experiment groups in an A/B testing setup.

[9]:
ih.plot.success_rate(
    facet="Channel", query=pl.col.Channel.is_not_null() & (pl.col.Channel != "")
)
[Figure: "Success Rates Trend of Engagement" showing the success rate (%) over time, with one panel per channel (Web, Email)]
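
As a rough illustration of what a success rate is, the sketch below computes it per channel from one engagement flag per interaction. It assumes the rate is simply positives divided by all interactions; the exact definition used by PDSTools may differ, and the sample data is made up.

```python
from collections import Counter

# Hypothetical per-interaction summary: (Channel, engaged?).
interactions = [
    ("Web", True), ("Web", False), ("Web", False), ("Web", False),
    ("Email", True), ("Email", False),
]


def success_rates(rows):
    """Share of engaged interactions per channel."""
    positives, totals = Counter(), Counter()
    for channel, engaged in rows:
        totals[channel] += 1
        positives[channel] += int(engaged)
    return {channel: positives[channel] / totals[channel] for channel in totals}


rates = success_rates(interactions)
```

Adding a date or an experiment group to the grouping key gives the trend and A/B views.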

Model Performance

Model performance is analyzed much like success rates: it is typically viewed over time, often split by channel, and conditioned on variations such as Naive Bayes (NB) vs Gradient Boosting (AGB) models.

[10]:
ih.plot.model_performance_trend(by="Channel", every="1w")
[Figure: "Model Performance over Time" showing weekly Performance, colored by Channel (Email, Web)]
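
Model performance is the analysis that is hardest to replicate elsewhere, so here is a minimal, library-independent sketch of AUC computed from propensities and observed outcomes. It uses the rank-based (Mann-Whitney) formulation; this is not necessarily how PDSTools computes it, and the scaling by 100 is an assumption based on the axis range of the plot.

```python
def auc(scores, labels):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half a win. Quadratic in the number of rows, so only
    suitable for small samples.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical propensities and whether the offer was clicked.
propensities = [0.9, 0.8, 0.4, 0.3, 0.2]
clicked = [True, False, True, False, False]

performance = 100 * auc(propensities, clicked)  # assumed 50-100 scale
```

Applying the function per week and per channel would give a trend comparable to the plot.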

AGB vs NB analysis

There are different types of ADM models you can use in CDH. This analysis shows the model performance of the (classic) Naive Bayes models vs the new Gradient Boosting models. We split by channel as this often matters.

[11]:
fig = ih.plot.model_performance_trend(
    by="ModelTechnique",
    facet="Channel",
    every="1w",
    title="Model Performance of Naive Bayes vs Gradient Boosting Models",
)
fig.update_layout(legend_title_text="Technique")
fig
[Figure: "Model Performance of Naive Bayes vs Gradient Boosting Models" showing weekly Performance, colored by Technique (GradientBoost, NaiveBayes), with one panel per channel (Web, Email)]

Propensity Analysis

IH also contains information about the factors that determine the prioritization of the offers: lever values, propensities etc.

Propensity Distribution

Here we show the distribution of the propensities of the offers made. It’s also a first example of a custom analysis not currently supported directly by the PDSTools library. You can see how we access the underlying IH data (ih.data), then aggregate and display it.

[12]:
import plotly.figure_factory as ff

channels = [
    c
    for c in ih.data.select(pl.col.Channel.unique().sort())
    .collect()["Channel"]
    .to_list()
    if c is not None and c != ""
]

plot_data = [
    ih.data.filter(pl.col.Channel == c)
    .select(["Propensity"])
    .collect()["Propensity"]
    .sample(fraction=0.1)
    .to_list()
    for c in channels
]
fig = ff.create_distplot(plot_data, group_labels=channels, show_hist=False)
fig.update_layout(
    title="Propensity Distribution",
    yaxis=dict(showticklabels=False),
    xaxis=dict(title="Propensity", tickformat=".0%"),
    legend_title_text="Channel",
    template="pega",
)
fig
[Figure: "Propensity Distribution" showing the density of Propensity (0% to 6%) per Channel (Web, Email)]

Propensity Calibration

We can verify the accuracy of the propensities generated by Pega by comparing them against the actual click-through rates observed in the Interaction History data. Although there is currently no dedicated IH plot for this, the underlying aggregation functions are generic enough to support it.

The plot shows how propensities calibrate against the click-through rates in IH.

[13]:
import plotly.express as px
px.bar(
    # We simply use qcut to get equal-volume bins. The resulting labels are long and hard to read;
    # to fix that we could qcut the raw data first and set the labels programmatically.
    ih.aggregates.summary_success_rates(
        by=[pl.col("Propensity").qcut(10).alias("PropensityBin"), "Channel", "Direction"]
    )
    .collect()
    .unpivot(
        on=["SuccessRate_Engagement"],
        index=["PropensityBin", "Channel", "Direction"],
        variable_name="KPI",
        value_name="CTR",
    ).with_columns(
        Channel = pl.concat_str("Channel","Direction",separator="/")
    ),
    x="PropensityBin",
    y="CTR",
    facet_row="Channel",
    template="pega",
    title="Propensity Calibration"
).update_xaxes(title="").update_yaxes(tickformat=".2%")
[Figure: "Propensity Calibration" showing the CTR (0% to 5%) per equal-volume propensity bin, with one panel per channel/direction (Web/Inbound, Email/Outbound)]
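
The idea behind the calibration check can be sketched in plain Python: sort by propensity, cut into equal-volume bins, and compare the mean propensity of each bin with its observed rate. The function name and sample data are illustrative, not part of PDSTools.

```python
def calibration_table(propensities, outcomes, n_bins=10):
    """Return (mean propensity, observed rate) per equal-volume bin."""
    pairs = sorted(zip(propensities, outcomes))
    n = len(pairs)
    table = []
    for b in range(n_bins):
        # Integer arithmetic keeps the bins as equal in size as possible.
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_propensity = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(1 for _, y in chunk if y) / len(chunk)
        table.append((mean_propensity, observed_rate))
    return table


# Hypothetical propensities and click outcomes.
table = calibration_table(
    [0.1, 0.1, 0.3, 0.3], [False, False, True, False], n_bins=2
)
```

For well-calibrated propensities the two numbers in each bin are close to each other.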

Response Time Analysis

Time is one of the dimensions in IH. Here we take a look at how subsequent responses relate to the original decision. It shows, for example, how much time there typically is between the moment of decision and the click.

This type of analysis is usually part of attribution analysis when considering conversion modeling.

[14]:
import plotly.express as px

outcomes = [
    objective
    for objective in ih.data.select(pl.col.Outcome.unique().sort())
    .collect()["Outcome"]
    .to_list()
    if objective is not None and objective != ""
]
plot_data = (
    ih.data.filter(pl.col.OutcomeTime.is_not_null())
    .group_by("InteractionID")
    .agg(
        [pl.col.OutcomeTime.min().alias("Decision_Time")]
        + [
            pl.col.OutcomeTime.filter(pl.col.Outcome == o).max().alias(o)
            for o in outcomes
        ],
    )
    .collect()
    .unpivot(
        index=["InteractionID", "Decision_Time"],
        variable_name="Outcome",
        value_name="Time",
    )
    .with_columns(Duration=(pl.col.Time - pl.col.Decision_Time).dt.total_seconds())
    .filter(pl.col.Duration > 0)
)

ordered_outcomes = (
    plot_data.group_by("Outcome")
    .agg(Duration=pl.col("Duration").median())
    .sort("Duration")["Outcome"]
    .to_list()
)

fig = px.box(
    plot_data,
    x="Duration",
    y="Outcome",
    color="Outcome",
    template="pega",
    category_orders={"Outcome": ordered_outcomes},
    points=False,
    title="Duration of Responses",
    log_x=True,
)
fig.update_layout(
    xaxis_title="Duration (seconds) with logarithmic scale", yaxis_title=""
)
fig
[Figure: box plot "Duration of Responses" showing the duration in seconds (logarithmic scale) per Outcome (Clicked, Accepted, Conversion)]
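
Stripped of the dataframe operations, the calculation takes the earliest timestamp of an interaction as the decision time and measures every later outcome against it. The sketch below is a simplification in plain Python (it keeps every later event instead of the latest one per outcome) and uses made-up events.

```python
from datetime import datetime
from statistics import median

# Hypothetical IH events: (InteractionID, Outcome, OutcomeTime).
events = [
    ("1", "Impression", datetime(2025, 7, 16, 14, 0, 0)),
    ("1", "Clicked", datetime(2025, 7, 16, 14, 0, 30)),
    ("2", "Impression", datetime(2025, 7, 16, 15, 0, 0)),
    ("2", "Clicked", datetime(2025, 7, 16, 15, 1, 30)),
]


def response_durations(rows):
    """Seconds between the first event of an interaction and each later outcome."""
    decision_time = {}
    for interaction_id, _, ts in rows:
        decision_time[interaction_id] = min(ts, decision_time.get(interaction_id, ts))
    durations = {}
    for interaction_id, outcome, ts in rows:
        seconds = (ts - decision_time[interaction_id]).total_seconds()
        if seconds > 0:  # skip the decision event itself
            durations.setdefault(outcome, []).append(seconds)
    return durations


durations = response_durations(events)
median_click_delay = median(durations["Clicked"])
```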

Pattern Analysis

This method uncovers patterns in customer behavior by analyzing the sequences of actions that lead to outcomes like conversions. By calculating Pointwise Mutual Information (PMI), we highlight strong associations between actions in customer journeys.

[15]:
customer_sequences, customer_outcomes, count_actions, count_sequences = ih.get_sequences(
    positive_outcome_label="Conversion",
    outcome_column="Outcome",
    level="Name",
    customerid_column="InteractionID"
)

sequences = ih.calculate_pmi(count_actions, count_sequences)

sequences_df = ih.pmi_overview(sequences, count_sequences, customer_sequences, customer_outcomes)

sequences_df.head()
[15]:
shape: (5, 6)
Sequence | Length | Avg PMI | Frequency | Unique freq | Score
list[str] | i64 | f64 | i64 | i64 | f64
["Savings_4", "Savings_4"] | 2 | 5.594 | 23 | 23 | 17.541
["Insurance_3", "Insurance_3"] | 2 | 4.784 | 39 | 39 | 17.525
["Insurance_5", "Insurance_5"] | 2 | 5.296 | 27 | 27 | 17.455
["Savings_5", "Savings_5"] | 2 | 5.417 | 25 | 25 | 17.435
["Mortgages_1", "Mortgages_1"] | 2 | 5.417 | 25 | 25 | 17.435
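
For readers unfamiliar with PMI, the sketch below computes the textbook definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), for pairs of consecutive actions. It is a generic illustration with made-up sequences and is not meant to reproduce the exact calculation or scoring of the PDSTools functions used above.

```python
import math
from collections import Counter

# Hypothetical action sequences, one per customer.
sequences = [["A", "B"], ["A", "B"], ["A", "C"], ["B", "C"]]


def bigram_pmi(seqs):
    """Pointwise Mutual Information for each pair of consecutive actions."""
    unigrams, bigrams = Counter(), Counter()
    for seq in seqs:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    return {
        (x, y): math.log2(
            (count / n_bi) / ((unigrams[x] / n_uni) * (unigrams[y] / n_uni))
        )
        for (x, y), count in bigrams.items()
    }


pmi = bigram_pmi(sequences)
```

A higher PMI means the two actions appear together more often than their individual frequencies would suggest.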