Value Finder analysis
Every Value Finder simulation populates a dataset, the pyValueFinder dataset. This dataset contains much more information than is currently presented on screen.
The data held in this dataset can be analysed to uncover insights into your decision framework. This notebook provides a sample analysis of the Value Finder simulation results.
In the data folder we’ve stored a copy of such a dataset, generated from an (internal) demo application (CDHSample). To run this notebook on your own data, you should export the pyValueFinder dataset from Dev Studio then follow the instructions below.
This is how Value Finder results of the sample data are presented in Pega (8.6, it may look different in other versions):
For the sample provided, the relevant action setting is 1.2%. There are 10,000 customers: 3491 without actions, 555 with only irrelevant actions and 5954 with at least one relevant action.
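As a quick sanity check, the three customer counts quoted for the sample data can be added up and turned into shares. This is plain arithmetic on the numbers in the text, not output from pdstools; the dictionary labels are only illustrative.

```python
# Customer counts quoted in the text for the sample data.
counts = {
    "no actions": 3491,
    "only irrelevant actions": 555,
    "at least one relevant action": 5954,
}

# The three groups should cover every customer exactly once.
total = sum(counts.values())

# Share of customers in each group.
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.2%}")
print(f"total customers: {total}")
```

The groups are mutually exclusive, so their counts add up to the total number of customers in the simulation.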
PDSTools defines a class ValueFinder that wraps the operations on this dataset. The “datasets” import is used for the example but you won’t need this if you load your own Value Finder dataset.
Just like with the ADMDatamart class, you can supply your own path and filename as such:
vf = ValueFinder(path = '[PATH TO DATA]', filename="[NAME OF DATASET EXPORT]")
If only a path is supplied, it will automatically look for the latest file.
It is also possible to supply a dataframe as the ‘df’ argument directly, in which case it will use that instead.
[2]:
from pdstools import ValueFinder, datasets
import polars as pl
# vf = ValueFinder(path = '...', filename='...')
vf = datasets.sample_value_finder()
When reading the data, we filter out unnecessary data, and the result is kept in the df property:
[3]:
vf.df.head(5).collect()
[3]:
Component | BundleType | Organization | Contextweight | InteractionID | PreviousComponent | Agentcompensation | EligibilityDescription | StartDate | Strategy | ActionContext | ClickthroughURL | WhyRelevant | Treatment | Issue | DecisionTime | SMSMessage | Segment | DynamicTemplateIssue | WorkID | SubjectID | Journey | CustomerID | AdjustedPropensityTreatment | Unit | TreatmentKeycode | DecisionReference | UpdateopName | ModelControlGroup | Interaction | Weight | Keycode | Name | ModelPerformance | TemplateType | Quantity | Adminputsource | … | Group | SubjectHash | Priority_2 | Channel | StartingPropensity | Division | Updateoperator | DynamicTemplateGroup | Applyanalytics | OriginalSubjectID | UniqueID | SubjectType | ModelPropensity | StartingEvidence | IsSMSEnabled | Direction | ImageURL | Istransactional | Application | ModelEvidence | IsEmailEnabled | Weight_2 | DeliverOffline | MaturityCap | Label | TemplateName | CustomerHash | Stage | FinalPropensity | Bundlehead | IsWebEnabled | BundleName | SpecifyEmailSubject | EmailSubject | Communicateto | IsPaidEnabled | Cost |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
str | str | str | str | str | str | str | str | str | str | str | str | str | str | cat | datetime[ns] | str | str | str | str | str | str | str | f64 | str | str | str | str | str | str | f64 | str | str | str | str | i64 | str | … | cat | str | f64 | cat | str | str | str | str | bool | str | str | cat | f64 | u32 | bool | cat | str | bool | str | i64 | bool | f64 | bool | f64 | str | str | i64 | enum | f64 | bool | bool | str | bool | str | str | bool | f64 |
"Customers" | "FixedBundle" | "DMOrg" | null | "6320879002230024255" | "Customers" | "2" | "For students ages 17-24, with … | "20160201T163800.000 GMT" | "Opp_NBA_All" | "Customer" | "/MS/index.html?Site=NonBundled… | "For students under age 23 are … | "Deposit_SMSTreatment" | "Sales" | 2021-08-24 11:12:49.350 | "" | null | "Sales" | "Opp_NBA_AlDF_Simulat-1" | "Customer-1" | null | "Customer-1" | 0.278077 | "Unit" | "" | " 6320879002230024255" | "Piyush Vashisht" | "Test" | "——" | null | "SDSC2" | "StudentChecking" | "0.5" | "" | null | "modelReferences" | … | "DepositAccounts" | "0.4388105018174723" | null | "SMS" | "0.5" | "Div" | "vashp" | "CreditCards" | true | "Customer-1" | null | "CDHSample-Data-Customer" | 0.269231 | 100 | true | "Outbound" | "web/DepositAccountOffer.png" | false | "CDHSample" | 96 | true | null | false | 0.1768 | "Student Checking" | "" | 10 | "Applicability" | 0.278077 | false | null | null | null | null | null | null | null |
"Customers" | "FixedBundle" | "DMOrg" | null | "6320879002230024184" | "Customers" | null | null | null | "Opp_NBA_All" | "Customer" | null | null | "Mobile_ExternalSMSTreatment" | "Usage" | 2021-08-24 11:12:37.780 | "You are entitled for an offer" | null | "Usage" | "Opp_NBA_AlDF_Simulat-1" | "Customer-100" | null | "Customer-100" | 0.713095 | "Unit" | "MOBILE" | " 6320879002230024184" | "Piyush Vashisht" | "Test" | "——" | null | "UMGUMA1" | "GetTheUMobileApp" | "0.5" | "DB" | null | "modelReferences" | … | "Mobilebanking" | "0.940830237245977" | null | "SMS" | "1" | "Div" | "vashp" | "Mobilebanking" | true | "Customer-100" | null | "CDHSample-Data-Customer" | 0.5 | 25 | true | "Outbound" | null | false | "CDHSample" | 0 | true | null | true | 0.02 | "Get the U+ Mobile App" | "Mobile_SMSDBTemplate" | 1 | "Applicability" | 0.713095 | null | null | null | null | null | null | null | null |
"Customers" | "FixedBundle" | "DMOrg" | null | "6320879002230023934" | "Customers" | null | null | null | "Opp_NBA_All" | "Customer" | null | null | "Recommendations_SMSTreatment" | "Collections" | 2021-08-24 11:12:02.932 | "" | null | "Collections" | "Opp_NBA_AlDF_Simulat-1" | "Customer-1000" | null | "Customer-1000" | 0.421306 | "Unit" | "" | " 6320879002230023934" | "Piyush Vashisht" | "Test" | "——" | null | "CRSAT1" | "SetupAutopayToday" | "0.5" | "" | null | "modelReferences" | … | "Recommendations" | "0.8026197582592971" | null | "SMS" | "1" | "Div" | "vashp" | "Recommendations" | true | "Customer-1000" | null | "CDHSample-Data-Customer" | 0.5 | 25 | true | "Outbound" | null | false | "CDHSample" | 0 | null | null | false | 0.02 | "Setup Autopay today" | "" | 0 | "Applicability" | 0.421306 | null | true | null | null | null | null | null | null |
"Customers" | "FixedBundle" | "DMOrg" | null | "6320879002230023873" | "Customers" | "2" | "For students ages 17-24, with … | "20160201T163800.000 GMT" | "Opp_NBA_All" | "Customer" | "/MS/index.html?Site=NonBundled… | "For students under age 23 are … | "Deposit_SMSTreatment" | "Sales" | 2021-08-24 11:11:52.857 | "" | null | "Sales" | "Opp_NBA_AlDF_Simulat-1" | "Customer-10000" | null | "Customer-10000" | 0.244777 | "Unit" | "" | " 6320879002230023873" | "Piyush Vashisht" | "Test" | "——" | null | "SDSC2" | "StudentChecking" | "0.5" | "" | null | "modelReferences" | … | "DepositAccounts" | "0.2919499647730856" | null | "SMS" | "0.5" | "Div" | "vashp" | "CreditCards" | true | "Customer-10000" | null | "CDHSample-Data-Customer" | 0.269231 | 100 | true | "Outbound" | "web/DepositAccountOffer.png" | false | "CDHSample" | 96 | true | null | false | 0.1768 | "Student Checking" | "" | 10 | "Applicability" | 0.244777 | false | null | null | null | null | null | null | null |
"Customers" | "FixedBundle" | "DMOrg" | null | "6320879002230023923" | "Customers" | null | null | null | "Opp_NBA_All" | "Customer" | null | null | "Bundles_SMSTreatment" | "Sales" | 2021-08-24 11:12:01.129 | "" | null | "Sales" | "Opp_NBA_AlDF_Simulat-1" | "Customer-1001" | null | "Customer-1001" | 0.24831 | "Unit" | "" | " 6320879002230023923" | "Piyush Vashisht" | "Test" | "——" | null | "SBSC1" | "StudentChoice" | "0.6029929577464789" | "" | null | "modelReferences" | … | "Bundles" | "0.45582757506999766" | null | "SMS" | "0.5" | "Div" | "vashp" | "CreditCards" | true | "Customer-1001" | null | "CDHSample-Data-Customer" | 0.15 | 100 | true | "Outbound" | null | false | "CDHSample" | 91 | null | null | false | 0.1082 | "Student Choice" | "" | 5 | "Applicability" | 0.24831 | null | true | "Student Choice" | null | null | null | null | null |
The pie chart shown in the platform is based on a propensity threshold. For the sample data, this threshold follows from a propensity quantile of 5.2%.
The plot.pie_charts function shows the pie charts for all of the stages in the engagement policies (in the platform you only see the last one) and calculates the threshold automatically. You can also give the threshold explicitly.
[4]:
vf.plot.pie_charts()
vf.plot.pie_charts(quantiles=[0.052])
Hover over the charts to see the details. For the sample data, the rightmost pie chart corresponds to the numbers in Pega as shown in the screenshot above.
Red = customers not receiving any action
Yellow = customers not receiving any “relevant” actions, sometimes also called “under served”
Green = customers that receive at least one “relevant” action, sometimes also called “well served”
Here, “relevant” is defined as having a propensity above the threshold, which defaults to the 5th percentile.
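To make the colour coding concrete, here is a minimal sketch of how a threshold could be derived from a quantile and how customers could then be labelled. It is not the pdstools implementation: the helper names, the nearest-rank quantile method and the customer data are all assumptions made for illustration.

```python
import math


def quantile_threshold(propensities, q):
    """Value at quantile q, using the nearest-rank method (an assumption)."""
    ordered = sorted(propensities)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def classify(customer_propensities, threshold):
    """Label one customer from the propensities of the actions they receive."""
    if not customer_propensities:
        return "red"  # no actions at all
    if max(customer_propensities) > threshold:
        return "green"  # at least one "relevant" action
    return "yellow"  # actions, but none above the threshold


# Hypothetical customers with the propensities of their actions.
customers = {
    "Customer-A": [],
    "Customer-B": [0.01],
    "Customer-C": [0.02, 0.30],
    "Customer-D": [0.15],
}

all_propensities = [p for props in customers.values() for p in props]
threshold = quantile_threshold(all_propensities, 0.25)
labels = {name: classify(props, threshold) for name, props in customers.items()}
print(threshold, labels)
```

In this toy data the threshold lands on the lowest propensity, so the customer whose only action has exactly that propensity is not "above" it and ends up yellow.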
Insight into the propensity distribution per stage is crucial. We can plot this distribution with plot.propensity_threshold. You often see a spike at 0.5, which corresponds to models without responses (their propensity defaults to 0.5/1 = 0.5).
The dotted vertical line represents the computed threshold.
[5]:
vf.plot.propensity_threshold()
These different propensities represent:
pyModelPropensity = the actual propensities from the models
pyPropensity = model or random propensity, depending on the ModelControl (or, when models are executed from an extension point after the standard Predictions, their propensity, but such a configuration is not supported by Value Finder)
FinalPropensity = the propensity after possible adjustments by Thompson Sampling; because Thompson Sampling essentially smooths the propensities, you would expect any peak at 0.5 caused by empty models to be smoothed out
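The smoothing effect of Thompson Sampling can be illustrated with a small simulation. The sketch below assumes the adjustment draws from a Beta distribution whose mean is the model propensity and whose spread shrinks as evidence grows. That parameterisation, the function name and the evidence values are assumptions for illustration, not a description of how Pega implements it.

```python
import random
import statistics


def thompson_sample(propensity, evidence, rng):
    """Draw from Beta(p*n, (1-p)*n): mean p, narrower for larger evidence n."""
    alpha = propensity * evidence
    beta = (1.0 - propensity) * evidence
    return rng.betavariate(alpha, beta)


rng = random.Random(42)  # fixed seed so the run is repeatable

# An "empty" model (propensity 0.5, minimal evidence) versus a mature model
# with the same propensity but a lot of evidence.
empty_model = [thompson_sample(0.5, 1, rng) for _ in range(2000)]
mature_model = [thompson_sample(0.5, 10_000, rng) for _ in range(2000)]

spread_empty = statistics.pstdev(empty_model)
spread_mature = statistics.pstdev(mature_model)
print(f"spread without evidence:  {spread_empty:.3f}")
print(f"spread with much evidence: {spread_mature:.3f}")
```

Under these assumptions the samples for the empty model are spread widely across the 0 to 1 range instead of piling up at 0.5, while the mature model stays close to its propensity.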
We can also look at the propensity distributions across the different stages. This is based on the model propensities, not any of the subsequent overrides:
[6]:
vf.plot.propensity_distribution()
The effect of the selection of the propensity threshold on the number of actions for a customer can be simulated by supplying a list of either quantiles or propensities to the plot.pie_charts() function. This will generate the aggregated counts per stage, which we can plot as such:
[7]:
import numpy as np
vf.plot.pie_charts(quantiles=np.arange(0.01, 1, 0.01))
The further to the left you put the slider threshold, the more “green” you will see. As you raise the threshold, more customers will be reported as getting “not relevant” actions.
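The relationship between the threshold and the colours can be reproduced on a toy example. The sketch below uses hypothetical customers and raw propensity thresholds (not quantiles) and simply recounts the three groups for each threshold; the function name and data are assumptions, and pdstools does this aggregation for you.

```python
def served_counts(customers, threshold):
    """Count customers per colour for one propensity threshold."""
    counts = {"red": 0, "yellow": 0, "green": 0}
    for propensities in customers:
        if not propensities:
            counts["red"] += 1  # no actions at all
        elif max(propensities) > threshold:
            counts["green"] += 1  # at least one relevant action
        else:
            counts["yellow"] += 1  # only irrelevant actions
    return counts


# Hypothetical customers, each with the propensities of their actions.
customers = [[], [0.01], [0.02, 0.30], [0.15], [0.05, 0.08]]

thresholds = [0.0, 0.05, 0.10, 0.20, 0.40]
sweep = [served_counts(customers, t) for t in thresholds]

for t, counts in zip(thresholds, sweep):
    print(f"threshold {t:.2f}: {counts}")
```

The red count does not depend on the threshold, since those customers receive no actions at all. Raising the threshold only moves customers from green to yellow.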
The same effect can also be visualized in a funnel. Use plot.distribution_per_threshold() to show the threshold on the x-axis. Again, you can pass a list of quantiles or thresholds to plot custom values here.
[8]:
vf.plot.distribution_per_threshold()
[9]:
vf.plot.distribution_per_threshold(quantiles=np.arange(0.01, 1, 0.01))
You can zoom in on how individual actions are distributed across the stages. There are usually very many actions, so this typically requires zooming in on one particular group, issue, etc.
In the sample data, we can filter to just the Sales actions as shown with the ‘query’ functionality below (this snippet may not work when using your own data if there is no Sales issue).
Use the plot.funnel_chart() function for an overview of this funnel effect throughout the stages. As a rule of thumb, if there are only a few actions in each stage, this is not a good sign. If certain actions are completely filtered out from one stage to the next, it may also be a warning of strong filtering.
[10]:
vf.plot.funnel_chart("Name", query=pl.col("Issue") == "Sales")
The above chart shows the funnel effect at the level of individual Actions. You may want to start more coarse-grained, as shown below, by setting the level parameter to 'Issue':
[11]:
vf.plot.funnel_chart("Issue")
Or just the groups for the Sales issue (again: this example may not work when using your own dataset if there is no Sales issue):
[12]:
vf.plot.funnel_chart("Group", query=pl.col("Issue") == "Sales")
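To see what the funnel charts summarise, the idea can be sketched in plain Python: for each stage, count how many distinct actions are still present, optionally restricted to one issue. The rows below are made up, the stage list is assumed to follow the engagement policy order, and the actual chart may count records rather than distinct actions.

```python
from collections import defaultdict

# Assumed stage order of the engagement policies.
STAGES = ["Eligibility", "Applicability", "Suitability", "Arbitration"]

# Hypothetical (stage, issue, action name) records.
rows = [
    ("Eligibility", "Sales", "StudentChecking"),
    ("Eligibility", "Sales", "StudentChoice"),
    ("Eligibility", "Sales", "PremierChecking"),  # made-up action name
    ("Eligibility", "Usage", "GetTheUMobileApp"),
    ("Applicability", "Sales", "StudentChecking"),
    ("Applicability", "Sales", "StudentChoice"),
    ("Applicability", "Usage", "GetTheUMobileApp"),
    ("Suitability", "Sales", "StudentChecking"),
    ("Arbitration", "Sales", "StudentChecking"),
]


def funnel(rows, issue=None):
    """Number of distinct actions per stage, optionally for a single issue."""
    actions = defaultdict(set)
    for stage, row_issue, name in rows:
        if issue is None or row_issue == issue:
            actions[stage].add(name)
    return {stage: len(actions[stage]) for stage in STAGES}


sales_funnel = funnel(rows, issue="Sales")
print(sales_funnel)
```

A sharp drop between two consecutive stages in such a count is the kind of strong filtering the funnel charts help you spot.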