pdstools.impactanalyzer
Classes
ImpactAnalyzer: Analyze and visualize Impact Analyzer experiment results from Pega CDH.
Package Contents
- class ImpactAnalyzer(raw_data: polars.LazyFrame)
Analyze and visualize Impact Analyzer experiment results from Pega CDH.
The ImpactAnalyzer class provides analysis and visualization capabilities for NBA (Next-Best-Action) Impact Analyzer experiments. It processes experiment data from Pega’s Customer Decision Hub to compare the effectiveness of different NBA strategies including adaptive models, propensity prioritization, lever usage, and engagement policies.
Data can be loaded from three sources:
- PDC exports via from_pdc(): Uses pre-aggregated experiment data from PDC JSON exports. Value Lift is copied from PDC data as it cannot be re-calculated from the available numbers.
- VBD exports via from_vbd(): Reconstructs experiment metrics from raw VBD Actuals or Scenario Planner Actuals data. Allows flexible time ranges and data selection. Value Lift is calculated from ValuePerImpression.
- Interaction History via from_ih(): Loads experiment metrics from Interaction History data. Not yet implemented.
\[\text{Engagement Lift} = \frac{\text{SuccessRate}_{test} - \text{SuccessRate}_{control}}{\text{SuccessRate}_{control}}\]
\[\text{Value Lift} = \frac{\text{ValueCapture}_{test} - \text{ValueCapture}_{control}}{\text{ValueCapture}_{control}}\]
- Parameters:
raw_data (polars.LazyFrame)
- ia_data
The underlying experiment data containing control group metrics.
- Type:
pl.LazyFrame
See also
pdstools.adm.ADMDatamart: For ADM model analysis.
pdstools.ih.IH: For Interaction History analysis.
Examples
>>> from pdstools import ImpactAnalyzer
>>> ia = ImpactAnalyzer.from_pdc("impact_analyzer_export.json")
>>> ia.overall_summary().collect()
>>> ia.plot.overview()
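The two lift formulas in the class description can be checked by hand. The following is a minimal sketch in plain Python, not pdstools code: the helper name `lift` and all sample numbers are made up for illustration.

```python
from typing import Optional


def lift(test: float, control: float) -> Optional[float]:
    """Relative lift of test over control: (test - control) / control."""
    if control == 0:
        return None  # lift is undefined when the control value is zero
    return (test - control) / control


# Hypothetical success rates (accepts / impressions) for one experiment.
success_rate_test = 300 / 10_000
success_rate_control = 50 / 2_000

# Hypothetical value capture figures for the same experiment.
value_capture_test = 1.32
value_capture_control = 1.20

engagement_lift = lift(success_rate_test, success_rate_control)
value_lift = lift(value_capture_test, value_capture_control)

print(f"Engagement Lift: {engagement_lift:.1%}")
print(f"Value Lift: {value_lift:.1%}")
```

Both metrics are relative differences, so a value of 0.2 means the test group outperforms the control group by 20 percent.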
- ia_data: polars.LazyFrame
- default_ia_experiments
Default experiments mapping experiment names to (control, test) group tuples.
- outcome_labels
Mapping of metric names to outcome labels used for aggregation.
- default_ia_controlgroups
- plot
- classmethod from_pdc(pdc_source: os.PathLike | str | List[os.PathLike] | List[str], *, reader: Callable | None = None, query: pdstools.utils.types.QUERY | None = None, return_wide_df: bool = False, return_df: bool = False) -> ImpactAnalyzer | polars.LazyFrame
Create an ImpactAnalyzer instance from PDC JSON export(s).
Loads pre-aggregated experiment data from Pega Decision Central JSON exports. Value Lift metrics are copied directly from the PDC data.
- Parameters:
pdc_source (Union[os.PathLike, str, List[os.PathLike], List[str]]) – Path to PDC JSON file, or a list of paths to concatenate.
reader (Optional[Callable], optional) – Custom function to read source data into a dict. If None, uses standard JSON file reader. Default is None.
query (Optional[QUERY], optional) – Polars expression to filter the data. Default is None.
return_wide_df (bool, optional) – If True, return the raw wide-format data as a LazyFrame for debugging. Default is False.
return_df (bool, optional) – If True, return the processed data as a LazyFrame instead of an ImpactAnalyzer instance. Default is False.
- Returns:
ImpactAnalyzer instance, or LazyFrame if return_df or return_wide_df is True.
- Return type:
ImpactAnalyzer or pl.LazyFrame
- Raises:
ValueError – If an empty list of source files is provided.
Examples
>>> ia = ImpactAnalyzer.from_pdc("CDH_Metrics_ImpactAnalyzer.json")
>>> ia.overall_summary().collect()
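The reader parameter accepts a custom function that reads source data into a dict. As a hedged sketch, the function below reads a gzip-compressed JSON file. It assumes the reader is called with the source path as its only argument, which the documentation above does not state explicitly. The name `read_gzipped_json` and the file name in the commented call are hypothetical.

```python
import gzip
import json


def read_gzipped_json(path) -> dict:
    """Read a gzip-compressed JSON file and return its content as a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


# Assumed call shape, not verified against the library:
# ia = ImpactAnalyzer.from_pdc("export.json.gz", reader=read_gzipped_json)
```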
- classmethod from_vbd(vbd_source: os.PathLike | str, *, return_df: bool = False) -> ImpactAnalyzer | polars.LazyFrame | None
Create an ImpactAnalyzer instance from VBD data.
Processes VBD Actuals or Scenario Planner Actuals data to reconstruct Impact Analyzer experiment metrics. Provides more flexible time ranges and data selection compared to PDC exports.
Value Lift is calculated from ValuePerImpression since raw value data is available in VBD exports.
- Parameters:
vbd_source (Union[os.PathLike, str]) – Path to VBD export file (parquet, csv, ndjson, or zip).
return_df (bool, optional) – If True, return processed data as LazyFrame instead of ImpactAnalyzer instance. Default is False.
- Returns:
ImpactAnalyzer instance, LazyFrame if return_df is True, or None if the source contains no data.
- Return type:
ImpactAnalyzer or pl.LazyFrame or None
Examples
>>> ia = ImpactAnalyzer.from_vbd("ScenarioPlannerActuals.zip")
>>> ia.summary_by_channel().collect()
- classmethod from_ih(ih_source: os.PathLike | str, *, return_df: bool = False) -> ImpactAnalyzer | polars.LazyFrame | None
- Abstractmethod:
Create an ImpactAnalyzer instance from Interaction History data.
Note
This method is not yet implemented.
Reconstructs experiment metrics from Interaction History data, allowing analysis of experiments using detailed interaction-level records.
- Parameters:
ih_source (Union[os.PathLike, str]) – Path to Interaction History export file.
return_df (bool, optional) – If True, return processed data as LazyFrame instead of ImpactAnalyzer instance. Default is False.
- Returns:
ImpactAnalyzer instance, LazyFrame if return_df is True, or None if the source contains no data.
- Return type:
ImpactAnalyzer or pl.LazyFrame or None
- Raises:
NotImplementedError – This method is not yet implemented.
- classmethod _normalize_pdc_ia_data(json_data: dict, *, query: pdstools.utils.types.QUERY | None = None, return_wide_df: bool = False) -> polars.LazyFrame
Transform PDC Impact Analyzer JSON into normalized long format.
Converts the hierarchical PDC JSON structure (organized by experiments) into a flat structure organized by control groups with impression and accept counts.
- Parameters:
json_data (dict)
query (Optional[QUERY], optional)
return_wide_df (bool, optional)
- Returns:
Normalized data with columns: SnapshotTime, Channel, ControlGroup, Impressions, Accepts, ValuePerImpression, Pega_ValueLift.
- Return type:
pl.LazyFrame
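The reshaping described here, from data organized by experiment to rows organized by control group, can be sketched in plain Python. The nested input below is invented for illustration and is not the actual PDC JSON schema. The function name `to_long_format`, the group names, and the rule that a group shared by several experiments is emitted once are all assumptions.

```python
# Hypothetical nested input: one entry per experiment, each carrying
# counts for its control and test group. Field names are invented.
nested = {
    "SnapshotTime": "2024-01-01",
    "Channel": "Web",
    "Experiments": [
        {
            "Name": "Experiment A",
            "Control": {"Group": "GroupA_Control", "Impressions": 1000, "Accepts": 20},
            "Test": {"Group": "Shared_Test", "Impressions": 9000, "Accepts": 270},
        },
        {
            "Name": "Experiment B",
            "Control": {"Group": "GroupB_Control", "Impressions": 500, "Accepts": 5},
            "Test": {"Group": "Shared_Test", "Impressions": 9000, "Accepts": 270},
        },
    ],
}


def to_long_format(data: dict) -> list:
    """Flatten experiment-oriented data into one row per control group."""
    rows = {}
    for experiment in data["Experiments"]:
        for role in ("Control", "Test"):
            group = experiment[role]
            # setdefault keeps the first occurrence, so a group that is
            # shared by several experiments appears only once.
            rows.setdefault(
                group["Group"],
                {
                    "SnapshotTime": data["SnapshotTime"],
                    "Channel": data["Channel"],
                    "ControlGroup": group["Group"],
                    "Impressions": group["Impressions"],
                    "Accepts": group["Accepts"],
                },
            )
    return list(rows.values())


long_rows = to_long_format(nested)
for row in long_rows:
    print(row)
```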
- summary_by_channel() -> polars.LazyFrame
Get experiment summary pivoted by channel.
Returns experiment lift metrics (CTR_Lift and Value_Lift) for each experiment, with one row per channel.
- Returns:
Wide-format summary with columns:
Channel: Channel name
CTR_Lift <Experiment>: Engagement lift for each experiment
Value_Lift <Experiment>: Value lift for each experiment
- Return type:
pl.LazyFrame
See also
overall_summary: Summary without channel breakdown.
summarize_experiments: Long-format experiment summary.
Examples
>>> ia.summary_by_channel().collect()
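To illustrate the wide format, the sketch below pivots long-format rows (one row per channel and experiment) into one row per channel with `CTR_Lift <Experiment>` and `Value_Lift <Experiment>` columns. It is plain Python for illustration, not the library's implementation. The function name `pivot_by_channel` and the sample values are hypothetical.

```python
# Hypothetical long-format rows: one row per channel and experiment.
long_rows = [
    {"Channel": "Web", "Experiment": "Experiment A", "CTR_Lift": 0.20, "Value_Lift": 0.10},
    {"Channel": "Web", "Experiment": "Experiment B", "CTR_Lift": 0.05, "Value_Lift": None},
    {"Channel": "Email", "Experiment": "Experiment A", "CTR_Lift": -0.02, "Value_Lift": 0.01},
]


def pivot_by_channel(rows: list) -> list:
    """Pivot to one row per channel with '<Metric> <Experiment>' columns."""
    wide = {}
    for row in rows:
        out = wide.setdefault(row["Channel"], {"Channel": row["Channel"]})
        for metric in ("CTR_Lift", "Value_Lift"):
            out[metric + " " + row["Experiment"]] = row[metric]
    return list(wide.values())


wide_rows = pivot_by_channel(long_rows)
for row in wide_rows:
    print(row)
```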
- overall_summary() -> polars.LazyFrame
Get overall experiment summary aggregated across all channels.
Returns experiment lift metrics (CTR_Lift and Value_Lift) for each experiment, aggregated across all data.
- Returns:
Single-row wide-format summary with columns:
CTR_Lift <Experiment>: Engagement lift for each experiment
Value_Lift <Experiment>: Value lift for each experiment
- Return type:
pl.LazyFrame
See also
summary_by_channel: Summary with channel breakdown.
summarize_experiments: Long-format experiment summary.
Examples
>>> ia.overall_summary().collect()
- summarize_control_groups(by: List[str] | List[polars.Expr] | str | polars.Expr | None = None, drop_internal_cols: bool = True) -> polars.LazyFrame
Aggregate metrics by control group.
Summarizes impressions, accepts, CTR, and value metrics for each control group, optionally grouped by additional dimensions.
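- Parameters:
by (Optional[Union[List[str], List[pl.Expr], str, pl.Expr]], optional)
drop_internal_cols (bool, optional)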
- Returns:
Aggregated metrics with columns: ControlGroup, Impressions, Accepts, CTR, ValuePerImpression, plus any grouping columns.
- Return type:
pl.LazyFrame
Examples
>>> ia.summarize_control_groups().collect()
>>> ia.summarize_control_groups(by="Channel").collect()
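The aggregation can be pictured as a group-by over control group plus any extra dimensions. The sketch below uses plain Python dictionaries rather than Polars and is not the library's implementation. It assumes CTR is Accepts divided by Impressions. The function name `summarize` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical input rows, one per channel and control group.
rows = [
    {"Channel": "Web", "ControlGroup": "Group_Test", "Impressions": 6000, "Accepts": 180},
    {"Channel": "Email", "ControlGroup": "Group_Test", "Impressions": 4000, "Accepts": 80},
    {"Channel": "Web", "ControlGroup": "Group_Control", "Impressions": 1000, "Accepts": 20},
]


def summarize(rows, by=()):
    """Sum impressions and accepts per control group and optional dimensions."""
    key_columns = tuple(by) + ("ControlGroup",)
    totals = defaultdict(lambda: {"Impressions": 0, "Accepts": 0})
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        totals[key]["Impressions"] += row["Impressions"]
        totals[key]["Accepts"] += row["Accepts"]

    summary = []
    for key, counts in totals.items():
        out = dict(zip(key_columns, key))
        out.update(counts)
        # Assumed definition: CTR = Accepts / Impressions.
        out["CTR"] = counts["Accepts"] / counts["Impressions"] if counts["Impressions"] else None
        summary.append(out)
    return summary


overall = summarize(rows)
per_channel = summarize(rows, by=["Channel"])
```

Without `by`, the two `Group_Test` rows collapse into one. With `by=["Channel"]`, each channel keeps its own row per control group.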
- summarize_experiments(by: List[str] | List[polars.Expr] | str | polars.Expr | None = None) -> polars.LazyFrame
Summarize experiment metrics comparing test vs control groups.
Computes lift metrics for each defined experiment by comparing test and control group performance.
Note
Returns all default experiments regardless of whether they are active in the data. Experiments without data will have null values for all metrics (Impressions, Accepts, CTR_Lift, Value_Lift, etc.).
- Parameters:
by (Optional[Union[List[str], List[pl.Expr], str, pl.Expr]], optional) – Column name(s) or expression(s) to group by. Default is None (aggregate all data).
- Returns:
Experiment summary with columns:
Experiment: Experiment name
Test, Control: Control group names for the experiment
Impressions_Test, Impressions_Control: Impression counts (null if not active)
Accepts_Test, Accepts_Control: Accept counts (null if not active)
CTR_Test, CTR_Control: Click-through rates (null if not active)
Control_Fraction: Fraction of impressions in control group
CTR_Lift: Engagement lift (null if experiment not active)
Value_Lift: Value lift (null if experiment not active)
- Return type:
pl.LazyFrame
See also
summarize_control_groups: Lower-level control group aggregation.
overall_summary: Pivoted overall summary.
summary_by_channel: Pivoted summary by channel.
Examples
>>> ia.summarize_experiments().collect()
>>> ia.summarize_experiments(by="Channel").collect()
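The behaviour described in the note, where every defined experiment is returned and inactive ones carry null metrics, can be sketched as follows. This is plain Python for illustration, not the library's code. The experiment and group names are invented, and Control_Fraction is assumed to be the control group's share of the experiment's combined impressions.

```python
# Hypothetical experiment definitions: name -> (control group, test group).
experiments = {
    "Experiment A": ("GroupA_Control", "Shared_Test"),
    "Experiment B": ("GroupB_Control", "Shared_Test"),
}

# Hypothetical per-group totals. GroupB_Control has no data, so
# "Experiment B" is not active.
group_totals = {
    "Shared_Test": {"Impressions": 9000, "Accepts": 270},
    "GroupA_Control": {"Impressions": 1000, "Accepts": 25},
}


def summarize(experiments, group_totals):
    """Return one row per defined experiment, with None for inactive ones."""
    rows = []
    for name, (control, test) in experiments.items():
        row = {
            "Experiment": name,
            "Test": test,
            "Control": control,
            "Control_Fraction": None,
            "CTR_Lift": None,
        }
        c = group_totals.get(control)
        t = group_totals.get(test)
        if c and t and c["Impressions"] and t["Impressions"]:
            ctr_control = c["Accepts"] / c["Impressions"]
            ctr_test = t["Accepts"] / t["Impressions"]
            # Assumed definition: control share of all experiment impressions.
            row["Control_Fraction"] = c["Impressions"] / (c["Impressions"] + t["Impressions"])
            if ctr_control:
                row["CTR_Lift"] = (ctr_test - ctr_control) / ctr_control
        rows.append(row)
    return rows


summary = summarize(experiments, group_totals)
for row in summary:
    print(row)
```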