pdstools.impactanalyzer.statistics

Statistical calculations for Impact Analyzer experiments.

Confidence intervals, significance testing, and sample-size planning are integral to interpreting Impact Analyzer results: without them, a reported lift cannot be distinguished from noise.

Formulas follow Pega’s server-side implementation and have been validated for PDC parity. Scenario Planner Actuals validation is pending.

Key implementation details

  • Pega stores standard errors (SE), not confidence intervals. The z-score (1.96) is applied only at the significance / display level.

  • The lift CI uses the delta method for the ratio estimator: SE(lift) = (1 / ctrl) · √(SE_t² + (test / ctrl)² · SE_c²).

  • For value metrics Pega computes variance as p(1-p) · AV² (Bernoulli scaled by action value), not a Poisson approximation.
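The lift SE formula above is what a first-order delta-method expansion gives. The following derivation is a sketch that assumes the test and control estimates are independent (no covariance term), which the source does not state explicitly. The symbols t, c, SE_t and SE_c stand for the test rate, the control rate and their standard errors.

```latex
\begin{aligned}
L &= \frac{t}{c} - 1, \qquad
\frac{\partial L}{\partial t} = \frac{1}{c}, \qquad
\frac{\partial L}{\partial c} = -\frac{t}{c^{2}} \\[4pt]
\operatorname{Var}(L) &\approx \frac{1}{c^{2}}\,SE_t^{2} + \frac{t^{2}}{c^{4}}\,SE_c^{2} \\[4pt]
SE(L) &\approx \frac{1}{c}\,\sqrt{SE_t^{2} + \left(\frac{t}{c}\right)^{2} SE_c^{2}}
\end{aligned}
```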

Attributes

Z_95

Two-sided 95 % z-critical value used by Pega.

FORMULAS

Classes

Formula

Structured representation of a statistical formula.

LiftResult

Result of a lift calculation with standard error.

Functions

accept_rate(→ float)

Accept / click-through rate.

binomial_se(→ float)

Standard error of the accept rate: √(p(1-p) / n).

binomial_ci(→ float)

Binomial CI half-width: z · √(p(1-p) / n).

value_variance(→ float)

Per-observation Bernoulli variance of the value metric.

value_se(→ float)

SE of value per impression: √(Var / n).

calculate_lift(→ float)

Relative lift: (test - control) / control.

lift_pl(→ polars.Expr)

Polars expression for relative lift between two columns.

lift_se(→ float)

Delta-method standard error for the lift ratio test / control - 1.

is_significant(→ bool)

True when the CI does not cross zero.

calculate_engagement_lift(→ LiftResult)

Engagement lift with delta-method SE.

calculate_value_lift(→ LiftResult)

Value-per-impression lift with delta-method SE.

required_sample_size(→ int)

Required total impressions for a two-proportion z-test.

Module Contents

Z_95: float = 1.96

Two-sided 95 % z-critical value used by Pega.

class Formula

Structured representation of a statistical formula.

name

Short identifier, e.g. "accept_rate".

Type:

str

latex

Raw LaTeX expression (no placeholder substitution — the UI renders the symbolic form alongside a separate substitution block showing the numeric values).

Type:

str

description

One-line plain-English description.

Type:

str

name: str

latex: str

description: str

FORMULAS: dict[str, Formula]

class LiftResult

Result of a lift calculation with standard error.

lift

Relative lift (test - control) / control.

Type:

float

se

Delta-method standard error for lift. This is the full-precision SE without any z-multiplier.

Type:

float

significant

True when the CI does not cross zero. The check uses lift ± z · se with z = 1.96 (95 % level). Before the check, se is rounded to 4 decimal places to match Pega’s Math.round(error * 10000.0) / 10000.0.

Type:

bool

test_rate

Observed test-group rate (accept rate or value per impression).

Type:

float

control_rate

Observed control-group rate.

Type:

float

test_se

Standard error of the test rate.

Type:

float

control_se

Standard error of the control rate.

Type:

float

Notes

Pega stores standard errors, not confidence intervals. The se field is the raw SE. Call ci_95() to obtain the 95 % CI half-width (Z_95 * se).

lift: float

se: float

significant: bool

test_rate: float

control_rate: float

test_se: float

control_se: float

ci_95() → float

Return the 95 % confidence-interval half-width.

Returns:

Z_95 * self.se (i.e. 1.96 * se).

Return type:

float

accept_rate(accepts: int, impressions: int) → float

Accept / click-through rate.

In Pega, Accept = Accepted + Clicked (both count as positive outcomes).

Parameters:
  • accepts (int) – Number of positive outcomes.

  • impressions (int) – Total number of impressions.

Returns:

accepts / impressions, or 0.0 when impressions ≤ 0.

Return type:

float
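As a minimal sketch of the behaviour described above (not the library's own code), the following stand-alone function pools Accepted and Clicked outcomes and guards against a non-positive impression count. The name accept_rate_sketch and its separate accepted / clicked arguments are hypothetical and chosen only for illustration.

```python
def accept_rate_sketch(accepted: int, clicked: int, impressions: int) -> float:
    """Accept rate with Accepted and Clicked pooled as positive outcomes.

    Hypothetical helper: the documented accept_rate() takes the already
    pooled count as its `accepts` argument.
    """
    if impressions <= 0:
        # Documented guard: no impressions means a rate of 0.0.
        return 0.0
    return (accepted + clicked) / impressions


print(accept_rate_sketch(30, 20, 1000))  # 50 positives out of 1000 impressions
print(accept_rate_sketch(5, 5, 0))       # guarded: 0.0
```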

binomial_se(accepts: int, impressions: int) → float

Standard error of the accept rate: √(p(1-p) / n).

This matches what Pega stores as TestAcceptRateCI / ControlAcceptRateCI in the ConfidenceIntervalCalculation sheet. Despite the column names, the stored value is an SE, not a CI.

Parameters:
  • accepts (int) – Number of positive outcomes.

  • impressions (int) – Total number of impressions.

Returns:

√(p(1-p) / n), or 0.0 when impressions ≤ 0 or the rate is 0 or 1.

Return type:

float

Notes

Uses the Wald (normal-approximation) formula. For extreme p (close to 0 or 1) or small n this can under-cover; Wilson or Clopper-Pearson intervals are more robust alternatives.
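To illustrate the caveat in the note, the sketch below compares the Wald SE with a Wilson score interval on a small sample. Both functions are illustrative re-implementations written from the textbook formulas. They are not part of pdstools, and the names wald_se and wilson_interval are assumptions.

```python
import math

Z_95 = 1.96  # same critical value the module documents


def wald_se(accepts: int, impressions: int) -> float:
    """Wald SE sqrt(p(1-p)/n), with the documented zero guards."""
    if impressions <= 0:
        return 0.0
    p = accepts / impressions
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return math.sqrt(p * (1.0 - p) / impressions)


def wilson_interval(accepts: int, impressions: int, z: float = Z_95) -> tuple[float, float]:
    """Wilson score interval (illustrative alternative, not used by Pega)."""
    if impressions <= 0:
        return (0.0, 1.0)
    n = impressions
    p = accepts / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Small sample with a low rate: 2 accepts out of 50 impressions.
p_hat = 2 / 50
wald_lower = p_hat - Z_95 * wald_se(2, 50)
wilson_lower, wilson_upper = wilson_interval(2, 50)
print(f"Wald lower bound:   {wald_lower:.4f}")
print(f"Wilson lower bound: {wilson_lower:.4f}")
```

With these inputs the Wald lower bound falls below zero, which is impossible for a rate, while the Wilson bound stays inside [0, 1].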

binomial_ci(accepts: int, impressions: int, z: float = Z_95) → float

Binomial CI half-width: z · √(p(1-p) / n).

Returns 0.0 when impressions ≤ 0 or the rate is 0 or 1.

Parameters:
  • accepts (int)

  • impressions (int)

  • z (float)

Return type:

float

value_variance(accepts: int, impressions: int, action_value: float) → float

Per-observation Bernoulli variance of the value metric.

Pega computes p(1-p) · AV². Each impression is worth either action_value (with probability p) or 0.

This matches TestVariance / ControlVariance in the ConfidenceIntervalCalculation sheet.

Parameters:
  • accepts (int)

  • impressions (int)

  • action_value (float)

Return type:

float

value_se(accepts: int, impressions: int, action_value: float) → float

SE of value per impression: √(Var / n).

Matches Pega’s TestInterval / ControlInterval.

Parameters:
  • accepts (int)

  • impressions (int)

  • action_value (float)

Return type:

float
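The two value formulas can be sketched as follows. This is an illustrative re-implementation of p(1-p) · AV² and √(Var / n), not the library source. The zero guard for impressions ≤ 0 is an assumption borrowed from the accept-rate functions, since the source does not document it for the value functions.

```python
import math


def value_variance_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """Per-observation variance p(1-p) * AV^2 of a Bernoulli outcome scaled by AV."""
    if impressions <= 0:
        return 0.0  # assumed guard, not documented for value_variance
    p = accepts / impressions
    return p * (1.0 - p) * action_value ** 2


def value_se_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """SE of value per impression: sqrt(Var / n)."""
    if impressions <= 0:
        return 0.0  # assumed guard
    return math.sqrt(value_variance_sketch(accepts, impressions, action_value) / impressions)


accepts, impressions, action_value = 50, 1000, 20.0
p = accepts / impressions
binomial_se = math.sqrt(p * (1.0 - p) / impressions)

# Because the variance scales with AV^2, the value SE is AV times the rate SE.
print(value_se_sketch(accepts, impressions, action_value))
print(action_value * binomial_se)
```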

calculate_lift(test: float, control: float) → float

Relative lift: (test - control) / control.

Returns 0.0 when control ≤ 0.

Parameters:
  • test (float)

  • control (float)

Return type:

float

lift_pl(test_col: str, control_col: str) → polars.Expr

Polars expression for relative lift between two columns.

Intended for pl.LazyFrame.with_columns() so the formula is defined once.

Returns:

(test - control) / control.

Return type:

pl.Expr

Parameters:
  • test_col (str)

  • control_col (str)

lift_se(test: float, control: float, se_test: float, se_control: float) → float

Delta-method standard error for the lift ratio test / control - 1.

Formula:

(1 / control) · √(se_test² + (test / control)² · se_control²)

Important

Pass standard errors (no z-multiplier). Passing z-multiplied CI half-widths inflates the result by a factor of z.

Parameters:
  • test (float) – Test-group rate (accept rate or VPI).

  • control (float) – Control-group rate.

  • se_test (float) – Standard error of test.

  • se_control (float) – Standard error of control.

Returns:

Full-precision SE of the lift (no rounding).

Return type:

float
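A minimal sketch of the formula, written directly from the expression above rather than taken from the library. The guard for control ≤ 0 is an assumption that mirrors calculate_lift. The second half shows the factor-of-z inflation that the Important note warns about.

```python
import math


def lift_se_sketch(test: float, control: float, se_test: float, se_control: float) -> float:
    """Delta-method SE of test / control - 1 (illustrative, unrounded)."""
    if control <= 0:
        return 0.0  # assumed guard, mirroring calculate_lift
    ratio = test / control
    return math.sqrt(se_test ** 2 + ratio ** 2 * se_control ** 2) / control


test_rate, control_rate = 0.052, 0.050
se_t, se_c = 0.0022, 0.0069  # example standard errors, no z-multiplier

correct = lift_se_sketch(test_rate, control_rate, se_t, se_c)

# Mistake: passing CI half-widths (z * SE) instead of standard errors.
z = 1.96
inflated = lift_se_sketch(test_rate, control_rate, z * se_t, z * se_c)

print(f"SE from standard errors: {correct:.4f}")
print(f"SE from CI half-widths:  {inflated:.4f}")
print(f"inflation factor:        {inflated / correct:.2f}")
```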

is_significant(lift: float, se: float, z: float = Z_95) → bool

True when the CI does not cross zero.

Tests whether [lift - z·se, lift + z·se] excludes zero, i.e. the lift is statistically significant at the given confidence level. With the default z = 1.96 this is a 95 % two-sided test.

Parameters:
  • lift (float) – Observed relative lift.

  • se (float) – Standard error of the lift (not z-multiplied).

  • z (float, optional) – Critical value. Default 1.96 (95 %).

Returns:

True if the interval excludes zero.

Return type:

bool
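The interval test can be sketched as below. Treating an interval that touches zero exactly as not significant is an assumption, because the source does not say how the boundary case is handled.

```python
def is_significant_sketch(lift: float, se: float, z: float = 1.96) -> bool:
    """True when [lift - z*se, lift + z*se] lies strictly on one side of zero."""
    lower = lift - z * se
    upper = lift + z * se
    return lower > 0.0 or upper < 0.0


print(is_significant_sketch(0.10, 0.04))   # interval entirely above zero
print(is_significant_sketch(0.10, 0.06))   # interval straddles zero
print(is_significant_sketch(-0.20, 0.05))  # interval entirely below zero
```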

calculate_engagement_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int) → LiftResult

Engagement lift with delta-method SE.

This is the primary metric in the Impact Analyzer UI.

Parameters:
  • accepts_test (int) – Positive outcomes in the test group.

  • impr_test (int) – Total impressions in the test group.

  • accepts_control (int) – Positive outcomes in the control group.

  • impr_control (int) – Total impressions in the control group.

Returns:

Lift, SE, and significance for the engagement metric.

Return type:

LiftResult
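Putting the pieces together, the following is a self-contained sketch of the engagement-lift pipeline as this page describes it: binomial SEs, the delta-method lift SE, and a significance check on the SE rounded to 4 decimal places. LiftSketch is a hypothetical stand-in that carries only three of LiftResult's fields, and the code is an approximation of the documented behaviour, not the library implementation.

```python
import math
from typing import NamedTuple

Z_95 = 1.96


class LiftSketch(NamedTuple):
    """Hypothetical, reduced stand-in for LiftResult."""
    lift: float
    se: float
    significant: bool


def _rate_and_se(accepts: int, impressions: int) -> tuple[float, float]:
    """Accept rate and its Wald SE, both 0.0 when there are no impressions."""
    if impressions <= 0:
        return 0.0, 0.0
    p = accepts / impressions
    return p, math.sqrt(max(p * (1.0 - p), 0.0) / impressions)


def engagement_lift_sketch(accepts_test: int, impr_test: int,
                           accepts_control: int, impr_control: int) -> LiftSketch:
    p_t, se_t = _rate_and_se(accepts_test, impr_test)
    p_c, se_c = _rate_and_se(accepts_control, impr_control)
    if p_c <= 0.0:
        return LiftSketch(0.0, 0.0, False)  # assumed guard for an empty control
    lift = (p_t - p_c) / p_c
    se = math.sqrt(se_t ** 2 + (p_t / p_c) ** 2 * se_c ** 2) / p_c
    # Java's Math.round(x) is floor(x + 0.5); mimic it for the 4-decimal rounding.
    se_rounded = math.floor(se * 10000.0 + 0.5) / 10000.0
    significant = (lift - Z_95 * se_rounded > 0.0) or (lift + Z_95 * se_rounded < 0.0)
    return LiftSketch(lift, se, significant)  # se is kept at full precision


small_control = engagement_lift_sketch(520, 10_000, 50, 1_000)
large_groups = engagement_lift_sketch(6_000, 100_000, 5_000, 100_000)
print(small_control)
print(large_groups)
```

In the first call the small control group dominates the SE, so a 4 % lift is not significant. In the second call both groups are large and a 20 % lift is.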

calculate_value_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int, action_value: float) → LiftResult

Value-per-impression lift with delta-method SE.

Pega computes value as accept_rate × action_value with Bernoulli variance p(1-p) · AV².

Parameters:
  • accepts_test (int) – Positive outcomes in the test group.

  • impr_test (int) – Total impressions in the test group.

  • accepts_control (int) – Positive outcomes in the control group.

  • impr_control (int) – Total impressions in the control group.

  • action_value (float) – Monetary action value per accept.

Returns:

Lift, SE, and significance for the value metric.

Return type:

LiftResult
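A sketch of the value-lift arithmetic under the formulas stated on this page, assuming all counts are positive (guards are omitted for brevity). One consequence worth noting: when a single action_value applies to both groups, it cancels out of both the relative lift and its delta-method SE, so the result coincides with the engagement lift. The function name is hypothetical.

```python
import math


def value_lift_sketch(accepts_test: int, impr_test: int,
                      accepts_control: int, impr_control: int,
                      action_value: float) -> tuple[float, float]:
    """Return (lift, se) for value per impression. Illustrative only."""
    p_t = accepts_test / impr_test
    p_c = accepts_control / impr_control
    vpi_t = p_t * action_value  # value per impression, test
    vpi_c = p_c * action_value  # value per impression, control
    # Bernoulli variance scaled by AV^2, divided by n, then square-rooted.
    se_t = math.sqrt(p_t * (1.0 - p_t) * action_value ** 2 / impr_test)
    se_c = math.sqrt(p_c * (1.0 - p_c) * action_value ** 2 / impr_control)
    lift = (vpi_t - vpi_c) / vpi_c
    se = math.sqrt(se_t ** 2 + (vpi_t / vpi_c) ** 2 * se_c ** 2) / vpi_c
    return lift, se


lift_av20, se_av20 = value_lift_sketch(520, 10_000, 50, 1_000, action_value=20.0)
lift_av1, se_av1 = value_lift_sketch(520, 10_000, 50, 1_000, action_value=1.0)

# With action_value = 1 the value metric is just the accept rate.
print(lift_av20, se_av20)
print(lift_av1, se_av1)
```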

required_sample_size(baseline_rate: float, mde: float = 0.05, alpha: float = 0.05, power: float = 0.8, control_ratio: float = 0.02) → int

Required total impressions for a two-proportion z-test.

Parameters:
  • baseline_rate (float) – Expected control-group accept rate.

  • mde (float) – Minimum detectable effect (relative lift).

  • alpha (float) – Significance level.

  • power (float) – Statistical power.

  • control_ratio (float) – Fraction of traffic allocated to control (default 2 %). This default matches Pega Impact Analyzer’s typical configuration. For general power analysis, 0.5 (equal allocation) is more common.

Returns:

Ceiling of the required sample size.

Return type:

int
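The source does not spell out the formula behind required_sample_size. The sketch below uses one common textbook form for a two-sided two-proportion z-test with unequal allocation (unpooled variances), so its numbers may differ from what the library returns. It is meant only to show how strongly control_ratio affects the required total.

```python
import math
from statistics import NormalDist


def required_sample_size_sketch(baseline_rate: float, mde: float = 0.05,
                                alpha: float = 0.05, power: float = 0.8,
                                control_ratio: float = 0.02) -> int:
    """Total impressions (test + control) under an assumed textbook formula."""
    p_c = baseline_rate
    p_t = baseline_rate * (1.0 + mde)  # mde is a relative lift
    if not (0.0 < p_c < 1.0 and 0.0 < p_t < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    if mde == 0.0 or not 0.0 < control_ratio < 1.0:
        raise ValueError("mde must be non-zero and control_ratio in (0, 1)")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Variance of the rate difference, per unit of total sample size.
    unit_var = (p_t * (1.0 - p_t) / (1.0 - control_ratio)
                + p_c * (1.0 - p_c) / control_ratio)
    n_total = (z_alpha + z_beta) ** 2 * unit_var / (p_t - p_c) ** 2
    return math.ceil(n_total)


n_pega_split = required_sample_size_sketch(0.05)                     # 2 % control
n_equal_split = required_sample_size_sketch(0.05, control_ratio=0.5)
print(n_pega_split, n_equal_split)
```

Under this formula a 2 % control allocation needs far more total traffic than an equal split, because the small control group dominates the variance.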