pdstools.impactanalyzer.statistics¶

Statistical calculations for Impact Analyzer experiments.

Confidence intervals, significance testing, and sample-size planning are integral to interpreting Impact Analyzer results; without them, a reported lift cannot be distinguished from noise.

Formulas follow Pega’s server-side implementation and have been validated for PDC parity. Scenario Planner Actuals validation is pending.

Key implementation details¶

Pega stores standard errors (SE), not confidence intervals. The z-score (1.96) is applied only at the significance / display level.
The lift CI uses the delta method for the ratio estimator: SE(lift) = (1 / ctrl) · √(SE_t² + (test / ctrl)² · SE_c²).
For value metrics Pega computes variance as p(1-p) · AV² (Bernoulli scaled by action value), not a Poisson approximation.
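The delta-method expression above follows from a first-order Taylor expansion of the ratio. The derivation below is a sketch that assumes the test and control rates are independent (an assumption not stated in this module's documentation) and writes T and C for the test and control rates:

```latex
\begin{aligned}
\text{lift} &= \frac{T}{C} - 1, \\
\operatorname{Var}(\text{lift})
  &\approx \left(\frac{\partial\,\text{lift}}{\partial T}\right)^{2} SE_t^{2}
         + \left(\frac{\partial\,\text{lift}}{\partial C}\right)^{2} SE_c^{2}
   = \frac{1}{C^{2}}\,SE_t^{2} + \frac{T^{2}}{C^{4}}\,SE_c^{2}, \\
SE(\text{lift}) &= \frac{1}{C}\,\sqrt{SE_t^{2} + \left(\frac{T}{C}\right)^{2} SE_c^{2}}.
\end{aligned}
```

The partial derivatives are 1/C with respect to T and -T/C² with respect to C; factoring 1/C² out of the variance gives the documented form.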

Attributes¶

`Z_95`	Two-sided 95 % z-critical value used by Pega.
`FORMULAS`

Classes¶

`Formula`	Structured representation of a statistical formula.
`LiftResult`	Result of a lift calculation with standard error.

Functions¶

`accept_rate`(→ float)	Accept / click-through rate.
`binomial_se`(→ float)	Standard error of the accept rate: `√(p(1-p) / n)`.
`binomial_ci`(→ float)	Binomial CI half-width: `z · √(p(1-p) / n)`.
`value_variance`(→ float)	Per-observation Bernoulli variance of the value metric.
`value_se`(→ float)	SE of value per impression: `√(Var / n)`.
`calculate_lift`(→ float)	Relative lift: `(test - control) / control`.
`lift_pl`(→ polars.Expr)	Polars expression for relative lift between two columns.
`lift_se`(→ float)	Delta-method standard error for the lift ratio `test / control - 1`.
`is_significant`(→ bool)	`True` when the CI does not cross zero.
`calculate_engagement_lift`(→ LiftResult)	Engagement lift with delta-method SE.
`calculate_value_lift`(→ LiftResult)	Value-per-impression lift with delta-method SE.
`required_sample_size`(→ int)	Required total impressions for a two-proportion z-test.

Module Contents¶

Z_95: float = 1.96¶

Two-sided 95 % z-critical value used by Pega.

class Formula¶

Structured representation of a statistical formula.

name¶

Short identifier, e.g. "accept_rate".

Type: str

latex¶

Raw LaTeX expression. No placeholder substitution is performed: the UI renders the symbolic form alongside a separate substitution block showing the numeric values.

Type: str

description¶

One-line plain-English description.

Type: str

name: str¶

latex: str¶

description: str¶

FORMULAS: dict[str, Formula]¶

class LiftResult¶

Result of a lift calculation with standard error.

lift¶

Relative lift (test - control) / control.

Type: float

se¶

Delta-method standard error for lift. This is the full-precision SE without any z-multiplier.

Type: float

significant¶

True when the CI does not cross zero. The check uses lift ± z * se where z = 1.96 (95 % level) after rounding se to 4 decimal places, matching Pega’s Math.round(error * 10000.0) / 10000.0.

Type: bool

test_rate¶

Observed test-group rate (accept rate or value per impression).

Type: float

control_rate¶

Observed control-group rate.

Type: float

test_se¶

Standard error of the test rate.

Type: float

control_se¶

Standard error of the control rate.

Type: float

Notes

Pega stores standard errors, not confidence intervals. The se field is the raw SE. Call ci_95() to obtain the 95 % CI half-width (Z_95 * se).

lift: float¶

se: float¶

significant: bool¶

test_rate: float¶

control_rate: float¶

test_se: float¶

control_se: float¶

ci_95() → float¶

Return the 95 % confidence-interval half-width.

Returns:

Z_95 * self.se (i.e. 1.96 * se).

Return type:

float

accept_rate(accepts: int, impressions: int) → float¶

Accept / click-through rate.

In Pega, Accept = Accepted + Clicked (both count as positive outcomes).

Parameters:

accepts (int) – Number of positive outcomes.
impressions (int) – Total number of impressions.

Returns:

accepts / impressions, or 0.0 when impressions ≤ 0.

Return type:

float

binomial_se(accepts: int, impressions: int) → float¶

Standard error of the accept rate: √(p(1-p) / n).

This matches what Pega stores as TestAcceptRateCI / ControlAcceptRateCI in the ConfidenceIntervalCalculation sheet. Despite the column names, the stored value is an SE, not a CI.

Parameters:

accepts (int) – Number of positive outcomes.
impressions (int) – Total number of impressions.

Returns:

√(p(1-p) / n), or 0.0 when impressions ≤ 0 or the rate is 0 or 1.

Return type:

float

Notes

Uses the Wald (normal-approximation) formula. For extreme p (close to 0 or 1) or small n this can under-cover; Wilson or Clopper-Pearson intervals are more robust alternatives.
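As a rough illustration of the Wald formula and the documented edge cases, the sketch below re-implements √(p(1-p) / n) in plain Python. The name `wald_se_sketch` is hypothetical and the code is not taken from pdstools; it only mirrors the behaviour described above.

```python
import math


def wald_se_sketch(accepts: int, impressions: int) -> float:
    """Hypothetical stand-in for the documented formula sqrt(p(1-p) / n)."""
    if impressions <= 0:
        return 0.0  # documented: 0.0 when impressions <= 0
    p = accepts / impressions
    if p <= 0.0 or p >= 1.0:
        return 0.0  # documented: 0.0 when the rate is 0 or 1
    return math.sqrt(p * (1.0 - p) / impressions)


se = wald_se_sketch(50, 1_000)   # p = 0.05, SE is about 0.0069
half_width = 1.96 * se           # 95 % CI half-width, as binomial_ci describes
```

Multiplying by the z-value only at the end keeps the stored quantity a plain SE, which is the convention this module describes.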

binomial_ci(accepts: int, impressions: int, z: float = Z_95) → float¶

Binomial CI half-width: z · √(p(1-p) / n).

Returns 0.0 when impressions ≤ 0 or the rate is 0 or 1.

Parameters:

accepts (int)
impressions (int)
z (float)

Return type:

float

value_variance(accepts: int, impressions: int, action_value: float) → float¶

Per-observation Bernoulli variance of the value metric.

Pega computes p(1-p) · AV². Each impression is worth either action_value (with probability p) or 0.

This matches TestVariance / ControlVariance in the ConfidenceIntervalCalculation sheet.

Parameters:

accepts (int)
impressions (int)
action_value (float)

Return type:

float

value_se(accepts: int, impressions: int, action_value: float) → float¶

SE of value per impression: √(Var / n).

Matches Pega’s TestInterval / ControlInterval.

Parameters:

accepts (int)
impressions (int)
action_value (float)

Return type:

float
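The two value formulas can be sketched together as follows. The function names are hypothetical, and the zero-impression guard is an assumption, since the documentation above does not state how `value_variance` and `value_se` handle that case.

```python
import math


def value_variance_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """Per-observation variance p(1-p) * AV^2 (hypothetical re-implementation)."""
    if impressions <= 0:
        return 0.0  # assumption: guard not documented for this function
    p = accepts / impressions
    return p * (1.0 - p) * action_value ** 2


def value_se_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """SE of value per impression: sqrt(Var / n)."""
    if impressions <= 0:
        return 0.0  # assumption, as above
    return math.sqrt(value_variance_sketch(accepts, impressions, action_value) / impressions)


var = value_variance_sketch(50, 1_000, 10.0)  # 0.05 * 0.95 * 100, about 4.75
se = value_se_sketch(50, 1_000, 10.0)         # sqrt(4.75 / 1000), about 0.0689
```

Because each impression is worth either `action_value` or 0, the value SE is simply the accept-rate SE scaled by `action_value`.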

calculate_lift(test: float, control: float) → float¶

Relative lift: (test - control) / control.

Returns 0.0 when control ≤ 0.

Parameters:

test (float)
control (float)

Return type:

float

lift_pl(test_col: str, control_col: str) → polars.Expr¶

Polars expression for relative lift between two columns.

Intended for pl.LazyFrame.with_columns() so the formula is defined once.

Parameters:

test_col (str)
control_col (str)

Returns:

(test - control) / control.

Return type:

pl.Expr

lift_se(test: float, control: float, se_test: float, se_control: float) → float¶

Delta-method standard error for the lift ratio test / control - 1.

Formula:

(1 / control) · √(se_test² + (test / control)² · se_control²)

Important

Pass standard errors (no z-multiplier). Passing z-multiplied CI values will inflate the result by a factor of z.

Parameters:

test (float) – Test-group rate (accept rate or VPI).
control (float) – Control-group rate.
se_test (float) – Standard error of test.
se_control (float) – Standard error of control.

Returns:

Full-precision SE of the lift (no rounding).

Return type:

float
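The formula and the warning about z-multiplied inputs can be checked numerically with a small sketch. `lift_se_sketch` is a hypothetical name, and the guard for a non-positive control rate is an assumption borrowed from `calculate_lift`, not documented behaviour of `lift_se`.

```python
import math


def lift_se_sketch(test: float, control: float, se_test: float, se_control: float) -> float:
    """(1 / control) * sqrt(se_test^2 + (test / control)^2 * se_control^2)."""
    if control <= 0:
        return 0.0  # assumption: mirrors the guard documented for calculate_lift
    ratio = test / control
    return math.sqrt(se_test ** 2 + ratio ** 2 * se_control ** 2) / control


# Correct usage: pass raw standard errors.
se = lift_se_sketch(0.06, 0.05, 0.002, 0.005)           # about 0.1265

# Incorrect usage: passing CI half-widths (already multiplied by 1.96).
inflated = lift_se_sketch(0.06, 0.05, 1.96 * 0.002, 1.96 * 0.005)
```

Here `inflated` equals `1.96 * se`, so applying z again at display time would widen the interval by z twice.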

is_significant(lift: float, se: float, z: float = Z_95) → bool¶

True when the CI does not cross zero.

Tests whether [lift - z·se, lift + z·se] excludes zero, i.e. the lift is statistically significant at the given confidence level. With the default z = 1.96 this is a 95 % two-sided test.

Parameters:

lift (float) – Observed relative lift.
se (float) – Standard error of the lift (not z-multiplied).
z (float, optional) – Critical value. Default 1.96 (95 %).

Returns:

True if the interval excludes zero.

Return type:

bool
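A minimal sketch of the interval test, using a hypothetical `is_significant_sketch`. Treating an interval that touches zero exactly as not significant (strict inequalities) is an assumption; the documentation above does not specify the boundary case.

```python
def is_significant_sketch(lift: float, se: float, z: float = 1.96) -> bool:
    """True when [lift - z*se, lift + z*se] lies entirely on one side of zero."""
    lower = lift - z * se
    upper = lift + z * se
    return lower > 0.0 or upper < 0.0  # assumption: strict inequalities


positive = is_significant_sketch(0.20, 0.05)    # interval [0.102, 0.298]
noisy = is_significant_sketch(0.20, 0.1265)     # interval spans zero
negative = is_significant_sketch(-0.30, 0.10)   # interval [-0.496, -0.104]
```

A significant negative lift also returns `True`; the function reports significance, not direction.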

calculate_engagement_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int) → LiftResult¶

Engagement lift with delta-method SE.

This is the primary metric in the Impact Analyzer UI.

Parameters:

accepts_test (int) – Positive outcomes in the test group.
impr_test (int) – Total impressions in the test group.
accepts_control (int) – Positive outcomes in the control group.
impr_control (int) – Total impressions in the control group.

Returns:

Lift, SE, and significance for the engagement metric.

Return type:

LiftResult
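Putting the documented pieces together, an end-to-end calculation might look like the sketch below. `LiftSketch` and `engagement_lift_sketch` are hypothetical stand-ins, not the pdstools implementation. The 4-decimal rounding follows the `LiftResult.significant` description and uses `floor(x + 0.5)` to mimic Java's `Math.round`; the behaviour for a zero control rate is an assumption.

```python
import math
from dataclasses import dataclass

Z_95 = 1.96


@dataclass
class LiftSketch:  # hypothetical stand-in for LiftResult
    lift: float
    se: float
    significant: bool


def _rate_and_se(accepts: int, impressions: int) -> tuple[float, float]:
    """Accept rate and its Wald SE; both 0.0 when there are no impressions."""
    if impressions <= 0:
        return 0.0, 0.0
    p = accepts / impressions
    return p, math.sqrt(max(p * (1.0 - p), 0.0) / impressions)


def engagement_lift_sketch(accepts_test, impr_test, accepts_control, impr_control) -> LiftSketch:
    p_t, se_t = _rate_and_se(accepts_test, impr_test)
    p_c, se_c = _rate_and_se(accepts_control, impr_control)
    if p_c <= 0.0:
        return LiftSketch(0.0, 0.0, False)  # assumption: no lift without a control rate
    lift = (p_t - p_c) / p_c
    se = math.sqrt(se_t ** 2 + (p_t / p_c) ** 2 * se_c ** 2) / p_c  # delta method
    se_rounded = math.floor(se * 10000.0 + 0.5) / 10000.0           # Math.round parity
    return LiftSketch(lift, se, abs(lift) > Z_95 * se_rounded)


small = engagement_lift_sketch(600, 10_000, 50, 1_000)         # 6 % vs 5 %, small control
large = engagement_lift_sketch(6_000, 100_000, 5_000, 100_000) # same rates, more data
```

Both calls observe a 20 % relative lift, but only `large` is significant: with 1,000 control impressions the lift SE is about 0.17, so the 95 % interval comfortably includes zero.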

calculate_value_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int, action_value: float) → LiftResult¶

Value-per-impression lift with delta-method SE.

Pega computes value as accept_rate × action_value with Bernoulli variance p(1-p) · AV².

Parameters:

accepts_test (int) – Positive outcomes in the test group.
impr_test (int) – Total impressions in the test group.
accepts_control (int) – Positive outcomes in the control group.
impr_control (int) – Total impressions in the control group.
action_value (float) – Monetary action value per accept.

Returns:

Lift, SE, and significance for the value metric.

Return type:

LiftResult
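One consequence of the formulas as documented here, worth verifying against the actual implementation: when a single `action_value` applies to both groups, it cancels out of both the relative lift and its delta-method SE. The sketch below uses hypothetical helper names to check this numerically.

```python
import math


def vpi_and_se(accepts: int, impressions: int, action_value: float) -> tuple[float, float]:
    """Value per impression p * AV and its SE sqrt(p(1-p) * AV^2 / n)."""
    p = accepts / impressions
    return p * action_value, math.sqrt(p * (1.0 - p) * action_value ** 2 / impressions)


def lift_and_se(test: float, control: float, se_test: float, se_control: float) -> tuple[float, float]:
    """Relative lift and its delta-method SE."""
    lift = (test - control) / control
    se = math.sqrt(se_test ** 2 + (test / control) ** 2 * se_control ** 2) / control
    return lift, se


# Value metric with an action value of 25.0 per accept.
v_t, v_se_t = vpi_and_se(600, 10_000, 25.0)
v_c, v_se_c = vpi_and_se(50, 1_000, 25.0)
value_lift, value_lift_se = lift_and_se(v_t, v_c, v_se_t, v_se_c)

# Engagement metric: the same data with an action value of 1.0.
r_t, r_se_t = vpi_and_se(600, 10_000, 1.0)
r_c, r_se_c = vpi_and_se(50, 1_000, 1.0)
rate_lift, rate_lift_se = lift_and_se(r_t, r_c, r_se_t, r_se_c)
```

Under these assumptions `value_lift` matches `rate_lift` and `value_lift_se` matches `rate_lift_se` up to floating-point error.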

required_sample_size(baseline_rate: float, mde: float = 0.05, alpha: float = 0.05, power: float = 0.8, control_ratio: float = 0.02) → int¶

Required total impressions for a two-proportion z-test.

Parameters:

baseline_rate (float) – Expected control-group accept rate.
mde (float) – Minimum detectable effect (relative lift).
alpha (float) – Significance level.
power (float) – Statistical power.
control_ratio (float) – Fraction of traffic allocated to control (default 2 %). This default matches Pega Impact Analyzer’s typical configuration. For general power analysis, 0.5 (equal allocation) is more common.

Returns:

Ceiling of the required sample size.

Return type:

int
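The documentation above does not give the exact formula, so the sketch below uses a common unpooled two-proportion approximation as an assumption: N = (z_α/2 + z_β)² · [p_c(1-p_c)/r + p_t(1-p_t)/(1-r)] / (p_t - p_c)², where r is `control_ratio` and p_t = p_c · (1 + mde). The function name and input validation are hypothetical, and results may differ from `required_sample_size`.

```python
import math
from statistics import NormalDist


def required_sample_size_sketch(
    baseline_rate: float,
    mde: float = 0.05,
    alpha: float = 0.05,
    power: float = 0.8,
    control_ratio: float = 0.02,
) -> int:
    """Total impressions (test + control) under an unpooled z-test approximation."""
    if not (0.0 < control_ratio < 1.0) or mde <= 0.0 or not (0.0 < baseline_rate < 1.0):
        raise ValueError("need 0 < control_ratio < 1, mde > 0, 0 < baseline_rate < 1")
    p_c = baseline_rate
    p_t = baseline_rate * (1.0 + mde)            # mde is a relative lift
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    # Variance of the rate difference per unit of total traffic.
    variance_term = p_c * (1.0 - p_c) / control_ratio + p_t * (1.0 - p_t) / (1.0 - control_ratio)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_term / (p_t - p_c) ** 2)


n_skewed = required_sample_size_sketch(0.05)                    # 2 % control
n_equal = required_sample_size_sketch(0.05, control_ratio=0.5)  # 50/50 split
```

With a 5 % baseline and a 5 % relative MDE, the 2 % control split needs more than ten times the total traffic of an equal split in this approximation, because the small control group dominates the variance.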