pdstools.impactanalyzer.statistics¶
Statistical calculations for Impact Analyzer experiments.
Confidence intervals, significance testing, and sample-size planning are integral to interpreting Impact Analyzer results — without them a reported lift cannot be distinguished from noise.
Formulas follow Pega’s server-side implementation and have been validated for PDC parity. Scenario Planner Actuals validation is pending.
Key implementation details¶
Pega stores standard errors (SE), not confidence intervals. The z-score (1.96) is applied only at the significance / display level.
The lift CI uses the delta method for the ratio estimator:
SE(lift) = (1 / ctrl) · √(SE_t² + (test / ctrl)² · SE_c²).
For value metrics Pega computes variance as p(1-p) · AV² (Bernoulli scaled by action value), not a Poisson approximation.
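The delta-method formula above follows from a first-order expansion of the ratio. The derivation below is a sketch that assumes the test and control estimates are independent (so no covariance term appears) and that the control rate is well away from zero; T and C are shorthand introduced here for the test and control rates.

```latex
\begin{aligned}
L &= \frac{T}{C} - 1, \qquad
\frac{\partial L}{\partial T} = \frac{1}{C}, \qquad
\frac{\partial L}{\partial C} = -\frac{T}{C^{2}} \\
\operatorname{Var}(L) &\approx \frac{1}{C^{2}}\,SE_t^{2} + \frac{T^{2}}{C^{4}}\,SE_c^{2}
  = \frac{1}{C^{2}}\left(SE_t^{2} + \left(\frac{T}{C}\right)^{2} SE_c^{2}\right) \\
SE(L) &\approx \frac{1}{C}\sqrt{SE_t^{2} + \left(\frac{T}{C}\right)^{2} SE_c^{2}}
\end{aligned}
```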
Attributes¶
Classes¶
Formula | Structured representation of a statistical formula.
LiftResult | Result of a lift calculation with standard error.
Functions¶
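accept_rate | Accept / click-through rate.
binomial_se | Standard error of the accept rate: √(p(1-p) / n).
binomial_ci | Binomial CI half-width: z · √(p(1-p) / n).
value_variance | Per-observation Bernoulli variance of the value metric.
value_se | SE of value per impression: √(Var / n).
calculate_lift | Relative lift: (test - control) / control.
lift_pl | Polars expression for relative lift between two columns.
lift_se | Delta-method standard error for the lift ratio test / control - 1.
is_significant | True when the CI does not cross zero.
calculate_engagement_lift | Engagement lift with delta-method SE.
calculate_value_lift | Value-per-impression lift with delta-method CI.
required_sample_size | Required total impressions for a two-proportion z-test.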
Module Contents¶
- class Formula¶
Structured representation of a statistical formula.
- latex¶
Raw LaTeX expression (no placeholder substitution — the UI renders the symbolic form alongside a separate substitution block showing the numeric values).
- Type:
- class LiftResult¶
Result of a lift calculation with standard error.
- se¶
Delta-method standard error for lift. This is the full-precision SE without any z-multiplier.
- Type:
- significant¶
True when the CI does not cross zero. The check uses lift ± z * se where z = 1.96 (95 % level) after rounding se to 4 decimal places, matching Pega’s Math.round(error * 10000.0) / 10000.0.
- Type:
Notes
Pega stores standard errors, not confidence intervals. The se field is the raw SE. Call ci_95() to obtain the 95 % CI half-width (Z_95 * se).
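The rounding step in the significance check is easy to get subtly wrong in Python, because the built-in round() rounds ties to even while Java's Math.round rounds ties upward. The following is a minimal sketch of that check, not the library's code: the helper names are hypothetical, and the strict inequality at the boundary is an assumption.

```python
import math

Z_95 = 1.96  # z-score for a two-sided 95 % interval


def round_se_4dp(se: float) -> float:
    """Round to 4 decimals like Java's Math.round(se * 10000.0) / 10000.0.

    Math.round rounds ties upward, so floor(x + 0.5) is used here
    instead of Python's round(), which rounds ties to even.
    """
    return math.floor(se * 10000.0 + 0.5) / 10000.0


def ci_95_half_width(se: float) -> float:
    """95 % CI half-width computed from the full-precision SE."""
    return Z_95 * se


def significant_sketch(lift: float, se: float, z: float = Z_95) -> bool:
    """True when lift ± z * rounded_se excludes zero (strict bounds assumed)."""
    half = z * round_se_4dp(se)
    return lift - half > 0.0 or lift + half < 0.0


print(significant_sketch(0.10, 0.04))  # interval lies entirely above zero
print(significant_sketch(0.05, 0.04))  # interval straddles zero
```

Note that the half-width for display is taken from the unrounded SE, whereas the significance flag uses the rounded one, as described above.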
- accept_rate(accepts: int, impressions: int) float¶
Accept / click-through rate.
In Pega Accept = Accepted + Clicked (both count as positive outcomes).
- binomial_se(accepts: int, impressions: int) float¶
Standard error of the accept rate: √(p(1-p) / n).
This matches what Pega stores as TestAcceptRateCI / ControlAcceptRateCI in the ConfidenceIntervalCalculation sheet. Despite the column name, it is an SE, not a CI.
- Parameters:
- Returns:
√(p(1-p) / n), or 0.0 when impressions ≤ 0 or the rate is 0 or 1.
- Return type:
Notes
Uses the Wald (normal-approximation) formula. For extreme p (close to 0 or 1) or small n this can under-cover; Wilson or Clopper-Pearson intervals are more robust alternatives.
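To make the note about under-coverage concrete, the sketch below computes the Wald SE with the guard cases described above and, for contrast, a Wilson score interval. Both function names are hypothetical; the Wilson interval is the standard textbook formula and is not claimed to be part of this module.

```python
import math


def binomial_se_sketch(accepts: int, impressions: int) -> float:
    """Wald standard error sqrt(p(1-p)/n), with 0.0 for the degenerate cases."""
    if impressions <= 0:
        return 0.0
    p = accepts / impressions
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return math.sqrt(p * (1.0 - p) / impressions)


def wilson_interval(accepts: int, impressions: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval (lower, upper) for a binomial proportion."""
    n = impressions
    p = accepts / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))
    return center - half, center + half


# Small sample with an extreme rate: 1 accept in 20 impressions.
p = 1 / 20
wald_half = 1.96 * binomial_se_sketch(1, 20)
print("Wald:  ", (p - wald_half, p + wald_half))  # lower bound falls below 0
print("Wilson:", wilson_interval(1, 20))          # stays inside [0, 1]
```

With only 20 impressions the Wald interval extends below zero, which is impossible for a rate; the Wilson interval does not.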
- binomial_ci(accepts: int, impressions: int, z: float = Z_95) float¶
Binomial CI half-width: z · √(p(1-p) / n).
Returns 0.0 when impressions ≤ 0 or the rate is 0 or 1.
- value_variance(accepts: int, impressions: int, action_value: float) float¶
Per-observation Bernoulli variance of the value metric.
Pega computes p(1-p) · AV². Each impression is worth either action_value (with probability p) or 0.
This matches TestVariance / ControlVariance in the ConfidenceIntervalCalculation sheet.
- value_se(accepts: int, impressions: int, action_value: float) float¶
SE of value per impression: √(Var / n).
Matches Pega’s TestInterval / ControlInterval.
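The two value formulas chain together: the per-observation variance p(1-p) · AV² is divided by n and square-rooted to give the SE. A minimal sketch follows, with hypothetical function names; returning 0.0 for non-positive impressions is an assumption made here by analogy with the rate functions.

```python
import math


def value_variance_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """Bernoulli variance of one impression worth action_value or 0."""
    if impressions <= 0:  # assumed guard, mirroring the rate functions
        return 0.0
    p = accepts / impressions
    return p * (1.0 - p) * action_value ** 2


def value_se_sketch(accepts: int, impressions: int, action_value: float) -> float:
    """Standard error of value per impression: sqrt(Var / n)."""
    if impressions <= 0:
        return 0.0
    var = value_variance_sketch(accepts, impressions, action_value)
    return math.sqrt(var / impressions)


# 50 accepts in 1000 impressions, each accept worth 10.0
print(value_variance_sketch(50, 1000, 10.0))
print(value_se_sketch(50, 1000, 10.0))
```

Because the variance scales with AV², the value SE is simply the accept-rate SE multiplied by the action value.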
- calculate_lift(test: float, control: float) float¶
Relative lift: (test - control) / control.
Returns 0.0 when control ≤ 0.
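The guard on a non-positive control matters in practice, since a control group with no accepts would otherwise raise a division error. A small sketch of the documented behavior, using a hypothetical function name:

```python
def calculate_lift_sketch(test: float, control: float) -> float:
    """Relative lift (test - control) / control, or 0.0 when control <= 0."""
    if control <= 0:
        return 0.0
    return (test - control) / control


print(calculate_lift_sketch(0.06, 0.05))  # test rate 6 %, control rate 5 %
print(calculate_lift_sketch(0.06, 0.0))   # empty control falls back to 0.0
```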
- lift_pl(test_col: str, control_col: str) polars.Expr¶
Polars expression for relative lift between two columns.
Intended for pl.LazyFrame.with_columns() so the formula is defined once.
- lift_se(test: float, control: float, se_test: float, se_control: float) float¶
Delta-method standard error for the lift ratio test / control - 1.
Formula:
(1 / control) · √(se_test² + (test / control)² · se_control²)
Important
Pass standard errors (no z-multiplier). Passing z-multiplied CI values will inflate the result by z.
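The warning about z-multiplied inputs can be checked numerically: the formula is linear in the input errors, so scaling both by z scales the output by z. The sketch below uses a hypothetical function name, and its handling of a non-positive control is an assumption.

```python
import math


def lift_se_sketch(test: float, control: float, se_test: float, se_control: float) -> float:
    """Delta-method SE of test / control - 1."""
    if control <= 0:  # assumed guard; not confirmed for the library function
        return 0.0
    ratio = test / control
    return math.sqrt(se_test ** 2 + ratio ** 2 * se_control ** 2) / control


se = lift_se_sketch(0.06, 0.05, se_test=0.002, se_control=0.01)
inflated = lift_se_sketch(0.06, 0.05, se_test=1.96 * 0.002, se_control=1.96 * 0.01)

print(se)
print(inflated / se)  # ratio equals z up to floating-point error
```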
- is_significant(lift: float, se: float, z: float = Z_95) bool¶
True when the CI does not cross zero.
Tests whether [lift - z·se, lift + z·se] excludes zero, i.e. the lift is statistically significant at the given confidence level. With the default z = 1.96 this is a 95 % two-sided test.
- calculate_engagement_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int) LiftResult¶
Engagement lift with delta-method SE.
This is the primary metric in the Impact Analyzer UI.
- Parameters:
- Returns:
Lift, SE, and significance for the engagement metric.
- Return type:
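An end-to-end sketch of the engagement calculation, going from raw counts to lift, SE, and significance, is shown below. It is an illustration under stated assumptions rather than the library's code: LiftSketch and the helper names are hypothetical, the 4-decimal rounding of the SE described for LiftResult.significant is omitted for brevity, and the fallback for an empty control is assumed.

```python
import math
from typing import NamedTuple

Z_95 = 1.96


class LiftSketch(NamedTuple):
    lift: float
    se: float
    significant: bool


def _rate_and_se(accepts: int, impressions: int) -> tuple[float, float]:
    """Accept rate and its Wald SE, with 0.0 SE for degenerate inputs."""
    if impressions <= 0:
        return 0.0, 0.0
    p = accepts / impressions
    if p <= 0.0 or p >= 1.0:
        return p, 0.0
    return p, math.sqrt(p * (1.0 - p) / impressions)


def engagement_lift_sketch(accepts_test: int, impr_test: int,
                           accepts_control: int, impr_control: int) -> LiftSketch:
    p_t, se_t = _rate_and_se(accepts_test, impr_test)
    p_c, se_c = _rate_and_se(accepts_control, impr_control)
    if p_c <= 0.0:  # assumed fallback when the control has no accepts
        return LiftSketch(0.0, 0.0, False)
    lift = (p_t - p_c) / p_c
    # Delta-method SE for the ratio p_t / p_c - 1
    se = math.sqrt(se_t ** 2 + (p_t / p_c) ** 2 * se_c ** 2) / p_c
    half = Z_95 * se
    return LiftSketch(lift, se, lift - half > 0.0 or lift + half < 0.0)


# 2 % control split: same 20 % lift, different sample sizes
small = engagement_lift_sketch(5880, 98_000, 100, 2_000)
large = engagement_lift_sketch(58_800, 980_000, 1_000, 20_000)
print(small)
print(large)
```

In this example the observed lift is identical in both runs; only the larger sample yields an interval narrow enough to exclude zero, because the small control group dominates the SE.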
- calculate_value_lift(accepts_test: int, impr_test: int, accepts_control: int, impr_control: int, action_value: float) LiftResult¶
Value-per-impression lift with delta-method CI.
Pega computes value as accept_rate × action_value with Bernoulli variance p(1-p) · AV².
- Parameters:
- Returns:
Lift, SE, and significance for the value metric.
- Return type:
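The value-lift calculation can be sketched by scaling both arms by the action value and reusing the delta-method formula. The function below is a hypothetical illustration that assumes positive impressions, a non-zero control rate, and a positive action_value; it returns only the lift and its SE.

```python
import math


def value_lift_sketch(accepts_test: int, impr_test: int,
                      accepts_control: int, impr_control: int,
                      action_value: float) -> tuple[float, float]:
    """Return (lift, se) for value per impression; inputs assumed non-degenerate."""
    p_t = accepts_test / impr_test
    p_c = accepts_control / impr_control
    # Value per impression in each arm
    v_t, v_c = p_t * action_value, p_c * action_value
    # SE of value per impression: sqrt(p(1-p) * AV^2 / n)
    se_t = math.sqrt(p_t * (1.0 - p_t) * action_value ** 2 / impr_test)
    se_c = math.sqrt(p_c * (1.0 - p_c) * action_value ** 2 / impr_control)
    lift = (v_t - v_c) / v_c
    se = math.sqrt(se_t ** 2 + (v_t / v_c) ** 2 * se_c ** 2) / v_c
    return lift, se


print(value_lift_sketch(5880, 98_000, 100, 2_000, action_value=10.0))
print(value_lift_sketch(5880, 98_000, 100, 2_000, action_value=25.0))
```

In this sketch a single action_value scales both arms, so it cancels from the relative lift and from its SE; the two calls print the same numbers up to floating-point error.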
- required_sample_size(baseline_rate: float, mde: float = 0.05, alpha: float = 0.05, power: float = 0.8, control_ratio: float = 0.02) int¶
Required total impressions for a two-proportion z-test.
- Parameters:
baseline_rate (float) – Expected control-group accept rate.
mde (float) – Minimum detectable effect (relative lift).
alpha (float) – Significance level.
power (float) – Statistical power.
control_ratio (float) – Fraction of traffic allocated to control (default 2 %). This default matches Pega Impact Analyzer’s typical configuration. For general power analysis, 0.5 (equal allocation) is more common.
- Returns:
Ceiling of the required sample size.
- Return type:
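To show how the control split drives the required volume, here is a sketch of a two-proportion z-test sample-size calculation with unequal allocation. It uses the common unpooled-variance formula, which is an assumption; the library's exact formula is not given here and may differ. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size_sketch(baseline_rate: float, mde: float = 0.05,
                                alpha: float = 0.05, power: float = 0.8,
                                control_ratio: float = 0.02) -> int:
    """Total impressions for a two-sided two-proportion z-test (unpooled variance)."""
    p_c = baseline_rate
    p_t = baseline_rate * (1.0 + mde)  # mde is a relative lift
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    # Variance of (p_t - p_c) per unit of total traffic, given the split
    variance = (p_c * (1.0 - p_c) / control_ratio
                + p_t * (1.0 - p_t) / (1.0 - control_ratio))
    n_total = (z_alpha + z_beta) ** 2 * variance / (p_t - p_c) ** 2
    return math.ceil(n_total)


n_2pct = required_sample_size_sketch(0.05)                      # 2 % control
n_equal = required_sample_size_sketch(0.05, control_ratio=0.5)  # 50/50 split
print(n_2pct, n_equal)
```

Under these assumptions a 2 % control split needs roughly an order of magnitude more total impressions than an equal split to detect the same relative lift, since the small control arm dominates the variance.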