pdstools.decision_analyzer._scoring

Scoring/ranking/sensitivity namespace for DecisionAnalyzer.

Methods in this module are exposed via da.scoring.<method>. They cover re-ranking with variants of the priority formula, win/loss analysis, sensitivity of the PVCL components, lever search, and quantile-based thresholding.

Classes

Scoring

Scoring, ranking and lever-analysis queries.

Module Contents

class Scoring(da: pdstools.decision_analyzer.DecisionAnalyzer.DecisionAnalyzer)

Scoring, ranking and lever-analysis queries.

Accessed via DecisionAnalyzer.scoring.

Parameters:

da (pdstools.decision_analyzer.DecisionAnalyzer.DecisionAnalyzer)

da
re_rank(additional_filters: polars.Expr | list[polars.Expr] | None = None, overrides: list[polars.Expr] | None = None) polars.LazyFrame

Recalculate priority and rank for all PVCL component combinations.

Computes five priority scores: the full PVCL formula plus four variants that each drop one component (Propensity, Value, Context Weight, Levers). Actions are then ranked within each interaction for each variant. This is the foundation for sensitivity analysis.

Parameters:
  • additional_filters (pl.Expr or list of pl.Expr, optional) – Filters applied to the sample before ranking.

  • overrides (list of pl.Expr, optional) – Column override expressions applied before priority calculation (e.g. to simulate lever adjustments).

Returns:

Sample data augmented with prio_* and rank_* columns for each PVCL variant.

Return type:

pl.LazyFrame
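
The variant computation described above can be illustrated with a minimal pure-Python sketch. It assumes that priority is the product of the four components; the names `VARIANTS` and `re_rank_sketch`, and the exact `prio_*`/`rank_*` suffixes, are hypothetical choices for illustration. The actual method works on a Polars LazyFrame and may differ in detail.

```python
from math import prod

COMPONENTS = ["Propensity", "Value", "Context Weight", "Levers"]

# Assumed variants: the full formula plus one variant per dropped component.
VARIANTS = {
    "PVCL": COMPONENTS,
    "VCL": ["Value", "Context Weight", "Levers"],
    "PCL": ["Propensity", "Context Weight", "Levers"],
    "PVL": ["Propensity", "Value", "Levers"],
    "PVC": ["Propensity", "Value", "Context Weight"],
}


def re_rank_sketch(rows):
    """Add prio_<variant> and rank_<variant> keys, ranking within each interaction."""
    out = [dict(r) for r in rows]
    by_interaction = {}
    for r in out:
        by_interaction.setdefault(r["Interaction ID"], []).append(r)
    for name, comps in VARIANTS.items():
        for r in out:
            r[f"prio_{name}"] = prod(r[c] for c in comps)
        for group in by_interaction.values():
            # Rank 1 is the highest priority within the same interaction.
            ordered = sorted(group, key=lambda r: r[f"prio_{name}"], reverse=True)
            for rank, r in enumerate(ordered, start=1):
                r[f"rank_{name}"] = rank
    return out


rows = [
    {"Interaction ID": 1, "Action": "A", "Propensity": 0.5, "Value": 2.0,
     "Context Weight": 1.0, "Levers": 1.0},
    {"Interaction ID": 1, "Action": "B", "Propensity": 0.1, "Value": 20.0,
     "Context Weight": 1.0, "Levers": 1.0},
]
ranked = re_rank_sketch(rows)
```

In this toy data, action A ranks first only in the variant that drops Value, which suggests that B's lead under the full formula is driven by its Value.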

get_selected_group_rank_boundaries(group_filter: polars.Expr | list[polars.Expr], additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Compute selected-group rank boundaries per interaction.

For each interaction where the selected comparison group is present, returns the best (lowest) and worst (highest) rank observed for the selected rows in arbitration-relevant stages.

Parameters:
  • group_filter (polars.Expr | list[polars.Expr])

  • additional_filters (polars.Expr | list[polars.Expr] | None)

Return type:

polars.LazyFrame
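
As a rough illustration of the boundary computation, the following pure-Python sketch derives the best and worst rank of the selected rows per interaction. The function name `rank_boundaries_sketch`, the `Rank` key, and the use of a plain predicate in place of a Polars filter are assumptions made for the example.

```python
def rank_boundaries_sketch(rows, in_group):
    """Map each interaction to the (best, worst) rank among the selected rows."""
    bounds = {}
    for r in rows:
        if not in_group(r):
            continue
        iid = r["Interaction ID"]
        best, worst = bounds.get(iid, (r["Rank"], r["Rank"]))
        bounds[iid] = (min(best, r["Rank"]), max(worst, r["Rank"]))
    return bounds


rows = [
    {"Interaction ID": 1, "Group": "Cards", "Rank": 2},
    {"Interaction ID": 1, "Group": "Cards", "Rank": 4},
    {"Interaction ID": 1, "Group": "Loans", "Rank": 1},
    {"Interaction ID": 2, "Group": "Loans", "Rank": 1},
]
bounds = rank_boundaries_sketch(rows, lambda r: r["Group"] == "Cards")
```

Interaction 2 is absent from the result because the selected group does not appear in it.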

_remaining_at_stage(stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Return sample rows remaining at stage.

Uses the aggregate_remaining_per_stage logic: rows whose stage order is >= the selected stage are “remaining” there. If stage is None, falls back to rows with non-null Priority.

Parameters:
  • stage (str | None)

  • additional_filters (polars.Expr | list[polars.Expr] | None)

Return type:

polars.LazyFrame
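
The "remaining at stage" rule can be sketched in plain Python as follows. The stage names and their order in `STAGE_ORDER` are hypothetical, as are the `Stage` and `Priority` keys; the real method derives the stage order from the loaded data.

```python
# Hypothetical stage order, for illustration only.
STAGE_ORDER = {"Eligibility": 1, "Applicability": 2, "Suitability": 3, "Arbitration": 4}


def remaining_at_stage_sketch(rows, stage=None):
    """Keep rows whose stage order is >= the selected stage."""
    if stage is None:
        # Fallback: rows that carry a non-null Priority.
        return [r for r in rows if r.get("Priority") is not None]
    return [r for r in rows if STAGE_ORDER[r["Stage"]] >= STAGE_ORDER[stage]]


rows = [
    {"Action": "A", "Stage": "Eligibility", "Priority": None},
    {"Action": "B", "Stage": "Arbitration", "Priority": 0.4},
    {"Action": "C", "Stage": "Suitability", "Priority": None},
]
at_suitability = [r["Action"] for r in remaining_at_stage_sketch(rows, "Suitability")]
with_priority = [r["Action"] for r in remaining_at_stage_sketch(rows)]
```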

get_sensitivity(win_rank: int = 1, group_filter: polars.Expr | list[polars.Expr] | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Global or local sensitivity of the prioritization factors.

Parameters:
  • win_rank (int) – Maximum rank to be considered a winner.

  • group_filter (pl.Expr or list[pl.Expr], optional) – Filter selecting the offers to analyze; used only in local sensitivity analysis. When None (global), results are cached by win_rank.

  • additional_filters (pl.Expr or list[pl.Expr], optional) – Extra filters applied to the sample before re-ranking (e.g. channel filter). When set, caching is bypassed.

Return type:

pl.LazyFrame

get_thresholding_data(fld: str, quantile_range=range(10, 100, 10)) polars.DataFrame

Quantile-based thresholding analysis at Arbitration.

Computes counts and threshold values at each quantile for the given field (fld). Results are cached per (fld, quantile_range).

Parameters:
  • fld (str) – Column name to compute quantiles for (e.g. "Propensity").

  • quantile_range (range, default range(10, 100, 10)) – Percentile breakpoints to compute.

Returns:

Long-format table with columns Decile, Count, Threshold, and the stage-level column.

Return type:

pl.DataFrame
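
A minimal sketch of quantile-based thresholding in pure Python is shown below. It makes two assumptions that the description above does not confirm: thresholds use the nearest-rank percentile definition, and Count is the number of values at or above the threshold. The function name `thresholding_sketch` is hypothetical.

```python
def thresholding_sketch(values, quantile_range=range(10, 100, 10)):
    """Return one row per percentile with its threshold value and a count."""
    ordered = sorted(values)
    n = len(ordered)
    table = []
    for pct in quantile_range:
        # Nearest-rank percentile, computed with integer ceiling division.
        idx = max(-(-pct * n // 100) - 1, 0)
        threshold = ordered[idx]
        # Assumption: count the values that meet or exceed the threshold.
        count = sum(v >= threshold for v in ordered)
        table.append({"Decile": pct, "Threshold": threshold, "Count": count})
    return table


propensities = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
table = thresholding_sketch(propensities)
```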

priority_component_distribution(component: str, granularity: str, stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Data for a single component’s distribution, grouped by granularity.

Parameters:
  • component (str) – Column name of the component to analyze.

  • granularity (str) – Column to group by (e.g. "Issue", "Group", "Action").

  • stage (str, optional) – Filter to actions remaining at this stage. If None, uses all rows with non-null Priority.

  • additional_filters (pl.Expr or list[pl.Expr], optional) – Extra filters applied to the sample (e.g. channel filter).

Return type:

polars.LazyFrame

all_components_distribution(granularity: str, stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Data for the overview panel: all prioritization components at once.

Parameters:
  • granularity (str) – Column to group by.

  • stage (str, optional) – Filter to actions remaining at this stage.

  • additional_filters (pl.Expr or list[pl.Expr], optional) – Extra filters applied to the sample (e.g. channel filter).

Return type:

polars.LazyFrame

get_win_loss_distribution_data(level: str | list[str], win_rank: int | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None, group_filter: polars.Expr | list[polars.Expr] | None = None, status: Literal['Wins', 'Losses'] | None = None, top_k: int | None = None) polars.LazyFrame

Win/loss distribution at a given scope level.

Operates in two modes depending on whether group_filter is provided:

  • Without group_filter (rank-based): uses pre-aggregated data and a fixed win_rank threshold to split wins/losses.

  • With group_filter (group-based): uses sample data and per-interaction rank boundaries of the selected group to identify actions it beats (Wins) or loses to (Losses).

Parameters:
  • level (str or list of str) – Column(s) to group the distribution by (e.g. "Action").

  • win_rank (int, optional) – Fixed rank threshold (required when group_filter is None).

  • additional_filters (pl.Expr or list of pl.Expr, optional) – Extra filters applied before aggregation.

  • group_filter (pl.Expr or list of pl.Expr, optional) – Filter defining the comparison group.

  • status ({"Wins", "Losses"}, optional) – Required when group_filter is provided.

  • top_k (int, optional) – Limit the number of rows returned (group_filter mode only).

Return type:

pl.LazyFrame

_winning_from(interactions: polars.LazyFrame, win_rank: int, groupby_cols: list[str], top_k: int = 20, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Actions beaten by the comparison group in its winning interactions.

Parameters:
  • interactions (polars.LazyFrame) – Interaction IDs where the comparison group wins.

  • win_rank (int) – Rank threshold used to define “winning”.

  • groupby_cols (list[str]) – Columns to group by (e.g. ["Issue", "Group", "Action"]).

  • top_k (int) – Return only the top-k entries.

  • additional_filters (polars.Expr | list[polars.Expr] | None) – Optional extra filters (e.g. channel filter).

Return type:

polars.LazyFrame

_losing_to(interactions: polars.LazyFrame, win_rank: int, groupby_cols: list[str], top_k: int = 20, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Actions that beat the comparison group in its losing interactions.

Parameters:
  • interactions (polars.LazyFrame) – Interaction IDs where the comparison group loses.

  • win_rank (int) – Rank threshold used to define “winning”.

  • groupby_cols (list[str]) – Columns to group by (e.g. ["Issue", "Group", "Action"]).

  • top_k (int) – Return only the top-k entries.

  • additional_filters (polars.Expr | list[polars.Expr] | None) – Optional extra filters (e.g. channel filter).

Return type:

polars.LazyFrame

get_winning_or_losing_interactions(group_filter: polars.Expr | list[polars.Expr], win: bool, additional_filters: polars.Expr | list[polars.Expr] | None = None) polars.LazyFrame

Interaction IDs where the comparison group wins or loses.

Parameters:
  • group_filter (pl.Expr or list of pl.Expr) – Filter defining the comparison group.

  • win (bool) – If True, return interactions where the group wins (there are lower-ranked actions outside the group). If False, return interactions where the group loses (there are higher-ranked actions outside the group).

  • additional_filters (pl.Expr or list of pl.Expr, optional) – Extra filters (e.g. channel filter).

Returns:

Single-column frame of unique Interaction ID values.

Return type:

pl.LazyFrame

get_win_loss_counts(group_filter: polars.Expr | list[polars.Expr], win_rank: int = 1, additional_filters: polars.Expr | list[polars.Expr] | None = None) dict[str, int]

Count wins and losses for the comparison group at a given rank threshold.

A win is an interaction where at least one member of the comparison group achieves a rank of win_rank or better (lower). A loss is any interaction where the group participates but none rank that high.

Parameters:
  • group_filter (polars.Expr | list[polars.Expr]) – Filter expression(s) defining the comparison group.

  • win_rank (int) – Rank threshold. The group “wins” when its best rank <= win_rank.

  • additional_filters (polars.Expr | list[polars.Expr] | None) – Optional extra filters (e.g. channel/direction).

Return type:

dict with keys "wins", "losses", "total".
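
The win/loss definition above can be sketched in pure Python. The function name `win_loss_counts_sketch`, the `Rank` key, and the predicate standing in for the Polars filter are assumptions made for the example.

```python
def win_loss_counts_sketch(rows, in_group, win_rank=1):
    """Count interactions where the group's best rank is <= win_rank."""
    best_rank = {}
    for r in rows:
        if in_group(r):
            iid = r["Interaction ID"]
            best_rank[iid] = min(best_rank.get(iid, r["Rank"]), r["Rank"])
    wins = sum(rank <= win_rank for rank in best_rank.values())
    total = len(best_rank)  # interactions in which the group participates
    return {"wins": wins, "losses": total - wins, "total": total}


rows = [
    {"Interaction ID": 1, "Group": "Cards", "Rank": 1},
    {"Interaction ID": 2, "Group": "Cards", "Rank": 3},
    {"Interaction ID": 2, "Group": "Loans", "Rank": 1},
    {"Interaction ID": 3, "Group": "Loans", "Rank": 1},
]
is_cards = lambda r: r["Group"] == "Cards"
top1 = win_loss_counts_sketch(rows, is_cards, win_rank=1)
top3 = win_loss_counts_sketch(rows, is_cards, win_rank=3)
```

Interaction 3 is not counted at all because the comparison group does not participate in it.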

get_win_loss_distributions(interactions_win: polars.LazyFrame, interactions_loss: polars.LazyFrame, groupby_cols: list[str], top_k: int, additional_filters: polars.Expr | list[polars.Expr] | None = None, group_filter: polars.Expr | list[polars.Expr] | None = None, win_rank: int | None = None) tuple[polars.LazyFrame, polars.LazyFrame]

Distribution of actions the comparison group wins from and loses to.

Computes two aggregated distributions in a single pass over the stage-filtered data:

  • winning_from: actions ranked below the comparison group (i.e. actions it beats).

  • losing_to: actions ranked above the comparison group (i.e. actions that beat it).

Either group_filter or win_rank must be provided to define the rank boundary. When group_filter is given the boundary is the per-interaction best/worst rank of the selected group. When only win_rank is given a fixed rank threshold is used instead.

Parameters:
  • interactions_win (polars.LazyFrame) – Interaction IDs where the comparison group wins (from get_winning_or_losing_interactions(win=True)).

  • interactions_loss (polars.LazyFrame) – Interaction IDs where the comparison group loses (from get_winning_or_losing_interactions(win=False)).

  • groupby_cols (list[str]) – Columns to group by (e.g. ["Action"]).

  • top_k (int) – Return only the top-k entries per distribution.

  • additional_filters (polars.Expr | list[polars.Expr] | None) – Optional extra filters (e.g. channel/direction).

  • group_filter (polars.Expr | list[polars.Expr] | None) – Filter expression(s) defining the comparison group.

  • win_rank (int | None) – Fixed rank threshold. Required when group_filter is None.

Returns:

tuple of (winning_from, losing_to), both pl.LazyFrame with columns from groupby_cols plus Decisions.

Return type:

tuple[polars.LazyFrame, polars.LazyFrame]
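
The group-based mode can be sketched in pure Python as follows. The sketch assumes that "ranked below" means a rank worse than the group's worst rank and "ranked above" means a rank better than the group's best rank; the reference above does not spell out which boundary applies. The function name `win_loss_distributions_sketch` and the `Rank` key are hypothetical.

```python
from collections import Counter


def win_loss_distributions_sketch(rows, in_group, win_ids, loss_ids, top_k):
    """Count actions the group beats and actions that beat it, per action."""
    # Per-interaction (best, worst) rank of the comparison group.
    bounds = {}
    for r in rows:
        if in_group(r):
            iid = r["Interaction ID"]
            best, worst = bounds.get(iid, (r["Rank"], r["Rank"]))
            bounds[iid] = (min(best, r["Rank"]), max(worst, r["Rank"]))

    winning_from, losing_to = Counter(), Counter()
    for r in rows:
        iid = r["Interaction ID"]
        if in_group(r) or iid not in bounds:
            continue
        best, worst = bounds[iid]
        if iid in win_ids and r["Rank"] > worst:  # assumed: below the whole group
            winning_from[r["Action"]] += 1
        if iid in loss_ids and r["Rank"] < best:  # assumed: above the whole group
            losing_to[r["Action"]] += 1
    return winning_from.most_common(top_k), losing_to.most_common(top_k)


rows = [
    {"Interaction ID": 1, "Action": "GoldCard", "Rank": 1},
    {"Interaction ID": 1, "Action": "HomeLoan", "Rank": 2},
    {"Interaction ID": 2, "Action": "HomeLoan", "Rank": 1},
    {"Interaction ID": 2, "Action": "GoldCard", "Rank": 2},
    {"Interaction ID": 2, "Action": "CarLoan", "Rank": 3},
]
is_gold = lambda r: r["Action"] == "GoldCard"
winning_from, losing_to = win_loss_distributions_sketch(
    rows, is_gold, win_ids={1, 2}, loss_ids={2}, top_k=5
)
```

Both counters are filled in one loop over the rows, mirroring the single-pass design described above.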

get_win_distribution_data(lever_condition: polars.Expr, lever_value: float | None = None, all_interactions: int | None = None) polars.DataFrame

Calculate win distribution data for business lever analysis.

This method generates distribution data showing how actions perform in arbitration decisions, both in baseline conditions and optionally with lever adjustments applied.

Parameters:
  • lever_condition (pl.Expr) – Polars expression defining which actions to apply the lever to. Example: pl.col("Action") == "SpecificAction" or (pl.col("Issue") == "Service") & (pl.col("Group") == "Cards")

  • lever_value (float, optional) – The lever multiplier value to apply to selected actions. If None, returns baseline distribution only. If provided, returns both original and lever-adjusted win counts.

  • all_interactions (int, optional) – Total number of interactions to calculate “no winner” count. If provided, enables calculation of interactions without any winner. If None, “no winner” data is not calculated.

Returns:

DataFrame containing the win distribution, with columns:

  • Issue, Group, Action: Action identifiers

  • original_win_count: Number of rank-1 wins in the baseline scenario

  • new_win_count: Number of rank-1 wins after lever adjustment (only if lever_value is provided)

  • n_decisions_survived_to_arbitration: Number of arbitration decisions the action participated in

  • selected_action: "Selected" for actions matching lever_condition, "Rest" for others

  • no_winner_count: Number of interactions without any winner (only if all_interactions is provided)

Return type:

pl.DataFrame

Notes

  • Only includes actions that survive to arbitration stage

  • Win counts represent rank-1 (first place) finishes in arbitration decisions

  • This is a zero-sum analysis: boosting selected actions suppresses others

  • Results are sorted by win count (new_win_count if available, else original_win_count)

  • When all_interactions is provided, “no winner” represents interactions without any rank-1 winner

Examples

Get baseline distribution for a specific action:

>>> lever_cond = pl.col("Action") == "MyAction"
>>> baseline = decision_analyzer.scoring.get_win_distribution_data(lever_cond)

Get distribution with 2x lever applied to service actions:

>>> lever_cond = pl.col("Issue") == "Service"
>>> with_lever = decision_analyzer.scoring.get_win_distribution_data(lever_cond, 2.0)

Get distribution with no winner count:

>>> total_interactions = 10000
>>> with_no_winner = decision_analyzer.scoring.get_win_distribution_data(lever_cond, 2.0, total_interactions)
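
To make the zero-sum behavior concrete, here is a small pure-Python sketch of the lever effect. It assumes the lever simply multiplies the priority of the selected actions before the rank-1 winner is chosen per interaction; the function name `win_counts_sketch` and the `Priority` key are hypothetical.

```python
from collections import Counter


def win_counts_sketch(rows, is_selected, lever_value=None):
    """Count rank-1 wins per action, optionally scaling selected priorities."""
    best = {}  # Interaction ID -> (priority, action) of the current winner
    for r in rows:
        prio = r["Priority"]
        if lever_value is not None and is_selected(r):
            prio *= lever_value  # assumed: lever acts as a plain multiplier
        iid = r["Interaction ID"]
        if iid not in best or prio > best[iid][0]:
            best[iid] = (prio, r["Action"])
    return Counter(action for _, action in best.values())


rows = [
    {"Interaction ID": 1, "Action": "GoldCard", "Priority": 0.30},
    {"Interaction ID": 1, "Action": "HomeLoan", "Priority": 0.50},
    {"Interaction ID": 2, "Action": "GoldCard", "Priority": 0.20},
    {"Interaction ID": 2, "Action": "HomeLoan", "Priority": 0.90},
]
is_gold = lambda r: r["Action"] == "GoldCard"
original = win_counts_sketch(rows, is_gold)
adjusted = win_counts_sketch(rows, is_gold, lever_value=2.0)
```

With a 2x lever, GoldCard takes interaction 1 from HomeLoan, while the total number of wins stays at two.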

find_lever_value(lever_condition: polars.Expr, target_win_percentage: float, win_rank: int = 1, low: float = 0, high: float = 100, precision: float = 0.01, ranking_stages: list[str] | None = None) float

Binary search for the lever value needed to achieve a desired win percentage.

Parameters:
  • lever_condition (pl.Expr) – Polars expression that defines which actions should receive the lever

  • target_win_percentage (float) – The desired win percentage (0-100)

  • win_rank (int, default 1) – Consider actions winning if they rank <= this value

  • low (float, default 0) – Lower bound for lever search range

  • high (float, default 100) – Upper bound for lever search range

  • precision (float, default 0.01) – Search precision - smaller values give more accurate results

  • ranking_stages (list[str], optional) – List of stages to include in analysis. Defaults to ["Arbitration"]

Returns:

The lever value needed to achieve the target win percentage

Return type:

float

Raises:

ValueError – If the target win percentage cannot be achieved within the search range
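
The search itself can be sketched as a plain bisection. This is a minimal illustration that assumes the win percentage is non-decreasing in the lever value; `find_lever_value_sketch` and the toy `win_percentage` function are hypothetical and stand in for the re-ranking the real method performs at each step.

```python
def find_lever_value_sketch(win_percentage, target, low=0.0, high=100.0, precision=0.01):
    """Bisect for the smallest lever at which win_percentage(lever) >= target."""
    if win_percentage(high) < target:
        raise ValueError("Target win percentage not reachable within the search range")
    while high - low > precision:
        mid = (low + high) / 2
        if win_percentage(mid) < target:
            low = mid   # lever too small, search the upper half
        else:
            high = mid  # target reached, try a smaller lever
    return high


# Toy data: the selected action competes against one rival in two interactions.
PRIORITIES = [(0.30, 0.50), (0.20, 0.90)]  # (selected, rival) per interaction


def win_percentage(lever):
    wins = sum(sel * lever > rival for sel, rival in PRIORITIES)
    return 100 * wins / len(PRIORITIES)


lever = find_lever_value_sketch(win_percentage, target=50)

try:
    find_lever_value_sketch(win_percentage, target=100, high=4.0)
    unreachable = False
except ValueError:
    unreachable = True
```

The first call converges near 1.67, the point where 0.30 × lever overtakes 0.50. The second call raises because a lever of at most 4 cannot win the second interaction.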