pdstools.decision_analyzer._scoring
===================================

.. py:module:: pdstools.decision_analyzer._scoring

.. autoapi-nested-parse::

   Scoring/ranking/sensitivity namespace for :class:`DecisionAnalyzer`.

   Methods in this module are exposed via ``da.scoring.``. They cover
   re-ranking the priority formula, win/loss analysis, sensitivity of the
   PVCL components, lever search, and quantile-based thresholding.


Classes
-------

.. autoapisummary::

   pdstools.decision_analyzer._scoring.Scoring


Module Contents
---------------

.. py:class:: Scoring(da: pdstools.decision_analyzer.DecisionAnalyzer.DecisionAnalyzer)

   Scoring, ranking and lever-analysis queries.

   Accessed via :attr:`DecisionAnalyzer.scoring`.


   .. py:attribute:: da


   .. py:method:: re_rank(additional_filters: polars.Expr | list[polars.Expr] | None = None, overrides: list[polars.Expr] | None = None) -> polars.LazyFrame

      Recalculate priority and rank for all PVCL component combinations.

      Computes five alternative priority scores by selectively dropping one
      component at a time (Propensity, Value, Context Weight, Levers) and
      ranks actions within each interaction for each variant. This is the
      foundation for sensitivity analysis.

      :param additional_filters: Filters applied to the sample before ranking.
      :type additional_filters: pl.Expr or list of pl.Expr, optional
      :param overrides: Column override expressions applied before priority
         calculation (e.g. to simulate lever adjustments).
      :type overrides: list of pl.Expr, optional

      :returns: Sample data augmented with ``prio_*`` and ``rank_*`` columns
         for each PVCL variant.
      :rtype: pl.LazyFrame


   .. py:method:: get_selected_group_rank_boundaries(group_filter: polars.Expr | list[polars.Expr], additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Compute selected-group rank boundaries per interaction.

      For each interaction where the selected comparison group is present,
      returns the best (lowest) and worst (highest) rank observed for the
      selected rows in arbitration-relevant stages.


   .. py:method:: _remaining_at_stage(stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Return sample rows remaining at *stage*.

      Uses the ``aggregate_remaining_per_stage`` logic: rows whose stage
      order is >= the selected stage are "remaining" there. If *stage* is
      None, falls back to rows with non-null Priority.


   .. py:method:: get_sensitivity(win_rank: int = 1, group_filter: polars.Expr | list[polars.Expr] | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Global or local sensitivity of the prioritization factors.

      :param win_rank: Maximum rank to be considered a winner.
      :type win_rank: int
      :param group_filter: Selected offers, only used in local sensitivity
         analysis. When ``None`` (global), results are cached by ``win_rank``.
      :type group_filter: pl.Expr, optional
      :param additional_filters: Extra filters applied to the sample before
         re-ranking (e.g. channel filter). When set, caching is bypassed.
      :type additional_filters: pl.Expr or list[pl.Expr], optional

      :rtype: pl.LazyFrame


   .. py:method:: get_thresholding_data(fld: str, quantile_range=range(10, 100, 10)) -> polars.DataFrame

      Quantile-based thresholding analysis at Arbitration.

      Computes counts and threshold values at each quantile for the given
      field (*fld*). Results are cached per ``(fld, quantile_range)``.

      :param fld: Column name to compute quantiles for (e.g. ``"Propensity"``).
      :type fld: str
      :param quantile_range: Percentile breakpoints to compute.
      :type quantile_range: range, default ``range(10, 100, 10)``

      :returns: Long-format table with columns ``Decile``, ``Count``,
         ``Threshold``, and the stage-level column.
      :rtype: pl.DataFrame


   .. py:method:: priority_component_distribution(component: str, granularity: str, stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Data for a single component's distribution, grouped by granularity.

      :param component: Column name of the component to analyze.
      :type component: str
      :param granularity: Column to group by (e.g. "Issue", "Group", "Action").
      :type granularity: str
      :param stage: Filter to actions remaining at this stage. If None, uses
         all rows with non-null Priority.
      :type stage: str, optional
      :param additional_filters: Extra filters applied to the sample (e.g.
         channel filter).
      :type additional_filters: pl.Expr or list[pl.Expr], optional


   .. py:method:: all_components_distribution(granularity: str, stage: str | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Data for the overview panel: all prioritization components at once.

      :param granularity: Column to group by.
      :type granularity: str
      :param stage: Filter to actions remaining at this stage.
      :type stage: str, optional
      :param additional_filters: Extra filters applied to the sample (e.g.
         channel filter).
      :type additional_filters: pl.Expr or list[pl.Expr], optional


   .. py:method:: get_win_loss_distribution_data(level: str | list[str], win_rank: int | None = None, additional_filters: polars.Expr | list[polars.Expr] | None = None, group_filter: polars.Expr | list[polars.Expr] | None = None, status: Literal['Wins', 'Losses'] | None = None, top_k: int | None = None) -> polars.LazyFrame

      Win/loss distribution at a given scope level.

      Operates in two modes depending on whether *group_filter* is provided:

      * **Without group_filter** (rank-based): uses pre-aggregated data and a
        fixed *win_rank* threshold to split wins/losses.
      * **With group_filter** (group-based): uses sample data and
        per-interaction rank boundaries of the selected group to identify
        actions it beats (*Wins*) or loses to (*Losses*).


      :param level: Column(s) to group the distribution by (e.g. ``"Action"``).
      :type level: str or list of str
      :param win_rank: Fixed rank threshold (required when *group_filter* is
         ``None``).
      :type win_rank: int, optional
      :param additional_filters: Extra filters applied before aggregation.
      :type additional_filters: pl.Expr or list of pl.Expr, optional
      :param group_filter: Filter defining the comparison group.
      :type group_filter: pl.Expr or list of pl.Expr, optional
      :param status: Required when *group_filter* is provided.
      :type status: {"Wins", "Losses"}, optional
      :param top_k: Limit the number of rows returned (group_filter mode only).
      :type top_k: int, optional

      :rtype: pl.LazyFrame


   .. py:method:: _winning_from(interactions: polars.LazyFrame, win_rank: int, groupby_cols: list[str], top_k: int = 20, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Actions beaten by the comparison group in its winning interactions.

      :param interactions: Interaction IDs where the comparison group wins.
      :param win_rank: Rank threshold used to define "winning".
      :param groupby_cols: Columns to group by (e.g.
         ``["Issue", "Group", "Action"]``).
      :param top_k: Return only the top-k entries.
      :param additional_filters: Optional extra filters (e.g. channel filter).


   .. py:method:: _losing_to(interactions: polars.LazyFrame, win_rank: int, groupby_cols: list[str], top_k: int = 20, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Actions that beat the comparison group in its losing interactions.

      :param interactions: Interaction IDs where the comparison group loses.
      :param win_rank: Rank threshold used to define "winning".
      :param groupby_cols: Columns to group by (e.g.
         ``["Issue", "Group", "Action"]``).
      :param top_k: Return only the top-k entries.
      :param additional_filters: Optional extra filters (e.g. channel filter).


   .. py:method:: get_winning_or_losing_interactions(group_filter: polars.Expr | list[polars.Expr], win: bool, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> polars.LazyFrame

      Interaction IDs where the comparison group wins or loses.

      :param group_filter: Filter defining the comparison group.
      :type group_filter: pl.Expr or list of pl.Expr
      :param win: If True, return interactions where the group wins (there
         are lower-ranked actions outside the group). If False, return
         interactions where the group loses (there are higher-ranked actions
         outside the group).
      :type win: bool
      :param additional_filters: Extra filters (e.g. channel filter).
      :type additional_filters: pl.Expr or list of pl.Expr, optional

      :returns: Single-column frame of unique ``Interaction ID`` values.
      :rtype: pl.LazyFrame


   .. py:method:: get_win_loss_counts(group_filter: polars.Expr | list[polars.Expr], win_rank: int = 1, additional_filters: polars.Expr | list[polars.Expr] | None = None) -> dict[str, int]

      Count wins and losses for the comparison group at a given rank threshold.

      A **win** is an interaction where at least one member of the comparison
      group achieves a rank of *win_rank* or better (lower). A **loss** is any
      interaction where the group participates but none rank that high.

      :param group_filter: Filter expression(s) defining the comparison group.
      :param win_rank: Rank threshold. The group "wins" when its best rank
         <= win_rank.
      :param additional_filters: Optional extra filters (e.g. channel/direction).

      :rtype: dict with keys ``"wins"``, ``"losses"``, ``"total"``.


   .. py:method:: get_win_loss_distributions(interactions_win: polars.LazyFrame, interactions_loss: polars.LazyFrame, groupby_cols: list[str], top_k: int, additional_filters: polars.Expr | list[polars.Expr] | None = None, group_filter: polars.Expr | list[polars.Expr] | None = None, win_rank: int | None = None) -> tuple[polars.LazyFrame, polars.LazyFrame]

      Distribution of actions the comparison group wins from and loses to.

      Computes two aggregated distributions in a single pass over the
      stage-filtered data:

      * **winning_from**: actions ranked below the comparison group (i.e.
        actions it beats).
      * **losing_to**: actions ranked above the comparison group (i.e.
        actions that beat it).

      Either *group_filter* or *win_rank* must be provided to define the rank
      boundary. When *group_filter* is given the boundary is the
      per-interaction best/worst rank of the selected group. When only
      *win_rank* is given a fixed rank threshold is used instead.

      :param interactions_win: Interaction IDs where the comparison group wins
         (from ``get_winning_or_losing_interactions(win=True)``).
      :param interactions_loss: Interaction IDs where the comparison group
         loses (from ``get_winning_or_losing_interactions(win=False)``).
      :param groupby_cols: Columns to group by (e.g. ``["Action"]``).
      :param top_k: Return only the top-k entries per distribution.
      :param additional_filters: Optional extra filters (e.g. channel/direction).
      :param group_filter: Filter expression(s) defining the comparison group.
      :param win_rank: Fixed rank threshold. Required when *group_filter* is
         ``None``.

      :returns: Tuple of (winning_from, losing_to), both ``pl.LazyFrame`` with
         columns from *groupby_cols* plus ``Decisions``.


   .. py:method:: get_win_distribution_data(lever_condition: polars.Expr, lever_value: float | None = None, all_interactions: int | None = None) -> polars.DataFrame

      Calculate win distribution data for business lever analysis.

      This method generates distribution data showing how actions perform in
      arbitration decisions, both in baseline conditions and optionally with
      lever adjustments applied.

      :param lever_condition: Polars expression defining which actions to
         apply the lever to. Example: ``pl.col("Action") == "SpecificAction"``
         or ``(pl.col("Issue") == "Service") & (pl.col("Group") == "Cards")``
      :type lever_condition: pl.Expr
      :param lever_value: The lever multiplier value to apply to selected
         actions. If None, returns baseline distribution only. If provided,
         returns both original and lever-adjusted win counts.
      :type lever_value: float, optional
      :param all_interactions: Total number of interactions to calculate
         "no winner" count. If provided, enables calculation of interactions
         without any winner. If None, "no winner" data is not calculated.
      :type all_interactions: int, optional

      :returns: DataFrame containing win distribution with columns:

         - Issue, Group, Action: Action identifiers
         - original_win_count: Number of rank-1 wins in baseline scenario
         - new_win_count: Number of rank-1 wins after lever adjustment (only
           if lever_value provided)
         - n_decisions_survived_to_arbitration: Number of arbitration
           decisions the action participated in
         - selected_action: "Selected" for actions matching lever_condition,
           "Rest" for others
         - no_winner_count: Number of interactions without any winner (only
           if all_interactions provided)
      :rtype: pl.DataFrame

      .. rubric:: Notes

      - Only includes actions that survive to arbitration stage
      - Win counts represent rank-1 (first place) finishes in arbitration
        decisions
      - This is a zero-sum analysis: boosting selected actions suppresses
        others
      - Results are sorted by win count (new_win_count if available, else
        original_win_count)
      - When all_interactions is provided, "no winner" represents
        interactions without any rank-1 winner

      .. rubric:: Examples

      Get baseline distribution for a specific action:

      >>> lever_cond = pl.col("Action") == "MyAction"
      >>> baseline = decision_analyzer.get_win_distribution_data(lever_cond)

      Get distribution with 2x lever applied to service actions:

      >>> lever_cond = pl.col("Issue") == "Service"
      >>> with_lever = decision_analyzer.get_win_distribution_data(lever_cond, 2.0)

      Get distribution with no winner count:

      >>> total_interactions = 10000
      >>> with_no_winner = decision_analyzer.get_win_distribution_data(lever_cond, 2.0, total_interactions)


   .. py:method:: find_lever_value(lever_condition: polars.Expr, target_win_percentage: float, win_rank: int = 1, low: float = 0, high: float = 100, precision: float = 0.01, ranking_stages: list[str] | None = None) -> float

      Binary search algorithm to find lever value needed to achieve a desired
      win percentage.

      :param lever_condition: Polars expression that defines which actions
         should receive the lever
      :type lever_condition: pl.Expr
      :param target_win_percentage: The desired win percentage (0-100)
      :type target_win_percentage: float
      :param win_rank: Consider actions winning if they rank <= this value
      :type win_rank: int, default 1
      :param low: Lower bound for lever search range
      :type low: float, default 0
      :param high: Upper bound for lever search range
      :type high: float, default 100
      :param precision: Search precision - smaller values give more accurate
         results
      :type precision: float, default 0.01
      :param ranking_stages: List of stages to include in analysis. Defaults
         to ``["Arbitration"]``
      :type ranking_stages: list[str], optional

      :returns: The lever value needed to achieve the target win percentage
      :rtype: float

      :raises ValueError: If the target win percentage cannot be achieved
         within the search range
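
The binary search described for ``find_lever_value`` can be illustrated with a small standalone sketch. This is not the library's implementation: ``find_lever_sketch``, ``toy_win_percentage`` and the toy ``interactions`` data are hypothetical names introduced here, and the sketch assumes the win percentage never decreases as the lever grows. The parameters ``low``, ``high`` and ``precision`` mirror the documented ones.

```python
from typing import Callable


def find_lever_sketch(
    win_percentage: Callable[[float], float],
    target: float,
    low: float = 0.0,
    high: float = 100.0,
    precision: float = 0.01,
) -> float:
    """Bisect for a lever whose win percentage reaches ``target``.

    Assumption: ``win_percentage`` is non-decreasing in the lever value.
    """
    # If even the largest allowed lever falls short, the target is unreachable.
    if win_percentage(high) < target:
        raise ValueError("target win percentage not reachable within [low, high]")
    while high - low > precision:
        mid = (low + high) / 2
        if win_percentage(mid) >= target:
            high = mid  # mid is sufficient, try a smaller lever
        else:
            low = mid  # mid is too weak, a larger lever is needed
    return high


# Hypothetical toy data: per interaction, the priority of the selected
# action and the best priority among its competitors.
interactions = [(10, 30), (20, 30), (5, 40), (25, 20)]


def toy_win_percentage(lever: float) -> float:
    """Share of interactions (0-100) won when the lever multiplies priority."""
    wins = sum(1 for own, rival in interactions if own * lever > rival)
    return 100.0 * wins / len(interactions)


lever = find_lever_sketch(toy_win_percentage, target=75.0)
print(f"lever needed for a 75% win rate: {lever:.2f}")
```

In this toy setup the selected action needs a lever just above 3 to win three of the four interactions, so the search converges to a value within ``precision`` of 3. Narrowing ``high`` below that point makes the sketch raise ``ValueError``, analogous to the documented behaviour when the target cannot be reached within the search range.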