pdstools.ih.Aggregates
Aggregation methods for Interaction History analysis.
Classes
Aggregates | Aggregation methods for Interaction History data.
Module Contents
- class Aggregates(ih: pdstools.ih.IH.IH)
Bases: pdstools.utils.namespaces.LazyNamespace
Aggregation methods for Interaction History data.
This class provides aggregation capabilities for summarizing customer interaction data. It is accessed through the aggregates attribute of an IH instance.
All aggregation methods support:
- Grouping by dimensions via the by parameter
- Time bucketing via the every parameter
- Data filtering via the query parameter
- Parameters:
ih (pdstools.ih.IH.IH)
See also
pdstools.ih.IH: Main analysis class.
pdstools.ih.Plots: Visualization methods.
Examples
>>> ih = IH.from_ds_export("interaction_history.zip")
>>> ih.aggregates.summary_success_rates(by="Channel").collect()
>>> ih.aggregates.summary_outcomes(every="1w").collect()
- ih
- summarize_by_interaction(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) → polars.LazyFrame
Summarize outcomes per interaction.
Groups data by interaction ID and determines the outcome for each interaction based on configured positive/negative outcome labels.
- Parameters:
by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.
every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.
query (QUERY, optional) – Polars expression to filter data before aggregation.
debug (bool, default False) – If True, include debug columns (Outcomes list).
- Returns:
Interaction-level data with columns:
InteractionID: Unique interaction identifier
Interaction_Outcome_<metric>: Boolean outcome per metric
Propensity: Propensity value from interaction
Plus any grouping columns
- Return type:
pl.LazyFrame
Notes
An interaction has a positive outcome (True) for a metric if any of its outcomes matches the positive labels. Otherwise, if any outcome matches the negative labels, the outcome is False. If neither matches, the outcome is null.
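The rule in this note can be sketched in plain Python. This is a minimal illustration, not the pdstools implementation: the helper name `interaction_outcome` and the label sets below are hypothetical, and the real positive/negative labels come from the configuration of the IH instance.

```python
from typing import Iterable, Optional, Set


def interaction_outcome(
    outcomes: Iterable[str],
    positive_labels: Set[str],
    negative_labels: Set[str],
) -> Optional[bool]:
    """Resolve one interaction's outcome for a single metric (hypothetical helper)."""
    seen = set(outcomes)
    if seen & positive_labels:
        return True  # any positive label wins, even if negatives are present
    if seen & negative_labels:
        return False  # no positive label, but at least one negative label
    return None  # neither kind of label observed: outcome is null


# Hypothetical labels for an engagement-style metric (assumed, not from pdstools)
positive = {"Clicked"}
negative = {"Impression"}

print(interaction_outcome(["Impression", "Clicked"], positive, negative))  # True
print(interaction_outcome(["Impression"], positive, negative))  # False
print(interaction_outcome(["Pending"], positive, negative))  # None
```

Checking the positive labels first is what makes an interaction with both an impression and a click count as a success rather than a failure.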
See also
summary_success_rates: Aggregated success rates.
Examples
>>> ih.aggregates.summarize_by_interaction(by="Channel").collect()
- summary_success_rates(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) → polars.LazyFrame
Calculate success rates with standard errors.
Aggregates interactions into success rates for each configured metric, with standard errors for statistical significance assessment.
- Parameters:
by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.
every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.
query (QUERY, optional) – Polars expression to filter data before aggregation.
debug (bool, default False) – If True, include debug columns.
- Returns:
Success rate summary with columns:
Positives_<metric>: Count of positive outcomes
Negatives_<metric>: Count of negative outcomes
Interactions: Total interaction count
SuccessRate_<metric>: Positives_<metric> / (Positives_<metric> + Negatives_<metric>)
StdErr_<metric>: Standard error of the proportion
Plus any grouping columns
- Return type:
pl.LazyFrame
Notes
Standard error is calculated as:
\[SE = \sqrt{\frac{p(1-p)}{n}}\]
where p is the success rate and n is the sample size.
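As a worked illustration of the formula, the following sketch computes a success rate and its standard error from raw counts. It assumes that n is the number of positives plus negatives, consistent with the SuccessRate definition above. The function name `success_rate_with_stderr` and the choice to return None for empty input are assumptions for this example, not part of pdstools.

```python
import math
from typing import Optional, Tuple


def success_rate_with_stderr(
    positives: int, negatives: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (success rate, standard error) for one metric (hypothetical helper)."""
    n = positives + negatives  # assumed sample size: positives + negatives
    if n == 0:
        return None, None  # assumed behavior: no data means no rate
    p = positives / n
    se = math.sqrt(p * (1 - p) / n)  # SE = sqrt(p(1-p)/n)
    return p, se


# Example counts (made up for illustration): 50 positives, 950 negatives
rate, se = success_rate_with_stderr(positives=50, negatives=950)
print(f"rate={rate:.4f} stderr={se:.4f}")  # rate=0.0500 stderr=0.0069
```

Because n is in the denominator, quadrupling the number of interactions at the same rate halves the standard error, which is why small groups or short time buckets produce noisier success rates.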
See also
summarize_by_interaction: Interaction-level outcomes.
summary_outcomes: Outcome counts.
Examples
>>> ih.aggregates.summary_success_rates(by="Channel", every="1w").collect()
- summary_outcomes(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None) → polars.LazyFrame
Count outcomes by type.
Aggregates outcome counts, useful for understanding the distribution of outcomes across dimensions or time periods.
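- Parameters:
by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.
every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.
query (QUERY, optional) – Polars expression to filter data before aggregation.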
- Returns:
Outcome counts with columns:
Outcome: Outcome type label
Count: Number of occurrences
Plus any grouping columns
- Return type:
pl.LazyFrame
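To make the shape of the result concrete, the sketch below counts outcomes per grouping value in plain Python. It only mirrors the idea of an Outcome/Count table with one grouping column. The records, the `count_outcomes` helper, and the outcome labels are hypothetical and do not reflect how pdstools computes the result internally.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical (Channel, Outcome) records standing in for interaction history rows
records = [
    ("Web", "Impression"),
    ("Web", "Clicked"),
    ("Web", "Impression"),
    ("Email", "Pending"),
]


def count_outcomes(rows: Iterable[Tuple[str, str]]) -> Counter:
    """Count occurrences of each (group, outcome) pair (hypothetical helper)."""
    return Counter(rows)


counts = count_outcomes(records)
# One line per (grouping column, Outcome) pair with its Count
for (channel, outcome), count in sorted(counts.items()):
    print(channel, outcome, count)
```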
See also
summary_success_rates: Success rates by metric.
Examples
>>> ih.aggregates.summary_outcomes(by="Channel").collect()
>>> ih.aggregates.summary_outcomes(every="1mo").collect()