pdstools.ih.Aggregates

Aggregation methods for Interaction History analysis.

Classes

Aggregates

Aggregation methods for Interaction History data.

Module Contents

class Aggregates(ih: pdstools.ih.IH.IH)

Bases: pdstools.utils.namespaces.LazyNamespace

Aggregation methods for Interaction History data.

This class provides aggregation capabilities for summarizing customer interaction data. It is accessed through the aggregates attribute of an IH instance.

All aggregation methods support:

  • Grouping by dimensions via the by parameter

  • Time bucketing via the every parameter

  • Data filtering via the query parameter

Parameters:

ih (pdstools.ih.IH.IH)

ih

Reference to the parent IH instance.

Type:

IH

See also

pdstools.ih.IH

Main analysis class.

pdstools.ih.Plots

Visualization methods.

Examples

>>> from pdstools.ih.IH import IH
>>> ih = IH.from_ds_export("interaction_history.zip")
>>> ih.aggregates.summary_success_rates(by="Channel").collect()
>>> ih.aggregates.summary_outcomes(every="1w").collect()
summarize_by_interaction(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) polars.LazyFrame

Summarize outcomes per interaction.

Groups data by interaction ID and determines the outcome for each interaction based on configured positive/negative outcome labels.

Parameters:
  • by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.

  • every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.

  • query (QUERY, optional) – Polars expression to filter data before aggregation.

  • debug (bool, default False) – If True, include debug columns (Outcomes list).

Returns:

Interaction-level data with columns:

  • InteractionID: Unique interaction identifier

  • Interaction_Outcome_<metric>: Boolean outcome per metric

  • Propensity: Propensity value from interaction

  • Plus any grouping columns

Return type:

pl.LazyFrame

Notes

An interaction has a positive outcome (True) for a metric if any of its outcomes matches the positive labels. Otherwise, if any outcome matches the negative labels, the outcome is False. If neither matches, the outcome is null.
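The precedence rule in this note can be sketched in plain Python. This is an illustration of the described logic only, not the library's Polars implementation; the function name interaction_outcome and the labels "Clicked", "Impression" and "Pending" are hypothetical.

```python
from typing import Iterable, Optional


def interaction_outcome(
    outcomes: Iterable[str],
    positive_labels: Iterable[str],
    negative_labels: Iterable[str],
) -> Optional[bool]:
    """Resolve one interaction's outcome for a single metric."""
    observed = set(outcomes)
    if observed & set(positive_labels):
        return True   # any positive label wins
    if observed & set(negative_labels):
        return False  # no positive label, but at least one negative
    return None       # neither matched: the outcome stays null


# Hypothetical labels for an illustrative "click" metric.
positive = {"Clicked"}
negative = {"Impression"}

clicked = interaction_outcome(["Impression", "Clicked"], positive, negative)
shown_only = interaction_outcome(["Impression"], positive, negative)
unmatched = interaction_outcome(["Pending"], positive, negative)
```

Checking the positive labels first means an interaction that records both an impression and a click counts as a success rather than a failure.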

See also

summary_success_rates

Aggregated success rates.

Examples

>>> ih.aggregates.summarize_by_interaction(by="Channel").collect()
summary_success_rates(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) polars.LazyFrame

Calculate success rates with standard errors.

Aggregates interactions into success rates for each configured metric, with standard errors for statistical significance assessment.

Parameters:
  • by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.

  • every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.

  • query (QUERY, optional) – Polars expression to filter data before aggregation.

  • debug (bool, default False) – If True, include debug columns.

Returns:

Success rate summary with columns:

  • Positives_<metric>: Count of positive outcomes

  • Negatives_<metric>: Count of negative outcomes

  • Interactions: Total interaction count

  • SuccessRate_<metric>: Positives / (Positives + Negatives)

  • StdErr_<metric>: Standard error of the proportion

  • Plus any grouping columns

Return type:

pl.LazyFrame

Notes

Standard error is calculated as:

\[SE = \sqrt{\frac{p(1-p)}{n}}\]

where p is the success rate and n is the sample size.
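As a worked example of the formula, the sketch below computes a success rate and its standard error from raw counts. It assumes that n is the number of positives plus negatives, which matches the SuccessRate definition above but is not confirmed by the source; the helper name success_rate_and_stderr and the counts are hypothetical.

```python
import math
from typing import Optional, Tuple


def success_rate_and_stderr(
    positives: int, negatives: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (p, SE) with SE = sqrt(p * (1 - p) / n)."""
    n = positives + negatives  # assumption: sample size is positives + negatives
    if n == 0:
        return None, None      # no data, so both values are undefined
    p = positives / n
    return p, math.sqrt(p * (1 - p) / n)


# Hypothetical counts: 20 positives and 80 negatives.
rate, stderr = success_rate_and_stderr(20, 80)
# rate is 0.2 and stderr is approximately 0.04
```

A smaller standard error indicates a more reliable rate; quadrupling the sample size at the same rate halves it.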

See also

summarize_by_interaction

Interaction-level outcomes.

summary_outcomes

Outcome counts.

Examples

>>> ih.aggregates.summary_success_rates(by="Channel", every="1w").collect()
summary_outcomes(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None) polars.LazyFrame

Count outcomes by type.

Aggregates outcome counts, useful for understanding the distribution of outcomes across dimensions or time periods.

Parameters:
  • by (str, List[str], or pl.Expr, optional) – Grouping dimension(s). Default is None.

  • every (str or timedelta, optional) – Time aggregation period (e.g., “1d”, “1w”). Default is None.

  • query (QUERY, optional) – Polars expression to filter data before aggregation.

Returns:

Outcome counts with columns:

  • Outcome: Outcome type label

  • Count: Number of occurrences

  • Plus any grouping columns

Return type:

pl.LazyFrame
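To make the shape of the result concrete, here is a plain-Python sketch of counting outcomes per grouping key. It only mirrors the documented output columns (grouping columns, Outcome, Count) and is not how the library computes them; the rows, the count_outcomes helper, and the channel and outcome values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Union

# Hypothetical interaction rows.
rows = [
    {"Channel": "Web", "Outcome": "Impression"},
    {"Channel": "Web", "Outcome": "Impression"},
    {"Channel": "Web", "Outcome": "Clicked"},
    {"Channel": "Email", "Outcome": "Impression"},
]


def count_outcomes(
    rows: Sequence[Dict[str, str]],
    by: Optional[Union[str, List[str]]] = None,
) -> List[Dict[str, object]]:
    """Count rows per (grouping columns..., Outcome) combination."""
    keys = [by] if isinstance(by, str) else list(by or [])
    counts = Counter(
        tuple(row[k] for k in keys) + (row["Outcome"],) for row in rows
    )
    # One output record per distinct combination, with its count.
    return [
        dict(zip(keys + ["Outcome"], combo), Count=n)
        for combo, n in counts.items()
    ]


per_channel = count_outcomes(rows, by="Channel")
overall = count_outcomes(rows)
```

With by="Channel" each record carries the channel alongside the outcome label; without it, counts are taken over the whole dataset.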

See also

summary_success_rates

Success rates by metric.

Examples

>>> ih.aggregates.summary_outcomes(by="Channel").collect()
>>> ih.aggregates.summary_outcomes(every="1mo").collect()