pdstools.ih.Aggregates

Classes

Module Contents

class Aggregates(ih: pdstools.ih.IH.IH)

Bases: pdstools.utils.namespaces.LazyNamespace

Parameters:

ih (pdstools.ih.IH.IH)

ih
summarize_by_interaction(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) → polars.LazyFrame

Groups the IH data by interaction ID and summarizes outcomes for each interaction.

It optionally groups by one or more dimensions (e.g. Experiment, Channel, Issue). When given, the ‘every’ argument divides the time range into buckets; it uses the same duration string language as Polars (e.g. "1d", "1w", "1mo").
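To illustrate what dividing the time range into buckets means, here is a minimal plain-Python sketch that floors timestamps to the start of a fixed-width bucket. The helper `bucket_start` and its `origin` argument are hypothetical names used for illustration only; the library delegates the actual bucketing to Polars, whose window alignment may differ.

```python
from datetime import datetime, timedelta


def bucket_start(ts, every, origin=datetime(1970, 1, 1)):
    """Floor `ts` to the start of its `every`-wide bucket, counted from `origin`.

    Hypothetical helper: only illustrates the idea of time buckets.
    """
    # timedelta // timedelta yields the whole number of buckets since origin
    return origin + ((ts - origin) // every) * every


stamps = [
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 1, 23, 59),
    datetime(2024, 1, 2, 0, 0),
]
# With every=1 day, the first two timestamps share a bucket
buckets = [bucket_start(ts, timedelta(days=1)) for ts in stamps]
```

Rows that land in the same bucket are then aggregated together, alongside any keys given in ‘by’.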

For each interaction, it determines whether any outcomes match the positive or negative outcome labels for each metric. An interaction is considered to have a positive outcome for a metric if any of its outcomes are in the positive labels for that metric, and negative if any are in the negative labels.

Parameters:
  • by (Optional[Union[str, List[str], pl.Expr]], optional) – Grouping keys. Name of field(s) or a Polars expression, by default None

  • every (Optional[Union[str, timedelta]], optional) – Every interval start and period length, by default None

  • query (Optional[QUERY], optional) – Query to filter the data before aggregation, by default None

  • debug (bool, optional) – Whether to include debug information in the output, by default False

Returns:

A polars frame with interaction-level outcome data, including columns for each metric’s outcome and the propensity value.

Return type:

pl.LazyFrame
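The per-interaction logic described above can be sketched in plain Python. This is not the library's implementation: the metric names, the label sets and the helper `flag_interactions` are assumptions made up for illustration, and the real labels are configured on the IH object.

```python
from collections import defaultdict

# Hypothetical label sets per metric (assumed for this sketch only)
POSITIVE_LABELS = {"Engagement": {"Clicked"}, "Conversion": {"Conversion"}}
NEGATIVE_LABELS = {"Engagement": {"Impression"}, "Conversion": {"Impression"}}


def flag_interactions(rows):
    """rows: iterable of (interaction_id, outcome) pairs."""
    # Collect the set of outcomes seen for each interaction ID
    outcomes = defaultdict(set)
    for interaction_id, outcome in rows:
        outcomes[interaction_id].add(outcome)

    # For each metric, flag whether any outcome is in the positive
    # or in the negative labels of that metric
    return {
        interaction_id: {
            metric: {
                "positive": bool(seen & POSITIVE_LABELS[metric]),
                "negative": bool(seen & NEGATIVE_LABELS[metric]),
            }
            for metric in POSITIVE_LABELS
        }
        for interaction_id, seen in outcomes.items()
    }


rows = [("i1", "Impression"), ("i1", "Clicked"), ("i2", "Impression")]
result = flag_interactions(rows)
```

Note that in this sketch an interaction can carry both a positive and a negative flag for the same metric, since each flag is determined independently.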

summary_success_rates(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None, debug: bool = False) → polars.LazyFrame

Groups the IH data and summarizes it into success rates (SuccessRate) and standard errors (StdErr).

It optionally groups by one or more dimensions (e.g. Experiment, Channel, Issue). When given, the ‘every’ argument divides the time range into buckets; it uses the same duration string language as Polars (e.g. "1d", "1w", "1mo").

Every interaction is considered to have only one outcome: positive, negative or none. When any outcome in the interaction is in the positive labels, the interaction is considered positive. Otherwise, when any outcome is in the negative labels, the interaction is considered negative. If neither applies, there is no defined outcome and the interaction is ignored in the calculation of success rate and standard error.

Parameters:
  • by (Optional[Union[str, List[str], pl.Expr]], optional) – Grouping keys. Name of field(s) or a Polars expression, by default None

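  • every (Optional[Union[str, timedelta]], optional) – Every interval start and period length, by default None

  • query (Optional[QUERY], optional) – Query to filter the data before aggregation, by default None

  • debug (bool, optional) – Whether to include debug information in the output, by default False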

Returns:

A polars frame with the grouping keys and columns for the total number of Positives, Negatives, number of Interactions, success rate (SuccessRate) and standard error (StdErr).

Return type:

pl.LazyFrame
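As a rough illustration of the classification and the resulting metrics, the sketch below works on plain Python lists. The helpers `classify` and `success_rate` and the label sets are hypothetical, and the standard error uses the usual binomial formula sqrt(p * (1 - p) / n), which is an assumption about how StdErr is computed rather than something stated in this reference.

```python
import math


def classify(outcomes, positive, negative):
    """Return the single outcome of an interaction; positive takes precedence."""
    if outcomes & positive:
        return "positive"
    if outcomes & negative:
        return "negative"
    return None  # no defined outcome, ignored in the rates


def success_rate(interactions, positive, negative):
    labels = [classify(set(o), positive, negative) for o in interactions]
    positives = labels.count("positive")
    negatives = labels.count("negative")
    n = positives + negatives
    if n == 0:
        rate, std_err = None, None
    else:
        rate = positives / n
        # Assumed binomial standard error of a proportion
        std_err = math.sqrt(rate * (1 - rate) / n)
    return {
        "Positives": positives,
        "Negatives": negatives,
        "SuccessRate": rate,
        "StdErr": std_err,
    }


# Hypothetical data: one list of outcomes per interaction
interactions = [
    ["Impression", "Clicked"],
    ["Impression"],
    ["Impression"],
    ["Impression"],
    ["Pending"],  # matches neither label set, so it is ignored
]
summary = success_rate(interactions, positive={"Clicked"}, negative={"Impression"})
```

Here one interaction is positive and three are negative, so the success rate is computed over four interactions; the fifth does not contribute.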

summary_outcomes(by: str | List[str] | polars.Expr | None = None, every: str | datetime.timedelta | None = None, query: pdstools.utils.types.QUERY | None = None) → polars.LazyFrame

Groups the IH data by outcome and summarizes counts for each outcome type.

It optionally groups by one or more dimensions (e.g. Experiment, Channel, Issue). When given, the ‘every’ argument divides the time range into buckets; it uses the same duration string language as Polars (e.g. "1d", "1w", "1mo").

This method provides a count of each outcome type, which can be useful for understanding the distribution of outcomes across different dimensions or time periods.

Parameters:
  • by (Optional[Union[str, List[str], pl.Expr]], optional) – Grouping keys. Name of field(s) or a Polars expression, by default None

  • every (Optional[Union[str, timedelta]], optional) – Every interval start and period length, by default None

  • query (Optional[QUERY], optional) – Query to filter the data before aggregation, by default None

Returns:

A polars frame with the grouping keys, outcome types, and a count column showing the number of occurrences for each outcome type within each group.

Return type:

pl.LazyFrame
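The shape of this aggregation can be sketched with the standard library. The records, the column names `Channel` and `Outcome`, and the helper `count_outcomes` are assumptions chosen for illustration; the method itself returns a Polars LazyFrame rather than a `Counter`.

```python
from collections import Counter

# Hypothetical interaction history records
rows = [
    {"Channel": "Web", "Outcome": "Impression"},
    {"Channel": "Web", "Outcome": "Impression"},
    {"Channel": "Web", "Outcome": "Clicked"},
    {"Channel": "Email", "Outcome": "Impression"},
]


def count_outcomes(rows, by):
    """Count occurrences of each outcome type within each group of `by` keys."""
    return Counter(
        (tuple(row[key] for key in by), row["Outcome"]) for row in rows
    )


counts = count_outcomes(rows, by=["Channel"])
```

Each key of `counts` pairs the grouping values with an outcome type, which mirrors one row of the grouped result: grouping keys, outcome, and count.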