pdstools.prediction.Prediction
Attributes
Classes
- Prediction: Monitor Pega Prediction Studio Predictions
Module Contents
- logger
- COLORSCALE_TYPES
- Figure
- class PredictionPlots(prediction)
Bases:
pdstools.utils.namespaces.LazyNamespace
- dependencies = ['plotly']
- prediction
- _prediction_trend(period: str, query: pdstools.utils.types.QUERY | None, return_df: bool, metric: str, title: str, facet_row: str = None, facet_col: str = None, bar_mode: bool = False)
- performance_trend(period: str = '1d', *, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)
- lift_trend(period: str = '1d', *, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)
- class Prediction(df: polars.LazyFrame, *, query: pdstools.utils.types.QUERY | None = None)
Monitor Pega Prediction Studio Predictions
- Parameters:
df (polars.LazyFrame)
query (Optional[pdstools.utils.types.QUERY])
- predictions: polars.LazyFrame
- plot: PredictionPlots
- prediction_validity_expr
- cdh_guidelines
- classmethod from_pdc(df: polars.LazyFrame)
- Parameters:
df (polars.LazyFrame)
- static from_mock_data(days=70)
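A minimal usage sketch built only from the signatures listed on this page. The import path is taken from the page title; that `from_mock_data` returns a `Prediction` instance is an assumption, since the page does not state its return type. The helper name `summarize_mock_predictions` is hypothetical.

```python
def summarize_mock_predictions(days: int = 70):
    """Build a Prediction from mock data and summarize it per channel."""
    # Imported lazily so this sketch can be loaded without pdstools installed.
    from pdstools.prediction.Prediction import Prediction

    # Assumption: from_mock_data returns a Prediction instance.
    prediction = Prediction.from_mock_data(days=days)

    # summary_by_channel is documented to return a polars.LazyFrame;
    # collect() materializes it into a polars.DataFrame.
    return prediction.summary_by_channel(by_period="1w").collect()
```

Calling `summarize_mock_predictions()` requires pdstools and polars to be installed; the result should contain the fields documented for `summary_by_channel`.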
- summary_by_channel(custom_predictions: List[List] | None = None, *, start_date: datetime.datetime | None = None, end_date: datetime.datetime | None = None, window: int | datetime.timedelta | None = None, by_period: str | None = None, debug: bool = False) → polars.LazyFrame
Summarize prediction per channel
- Parameters:
custom_predictions (Optional[List[CDH_Guidelines.NBAD_Prediction]], optional) – Optional list with custom prediction name to channel mappings. Defaults to None.
start_date (datetime.datetime, optional) – Start date of the summary period. If None (default), uses the end date minus the window or, if both are absent, the earliest date in the data.
end_date (datetime.datetime, optional) – End date of the summary period. If None (default), uses the start date plus the window or, if both are absent, the latest date in the data.
window (int or datetime.timedelta, optional) – Number of days to use for the summary period, or an explicit timedelta. If None (default), uses the whole period. Cannot be given if both start and end date are also given.
by_period (str, optional) – Optional additional grouping by time period. Format string as in polars.Expr.dt.truncate (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html), for example “1mo”, “1w”, “1d” for calendar month, week, or day. Defaults to None.
debug (bool, optional) – If True, enables debug mode for additional logging or outputs. Defaults to False.
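The interplay of start_date, end_date and window described above can be sketched as follows. This is an illustration of the documented rules in plain Python, not the library's implementation. The function name `resolve_period` and the `data_min`/`data_max` arguments are hypothetical, and the case where only a window is given (anchored here on the latest date in the data) is an assumption that the parameter descriptions do not spell out.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple, Union


def resolve_period(
    start_date: Optional[datetime],
    end_date: Optional[datetime],
    window: Optional[Union[int, timedelta]],
    data_min: datetime,
    data_max: datetime,
) -> Tuple[datetime, datetime]:
    """Resolve the summary period from the three optional arguments."""
    if start_date is not None and end_date is not None and window is not None:
        raise ValueError("window can't be combined with both start_date and end_date")

    # An integer window is interpreted as a number of days.
    if isinstance(window, int):
        window = timedelta(days=window)

    if start_date is None and end_date is None:
        # Assumption: with only a window, anchor on the latest date in the data.
        end_date = data_max
        start_date = data_min if window is None else end_date - window
    elif start_date is None:
        # End date minus the window, or the earliest date when no window is given.
        start_date = data_min if window is None else end_date - window
    elif end_date is None:
        # Start date plus the window, or the latest date when no window is given.
        end_date = data_max if window is None else start_date + window

    return start_date, end_date
```

For example, `resolve_period(None, datetime(2024, 3, 31), 7, data_min, data_max)` yields a period that starts seven days before the given end date.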
- Returns:
Summary across all Predictions as a dataframe with the following fields:
Time and Configuration Fields:
- DateRange Min: The minimum date in the summary time range
- DateRange Max: The maximum date in the summary time range
- Duration: The duration in seconds between the minimum and maximum snapshot times
- Prediction: The prediction name
- Channel: The channel name
- Direction: The direction (e.g., Inbound, Outbound)
- ChannelDirectionGroup: Combined Channel/Direction identifier
- isValid: Boolean indicating if the prediction data is valid
- isStandardNBADPrediction: Boolean indicating if this is a standard NBAD prediction
- isMultiChannelPrediction: Boolean indicating if this is a multi-channel prediction
- ControlPercentage: Percentage of responses in control group
- TestPercentage: Percentage of responses in test group
Performance Metrics:
- Performance: Weighted model performance (AUC)
- Positives: Sum of positive responses
- Negatives: Sum of negative responses
- Responses: Sum of all responses
- Positives_Test: Sum of positive responses in test group
- Positives_Control: Sum of positive responses in control group
- Positives_NBA: Sum of positive responses in NBA group
- Negatives_Test: Sum of negative responses in test group
- Negatives_Control: Sum of negative responses in control group
- Negatives_NBA: Sum of negative responses in NBA group
- CTR: Click-through rate (Positives / (Positives + Negatives))
- CTR_Test: Click-through rate for test group
- CTR_Control: Click-through rate for control group
- CTR_NBA: Click-through rate for NBA group
- Lift: Lift value ((CTR_Test - CTR_Control) / CTR_Control)
Technology Usage Indicators:
- usesImpactAnalyzer: Boolean indicating if Impact Analyzer is used
- Return type:
pl.LazyFrame
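The CTR and Lift definitions in the field list above can be worked through with a small plain-Python sketch. The function names and the sample counts are made up for illustration, and returning None for an undefined ratio is an assumption rather than documented library behavior.

```python
from typing import Optional


def ctr(positives: int, negatives: int) -> Optional[float]:
    """Click-through rate: Positives / (Positives + Negatives)."""
    total = positives + negatives
    # Assumption: an empty group has no defined CTR.
    return positives / total if total else None


def lift(ctr_test: Optional[float], ctr_control: Optional[float]) -> Optional[float]:
    """Lift: (CTR_Test - CTR_Control) / CTR_Control."""
    if ctr_test is None or not ctr_control:
        # Assumption: lift is undefined without a non-zero control CTR.
        return None
    return (ctr_test - ctr_control) / ctr_control


# Illustrative counts only, not taken from any real prediction.
ctr_test = ctr(positives=120, negatives=880)      # 120 / 1000
ctr_control = ctr(positives=100, negatives=900)   # 100 / 1000
example_lift = lift(ctr_test, ctr_control)        # roughly 0.2, i.e. about 20% lift
```

A positive lift means the test group clicks through more often than the control group; a negative lift means the opposite.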
- overall_summary(custom_predictions: List[List] | None = None, *, start_date: datetime.datetime | None = None, end_date: datetime.datetime | None = None, window: int | datetime.timedelta | None = None, by_period: str | None = None, debug: bool = False) → polars.LazyFrame
Overall prediction summary. Only valid prediction data is included.
- Parameters:
custom_predictions (Optional[List[CDH_Guidelines.NBAD_Prediction]], optional) – Optional list with custom prediction name to channel mappings. Defaults to None.
start_date (datetime.datetime, optional) – Start date of the summary period. If None (default), uses the end date minus the window or, if both are absent, the earliest date in the data.
end_date (datetime.datetime, optional) – End date of the summary period. If None (default), uses the start date plus the window or, if both are absent, the latest date in the data.
window (int or datetime.timedelta, optional) – Number of days to use for the summary period, or an explicit timedelta. If None (default), uses the whole period. Cannot be given if both start and end date are also given.
by_period (str, optional) – Optional additional grouping by time period. Format string as in polars.Expr.dt.truncate (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html), for example “1mo”, “1w”, “1d” for calendar month, week, or day. Defaults to None.
debug (bool, optional) – If True, enables debug mode for additional logging or outputs. Defaults to False.
- Returns:
Summary across all Predictions as a dataframe with the following fields:
Time and Configuration Fields:
- DateRange Min: The minimum date in the summary time range
- DateRange Max: The maximum date in the summary time range
- Duration: The duration in seconds between the minimum and maximum snapshot times
- ControlPercentage: Weighted average percentage of control group responses
- TestPercentage: Weighted average percentage of test group responses
Performance Metrics:
- Performance: Weighted average performance across all valid channels
- Positives Inbound: Sum of positive responses across all valid inbound channels
- Positives Outbound: Sum of positive responses across all valid outbound channels
- Responses Inbound: Sum of all responses across all valid inbound channels
- Responses Outbound: Sum of all responses across all valid outbound channels
- Overall Lift: Weighted average lift across all valid channels
- Minimum Negative Lift: The lowest negative lift value found
Channel Statistics:
- Number of Valid Channels: Count of unique valid channel/direction combinations
- Channel with Minimum Negative Lift: Channel with the lowest negative lift value
Technology Usage Indicators:
- usesImpactAnalyzer: Boolean indicating if any channel uses Impact Analyzer
- Return type:
pl.LazyFrame
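As a rough illustration of how per-channel rows could roll up into the overall fields above, the sketch below computes a weighted average and the minimum negative lift in plain Python. The page does not say which weights are used; weighting by Responses is an assumption here, and the channel rows and function names are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative per-channel rows, not real data.
channels: List[Dict] = [
    {"Channel": "Web/Inbound", "Responses": 1000, "Performance": 0.70, "Lift": 0.20},
    {"Channel": "Email/Outbound", "Responses": 3000, "Performance": 0.60, "Lift": -0.10},
    {"Channel": "Mobile/Inbound", "Responses": 1000, "Performance": 0.65, "Lift": -0.05},
]


def weighted_average(rows: List[Dict], value_key: str, weight_key: str = "Responses") -> Optional[float]:
    """Average of value_key weighted by weight_key (assumed weighting)."""
    total_weight = sum(row[weight_key] for row in rows)
    if not total_weight:
        return None
    return sum(row[value_key] * row[weight_key] for row in rows) / total_weight


def minimum_negative_lift(rows: List[Dict]) -> Tuple[Optional[str], Optional[float]]:
    """Return the channel and value of the lowest negative lift, if any."""
    negative = [row for row in rows if row["Lift"] < 0]
    if not negative:
        return None, None
    worst = min(negative, key=lambda row: row["Lift"])
    return worst["Channel"], worst["Lift"]


overall_performance = weighted_average(channels, "Performance")
worst_channel, worst_lift = minimum_negative_lift(channels)
```

With these sample rows the channel with the most responses dominates the weighted performance, and the Email/Outbound row carries the minimum negative lift.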