pdstools.adm.Aggregates
=======================

.. py:module:: pdstools.adm.Aggregates


Classes
-------

.. autoapisummary::

   pdstools.adm.Aggregates.Aggregates


Module Contents
---------------

.. py:class:: Aggregates(datamart: pdstools.adm.ADMDatamart.ADMDatamart)

   .. py:attribute:: datamart


   .. py:attribute:: cdh_guidelines


   .. py:method:: last(*, data: Optional[polars.LazyFrame] = None, table: Literal['model_data', 'predictor_data', 'combined_data'] = 'model_data')

      Gets the last snapshot of the given table.

      :param data: If provided, subsets to just that dataframe, by default None
      :type data: Optional[pl.LazyFrame], optional
      :param table: If provided, specifies the table to get data from, by default "model_data"
      :type table: Literal['model_data', 'predictor_data', 'combined_data'], optional

      :returns: _description_
      :rtype: _type_


   .. py:method:: _combine_data(model_df: Optional[polars.LazyFrame], predictor_df: Optional[polars.LazyFrame]) -> Optional[polars.LazyFrame]

      Combines the model and predictor tables into the `combined_data` attribute.

      :param model_df: The model snapshots table
      :type model_df: pl.LazyFrame
      :param predictor_df: The predictor binning snapshots table
      :type predictor_df: pl.LazyFrame

      :returns: The resulting data, joined on the ModelID column
      :rtype: pl.LazyFrame


   .. py:method:: predictor_performance_pivot(*, query: Optional[pdstools.utils.types.QUERY] = None, active_only: bool = False, by='Name', top_predictors: Optional[int] = None, top_groups: Optional[int] = None) -> polars.LazyFrame

      Creates a pivot table of the predictor performance per 'group'.

      :param query: A query to apply to the data before creating the pivot, by default None
      :type query: Optional[QUERY], optional
      :param by: A group by which to 'facet', by default "Name".
                 If, for instance, the 'by' argument is set to 'Configuration',
                 each row will be a distinct configuration
      :type by: str, optional
      :param top_predictors: Specify the maximum number of predictors, by default None
      :type top_predictors: Optional[int], optional
      :param top_groups: Specify the maximum number of 'groups' specified in the 'by' argument, by default None
      :type top_groups: Optional[int], optional

      :returns: A LazyFrame with a column for each predictor, and a row for each 'group'.
                The values represent the weighted performance for that predictor
      :rtype: pl.LazyFrame


   .. py:method:: model_summary(by: str = 'Name', query: Optional[pdstools.utils.types.QUERY] = None) -> polars.LazyFrame

      Generate a summary of statistics for each model (based on model ID).

      If you want to generate statistics at a model name or treatment level,
      specify this in the 'by' argument.

      :param by: The column to define the 'counts' for, by default "Name".
                 Must be part of the context keys in the ADMDatamart class
      :type by: str, optional
      :param query: A query to apply to the data before summarization, by default None
      :type query: Optional[QUERY], optional

      :returns: A LazyFrame, with one row for each context key combination
      :rtype: pl.LazyFrame


   .. py:method:: predictor_counts(*, facet: str = 'Configuration', by: str = 'Type', query: Optional[pdstools.utils.types.QUERY] = None) -> polars.LazyFrame

      Returns the count of each predictor grouped by a certain column.

      :param by: The column to group the data by, by default "Type"
      :type by: str, optional
      :param query: A query to apply to the data, by default None
      :type query: Optional[QUERY], optional

      :returns: A LazyFrame, with one row per predictor and 'by' combination
      :rtype: pl.LazyFrame


   .. py:method:: _top_n(df: polars.DataFrame, top_n: int, metric: str = 'PredictorPerformance', facets: Optional[list] = None)
      :staticmethod:


      Subsets DataFrame to contain only top_n predictors.

      :param df: Table to subset
      :type df: pl.DataFrame
      :param top_n: Number of top predictors
      :type top_n: int
      :param metric: Metric to use for comparing predictors
      :type metric: str
      :param facets: Subsets top_n predictors over facets. Separate top predictors for each facet
      :type facets: list

      :returns: Subsetted dataframe
      :rtype: pl.DataFrame


   .. py:method:: summary_by_channel(custom_channels: Optional[Dict[str, str]] = None, by_period: Optional[str] = None, keep_lists: bool = False) -> polars.LazyFrame

      Summarize ADM models per channel.

      :param custom_channels: Optional dictionary with custom channel/direction name mappings. Defaults to None.
      :type custom_channels: Dict[str, str], optional
      :param by_period: Optional grouping by time period. Format string as in polars.Expr.dt.truncate
                        (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html),
                        for example "1mo", "1w", "1d" for calendar month, week, or day.
                        If provided, creates a new Period column with the truncated date/time. Defaults to None.
      :type by_period: str, optional
      :param keep_lists: Internal flag to keep some columns (action and treatment names etc.) as full lists.
      :type keep_lists: bool, optional

      :returns: Dataframe with summary per channel (and optionally a period)
      :rtype: pl.LazyFrame


   .. py:method:: summary_by_configuration() -> polars.DataFrame

      Generates a summary of the ADM model configurations.

      :returns: A Polars DataFrame containing the configuration summary.
      :rtype: pl.DataFrame


   .. py:method:: predictors_overview() -> Optional[polars.DataFrame]

      Generate a summary of the last snapshot of predictor data.

      This method creates a summary of predictor data by joining the last
      snapshots of predictor_data and model_data, then performing various
      aggregations and calculations. It excludes the "Classifier" predictor
      from the analysis.

      :returns: A Polars DataFrame containing the predictor summary if successful,
                None if the required data is not available.
      :rtype: pl.DataFrame or None


   .. py:method:: overall_summary(custom_channels: Dict[str, str] = None, by_period: str = None) -> polars.LazyFrame

      Overall ADM models summary. Only valid data is included.

      :param custom_channels: Optional dictionary with custom channel/direction name mappings. Defaults to None.
      :type custom_channels: Dict[str, str], optional
      :param by_period: Optional grouping by time period. Format string as in polars.Expr.dt.truncate
                        (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html),
                        for example "1mo", "1w", "1d" for calendar month, week, or day.
                        If provided, creates a new Period column with the truncated date/time. Defaults to None.
      :type by_period: str, optional

      :returns: Summary across all valid ADM models as a dataframe
      :rtype: pl.LazyFrame