pdstools.adm.BinAggregator
==========================

.. py:module:: pdstools.adm.BinAggregator


Classes
-------

.. autoapisummary::

   pdstools.adm.BinAggregator.BinAggregator


Module Contents
---------------

.. py:class:: BinAggregator(dm: pdstools.adm.ADMDatamart.ADMDatamart)

   Bases: :py:obj:`pdstools.utils.namespaces.LazyNamespace`


   A class to generate rolled up insights from ADM predictor binning.


   .. py:attribute:: dependencies
      :value: ['plotly', 'numpy']


   .. py:method:: roll_up(predictors: Union[str, list], *, n: int = 10, distribution: Literal['lin', 'log'] = 'lin', boundaries: Optional[Union[float, list]] = None, symbols: Optional[Union[str, list]] = None, minimum: Optional[float] = None, maximum: Optional[float] = None, aggregation: Optional[str] = None, as_numeric: Optional[bool] = None, return_df: bool = False, verbose: bool = False) -> Union[polars.DataFrame, Figure]

      Roll up a predictor across all the models defined when creating the class.

      Predictors can be either numeric or symbolic (also called 'categorical'). You can aggregate the same predictor across different sets of models by specifying a column name in the aggregation argument.

      :param predictors: Name of the predictor to roll up. Multiple predictors can be passed in as a list.
      :type predictors: str | list
      :param n: Number of bins (intervals or symbols) to generate, by default 10. Any custom intervals or symbols specified with the 'boundaries' or 'symbols' arguments count towards this number as well. For symbolic predictors this can be None, which means unlimited.
      :type n: int, optional
      :param distribution: For numeric predictors: the way the intervals are constructed. By default "lin" for an evenly spaced distribution; can be set to "log" for a long-tailed distribution (for fields like income).
      :type distribution: str, optional
      :param boundaries: For numeric predictors: one value, or a list of the numeric values to include as interval boundaries. They are placed at the front of the automatically created intervals. By default None, in which case all intervals are created automatically.
      :type boundaries: float | list, optional
      :param symbols: For symbolic predictors: any symbol(s) that must be included in the symbol list of the generated binning. By default None.
      :type symbols: str | list, optional
      :param minimum: Minimum value for numeric predictors, by default None. When None, the minimum is taken from the binning data of the models.
      :type minimum: float, optional
      :param maximum: Maximum value for numeric predictors, by default None. When None, the maximum is taken from the binning data of the models.
      :type maximum: float, optional
      :param aggregation: Optional column name in the data to aggregate over, creating a separate aggregation for each of its distinct values. By default None.
      :type aggregation: str, optional
      :param as_numeric: Optional override for the type of the predictor, for the (exceptional) situation where a predictor with the same name is numeric in some models and symbolic in others. By default None, which means the type is taken from the first predictor in the data.
      :type as_numeric: bool, optional
      :param return_df: Return the underlying binning instead of a plot, by default False.
      :type return_df: bool, optional
      :param verbose: Show detailed debug information while executing, by default False.
      :type verbose: bool, optional

      :returns: By default, a plot. When 'return_df' is set to True, the actual binning with the lift aggregated over all the models, optionally per predictor and per set of models.
      :rtype: pl.DataFrame | Figure


   .. py:method:: accumulate_num_binnings(predictor, modelids, target_binning, verbose=False) -> polars.DataFrame


   .. py:method:: create_symbol_list(predictor, n_symbols, musthave_symbols) -> list


   .. py:method:: accumulate_sym_binnings(predictor, modelids, symbollist, verbose=False) -> polars.DataFrame


   .. py:method:: normalize_all_binnings(combined_dm: polars.LazyFrame) -> polars.LazyFrame

      Prepare all predictor binning.

      Fix up the boundaries for numeric bins and parse the bin labels into clean lists for symbolics.


   .. py:method:: create_empty_numbinning(predictor: str, n: int, distribution: str = 'lin', boundaries: Optional[list] = None, minimum: Optional[float] = None, maximum: Optional[float] = None) -> polars.DataFrame


   .. py:method:: get_source_numbinning(predictor: str, modelid: str) -> polars.DataFrame


   .. py:method:: combine_two_numbinnings(source: polars.DataFrame, target: polars.DataFrame, verbose=False) -> polars.DataFrame


   .. py:method:: plot_binning_attribution(source: polars.DataFrame, target: polars.DataFrame) -> Figure


   .. py:method:: plot_binning_lift(binning, col_facet=None, row_facet=None, custom_data=['PredictorName', 'BinSymbol'], return_df=False) -> Union[polars.DataFrame, Figure]


   .. py:method:: plot_lift_binning(binning: polars.DataFrame) -> Figure
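
The ``distribution`` option of ``roll_up`` chooses between evenly spaced ("lin") and long-tailed ("log") intervals for numeric predictors. The sketch below only illustrates the difference between the two spacings. ``sketch_boundaries`` is a hypothetical helper that is not part of pdstools, and spacing the boundaries evenly on a log scale is an assumption about what "log" means here, not a description of the library's actual implementation.

```python
import math


def sketch_boundaries(minimum, maximum, n, distribution="lin"):
    """Return n + 1 boundaries that split [minimum, maximum] into n intervals.

    Hypothetical helper for illustration only; not part of pdstools.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    if distribution == "lin":
        # Evenly spaced: every interval has the same width.
        step = (maximum - minimum) / n
        return [minimum + i * step for i in range(n + 1)]
    if distribution == "log":
        # Assumption: boundaries are evenly spaced on a log10 scale,
        # which requires a strictly positive minimum.
        if minimum <= 0:
            raise ValueError("log spacing needs minimum > 0 in this sketch")
        lo, hi = math.log10(minimum), math.log10(maximum)
        step = (hi - lo) / n
        return [10 ** (lo + i * step) for i in range(n + 1)]
    raise ValueError(f"unknown distribution: {distribution!r}")


lin = sketch_boundaries(1, 1000, 3, "lin")
log = sketch_boundaries(1, 1000, 3, "log")
print(lin)  # equal-width intervals between 1 and 1000
print(log)  # intervals that widen towards the upper end
```

With a long-tailed field such as income, the linear spacing would place most observations in the first interval, while the log spacing gives the low end of the range finer resolution.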