pdstools.adm.Plots
==================

.. py:module:: pdstools.adm.Plots


Classes
-------

.. autoapisummary::

   pdstools.adm.Plots.Plots


Module Contents
---------------

.. py:class:: Plots(datamart: pdstools.adm.ADMDatamart.ADMDatamart)

   Bases: :py:obj:`pdstools.utils.namespaces.LazyNamespace`


   .. py:attribute:: dependencies
      :value: ['plotly']


   .. py:attribute:: dependency_group
      :value: 'adm'


   .. py:attribute:: datamart


   .. py:method:: bubble_chart(*, last: bool = True, rounding: int = 5, query: pdstools.utils.types.QUERY | None = None, facet: str | polars.Expr | None = None, color: str | None = 'Performance', show_metric_limits: bool = False, return_df: bool = False)

      The Bubble Chart, as seen in Prediction Studio

      :param last: Whether to only include the latest snapshot, by default True
      :type last: bool, optional
      :param rounding: Number of digits to round the performance number to, by default 5
      :type rounding: int, optional
      :param query: The query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param facet: Column name or Polars expression to facet the plot into subplots, by default None
      :type facet: Optional[Union[str, pl.Expr]], optional
      :param show_metric_limits: Whether to show dashed vertical lines at the ModelPerformance metric limit
                                 thresholds (from MetricLimits.csv), by default False
      :type show_metric_limits: bool, optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional


   .. py:method:: over_time(metric: str = 'Performance', by: polars.Expr | str = 'ModelID', *, every: str | datetime.timedelta = '1d', cumulative: bool = True, query: pdstools.utils.types.QUERY | None = None, facet: str | None = None, show_metric_limits: bool = False, return_df: bool = False)

      Statistics over time

      :param metric: The metric to plot, by default "Performance"
      :type metric: str, optional
      :param by: The column to group by, by default "ModelID"
      :type by: Union[pl.Expr, str], optional
      :param every: The time period to group by, by default "1d". See
                    https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html
                    for the supported periods.
      :type every: Union[str, timedelta], optional
      :param cumulative: Whether to show cumulative values or period-over-period changes, by default True
      :type cumulative: bool, optional
      :param query: The query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param facet: Column name to facet the plot into subplots, by default None
      :type facet: Optional[str], optional
      :param show_metric_limits: Whether to show dashed horizontal lines at the metric limit thresholds
                                 (from MetricLimits.csv), by default False.
                                 Only applies when metric is "Performance".
      :type show_metric_limits: bool, optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional


   .. py:method:: proposition_success_rates(metric: str = 'SuccessRate', by: str = 'Name', *, top_n: int = 0, query: pdstools.utils.types.QUERY | None = None, facet: str | None = None, return_df: bool = False)

      Proposition Success Rates

      :param metric: The metric to plot, by default "SuccessRate"
      :type metric: str, optional
      :param by: The column to group the data by, by default "Name"
      :type by: str, optional
      :param top_n: How many of the top values of the `by` column to include, by default 0
      :type top_n: int, optional
      :param query: A query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param facet: The column to facet the graph by, by default None
      :type facet: Optional[str], optional
      :param return_df: Whether to return a DataFrame instead of the graph, by default False
      :type return_df: bool, optional


   .. py:method:: score_distribution(model_id: str, *, active_range: bool = True, return_df: bool = False)

      Generate a score distribution plot for a specific model.

      :param model_id: The ID of the model to generate the score distribution for
      :type model_id: str
      :param active_range: Whether to filter to active score range only, by default True
      :type active_range: bool, optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly figure showing score distribution or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]

      :raises ValueError: If no data is available for the provided model ID


   .. py:method:: multiple_score_distributions(query: pdstools.utils.types.QUERY | None = None, show_all: bool = True) -> list[Figure]

      Generate the score distribution plot for all models in the query

      :param query: A query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param show_all: Whether to 'show' all plots or just get a list of them, by default True
      :type show_all: bool, optional

      :returns: A list of Plotly charts, one for each model instance
      :rtype: list[go.Figure]


   .. py:method:: predictor_binning(model_id: str, predictor_name: str, return_df: bool = False)

      Generate a predictor binning plot for a specific model and predictor.

      :param model_id: The ID of the model containing the predictor
      :type model_id: str
      :param predictor_name: Name of the predictor to analyze
      :type predictor_name: str
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly figure showing predictor binning or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]

      :raises ValueError: If no data is available for the provided model ID and predictor name


   .. py:method:: multiple_predictor_binning(model_id: str, query: pdstools.utils.types.QUERY | None = None, show_all=True) -> list[Figure]

      Generate predictor binning plots for all predictors in a model.

      :param model_id: The ID of the model to generate predictor binning plots for
      :type model_id: str
      :param query: A query to apply to the predictor data, by default None
      :type query: Optional[QUERY], optional
      :param show_all: Whether to display all plots or just return the list, by default True
      :type show_all: bool, optional

      :returns: A list of Plotly figures, one for each predictor in the model
      :rtype: list[Figure]


   .. py:method:: predictor_performance(*, metric: str = 'Performance', top_n: int | None = None, active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Plots a box plot of the performance of the predictors

      Use the query argument to drill down to a more specific subset.
      If top_n is given, chooses the top predictors based on the
      weighted average performance across models, ordered by their median performance.

      :param metric: The metric to plot, by default "Performance".
                     This is mostly for future-proofing, for when FeatureImportance is used more widely.
      :type metric: str, optional
      :param top_n: The top n predictors to plot, by default None
      :type top_n: Optional[int], optional
      :param active_only: Whether to only consider active predictor performance, by default False
      :type active_only: bool, optional
      :param query: The query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      .. seealso::

         :py:obj:`pdstools.adm.ADMDatamart.apply_predictor_categorization`
             how to override the out-of-the-box predictor categorization


   .. py:method:: predictor_category_performance(*, metric: str = 'Performance', active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Plot the predictor category performance

      :param metric: The metric to plot, by default "Performance"
      :type metric: str, optional
      :param active_only: Whether to only analyze active predictors, by default False
      :type active_only: bool, optional
      :param query: An optional query to apply, by default None
      :type query: Optional[QUERY], optional
      :param return_df: An optional flag to get the dataframe instead, by default False
      :type return_df: bool, optional

      :returns: A Plotly figure
      :rtype: Figure

      .. seealso::

         :py:obj:`pdstools.adm.ADMDatamart.apply_predictor_categorization`
             how to override the out-of-the-box predictor categorization


   .. py:method:: predictor_contribution(*, by: str = 'Configuration', query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Plots the predictor contribution for each configuration

      :param by: By which column to plot the contribution, by default "Configuration"
      :type by: str, optional
      :param query: An optional query to apply to the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: An optional flag to get a DataFrame instead, by default False
      :type return_df: bool, optional

      :returns: A Plotly figure
      :rtype: Figure

      .. seealso::

         :py:obj:`pdstools.adm.ADMDatamart.apply_predictor_categorization`
             how to override the out-of-the-box predictor categorization


   .. py:method:: predictor_performance_heatmap(*, top_predictors: int = 20, top_groups: int | None = None, by: str = 'Name', active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Generate a heatmap showing predictor performance across different groups.

      :param top_predictors: Number of top-performing predictors to include, by default 20
      :type top_predictors: int, optional
      :param top_groups: Number of top groups to include, by default None (all groups)
      :type top_groups: int, optional
      :param by: Column to group by for the heatmap, by default "Name"
      :type by: str, optional
      :param active_only: Whether to only include active predictors, by default False
      :type active_only: bool, optional
      :param query: Optional query to filter the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly heatmap figure or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]


   .. py:method:: gains_chart(value: str, *, index: str | None = None, by: str | list[str] | None = None, query: pdstools.utils.types.QUERY | None = None, title: str | None = None, return_df: bool = False) -> Figure | polars.LazyFrame

      Generate a gains chart showing cumulative distribution of a metric.

      Creates a gains/lift chart to visualize model response skewness. Shows what
      percentage of the total value (e.g., responses, positives) is driven by what
      percentage of models. Useful for identifying if a small number of models
      drive most of the volume.

      :param value: Column name containing the metric to compute gains for
                    (e.g., "ResponseCount", "Positives")
      :type value: str
      :param index: Column name to normalize by (e.g., population size).
                    If None, uses model count.
      :type index: str, optional
      :param by: Column(s) to group by for separate gain curves
                 (e.g., "Channel" or ["Channel", "Direction"])
      :type by: str | list[str], optional
      :param query: Optional query to filter the data before computing gains
      :type query: QUERY, optional
      :param title: Chart title. If None, uses "Gains Chart"
      :type title: str, optional
      :param return_df: If True, return the gains data instead of the figure
      :type return_df: bool, default False

      :returns: Plotly figure showing the gains chart, or LazyFrame if return_df=True
      :rtype: Figure | pl.LazyFrame

      .. rubric:: Examples

      >>> # Single gains curve for response count
      >>> fig = datamart.plot.gains_chart(value="ResponseCount")

      >>> # Gains curves by channel for positives
      >>> fig = datamart.plot.gains_chart(
      ...     value="Positives",
      ...     by=["Channel", "Direction"],
      ...     title="Cumulative Positives by Channel"
      ... )


   .. py:method:: performance_volume_distribution(*, by: str | list[str] | None = None, query: pdstools.utils.types.QUERY | None = None, bin_width: int = 3, title: str | None = None, return_df: bool = False) -> Figure | polars.LazyFrame

      Generate a performance vs volume distribution chart.

      Shows how response volume is distributed across different model performance
      ranges. Helps identify if volume is driven by high-performing or low-performing
      models. Ideally, most volume should be in the 60-80 AUC range.

      :param by: Column(s) to group by for separate curves
                 (e.g., "Channel" or ["Channel", "Direction"]).
                 If None, creates a single curve for all data.
      :type by: str | list[str], optional
      :param query: Optional query to filter the data before analysis
      :type query: QUERY, optional
      :param bin_width: Width of performance bins in AUC points
                        (default creates bins of 3: 50-53, 53-56, etc.)
      :type bin_width: int, default 3
      :param title: Chart title. If None, uses "Performance vs Volume"
      :type title: str, optional
      :param return_df: If True, return the binned data instead of the figure
      :type return_df: bool, default False

      :returns: Plotly figure showing performance distribution, or LazyFrame if return_df=True
      :rtype: Figure | pl.LazyFrame

      .. rubric:: Notes

      Performance is binned from 50-100 using the specified bin_width. The chart
      shows what percentage of responses fall into each performance bin, grouped
      by the `by` parameter if provided.

      .. rubric:: Examples

      >>> # Single curve for all channels
      >>> fig = datamart.plot.performance_volume_distribution()

      >>> # Separate curves per channel
      >>> fig = datamart.plot.performance_volume_distribution(
      ...     by=["Channel", "Direction"],
      ...     title="Performance Distribution by Channel"
      ... )


   .. py:method:: tree_map(metric: Literal['ResponseCount', 'Positives', 'Performance', 'SuccessRate', 'percentage_without_responses'] = 'Performance', *, by: str = 'Name', query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Generate a tree map visualization showing hierarchical model metrics.

      :param metric: The metric to visualize in the tree map, by default "Performance"
      :type metric: Literal["ResponseCount", "Positives", "Performance", "SuccessRate", "percentage_without_responses"], optional
      :param by: Column to group by for the tree map hierarchy, by default "Name"
      :type by: str, optional
      :param query: Optional query to filter the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly treemap figure or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]


   .. py:method:: predictor_count(*, by: str | list[str] = ['EntryType', 'Type'], query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Generate a box plot showing the distribution of predictor counts by type.

      :param by: Column(s) to group predictors by, by default ["EntryType", "Type"]
      :type by: Union[str, list[str]], optional
      :param query: Optional query to filter the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly box plot figure or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]


   .. py:method:: binning_lift(model_id: str, predictor_name: str, *, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Generate a binning lift plot for a specific predictor showing propensity lift per bin.

      :param model_id: The ID of the model containing the predictor
      :type model_id: str
      :param predictor_name: Name of the predictor to analyze for lift
      :type predictor_name: str
      :param query: Optional query to filter the predictor data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly bar chart showing binning lift or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]


   .. py:method:: action_overlap(group_col: str | list[str] | polars.Expr = 'Channel', overlap_col='Name', *, show_fraction=True, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)

      Generate an overlap matrix heatmap showing shared actions across different groups.

      :param group_col: Column(s) to group by for overlap analysis, by default "Channel"
      :type group_col: Union[str, list[str], pl.Expr], optional
      :param overlap_col: Column containing values to analyze for overlap, by default "Name"
      :type overlap_col: str, optional
      :param show_fraction: Whether to show overlap as fraction or absolute count, by default True
      :type show_fraction: bool, optional
      :param query: Optional query to filter the data, by default None
      :type query: Optional[QUERY], optional
      :param return_df: Whether to return a dataframe instead of a plot, by default False
      :type return_df: bool, optional

      :returns: Plotly heatmap showing action overlap or DataFrame if return_df=True
      :rtype: Union[Figure, pl.LazyFrame]


   .. py:method:: partitioned_plot(func: collections.abc.Callable, facets: list[dict[str, str | None]], show_plots: bool = True, *args, **kwargs)

      Execute a plotting function across multiple faceted subsets of data.

      This method applies a given plotting function to multiple filtered subsets of data,
      where each subset is defined by the facet conditions. It is useful for generating
      multiple plots with different filter conditions applied.

      :param func: The plotting function to execute for each facet
      :type func: Callable
      :param facets: List of dictionaries defining filter conditions for each facet
      :type facets: list[dict[str, Optional[str]]]
      :param show_plots: Whether to display the plots as they are generated, by default True
      :type show_plots: bool, optional
      :param \*args: Additional positional arguments to pass to the plotting function
      :type \*args: tuple
      :param \*\*kwargs: Additional keyword arguments to pass to the plotting function
      :type \*\*kwargs: dict

      :returns: List of Plotly figures, one for each facet condition
      :rtype: list[Figure]
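
The faceting pattern that ``partitioned_plot`` describes can be sketched in plain Python. The example below is only an illustration and is not the pdstools implementation: ``run_per_facet``, ``total_responses``, the sample rows, and the rule that a ``None`` facet value means "no filter on that column" are all assumptions made for this sketch.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any

Row = dict[str, Any]


def run_per_facet(
    func: Callable[..., Any],
    facets: list[dict[str, str | None]],
    rows: list[Row],
    *args: Any,
    **kwargs: Any,
) -> list[Any]:
    """Call ``func`` once per facet, on the rows matching that facet.

    Hypothetical helper. Treating a ``None`` value as "do not filter on
    this column" is an assumption of this sketch, not documented behavior.
    """
    results = []
    for facet in facets:
        # Keep only the conditions that actually constrain a column.
        active = {col: val for col, val in facet.items() if val is not None}
        subset = [
            row
            for row in rows
            if all(row.get(col) == val for col, val in active.items())
        ]
        # Extra positional and keyword arguments are forwarded unchanged.
        results.append(func(subset, *args, **kwargs))
    return results


def total_responses(rows: list[Row], scale: int = 1) -> int:
    """Stand-in for a plotting function: sums a column instead of drawing."""
    return scale * sum(row["ResponseCount"] for row in rows)


# Made-up sample data, for illustration only.
rows = [
    {"Channel": "Web", "Direction": "Inbound", "ResponseCount": 120},
    {"Channel": "Web", "Direction": "Outbound", "ResponseCount": 30},
    {"Channel": "Email", "Direction": "Outbound", "ResponseCount": 50},
]
facets = [
    {"Channel": "Web", "Direction": None},
    {"Channel": "Email", "Direction": "Outbound"},
    {"Channel": "Web", "Direction": "Inbound"},
]

totals = run_per_facet(total_responses, facets, rows)
print(totals)
```

Swapping ``total_responses`` for a function that builds a figure would give one result per facet, which mirrors the ``list[Figure]`` return type documented for ``partitioned_plot``.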