pdstools.plots.plot_base

Module Contents

Classes

Plots

Base plotting class

Functions

plotBinningLift(→ Union[polars.DataFrame, ...)

class Plots

Base plotting class

hasModels

A flag indicating whether the object has model data.

Type:

bool

hasPredictorBinning

A flag indicating whether the object has predictor data.

Type:

bool

hasCombined

A flag indicating whether the object has combined data.

Type:

bool

AvailableVisualisations

A dataframe with available visualizations and whether they require model data, predictor data, or multiple snapshots.

Type:

pl.DataFrame

import_strategy

Whether to import the file fully into memory or to scan the file lazily. When the data fits into memory, ‘eager’ is typically more efficient. However, when the data does not fit, the lazy methods typically still allow you to use the data.

Type:

str

property AvailableVisualisations
property ApplicableVisualisations
plotApplicable()
static top_n(df: polars.DataFrame, top_n: int, to_plot: str = 'PerformanceBin', facets: list | None = None)

Subsets DataFrame to contain only top_n predictors.

Parameters:
  • df (pl.DataFrame) – Table to subset

  • top_n (int) – Number of top predictors

  • to_plot (str) – Metric to use for comparing predictors

  • facets (list) – Subsets the top_n predictors over facets, selecting separate top predictors for each facet

Returns:

Subsetted dataframe

Return type:

pl.DataFrame
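
The subsetting described above can be sketched in plain Python on a list of row dictionaries. This is only an illustration of the idea, not the pdstools implementation: the helper name top_n_rows, ranking by the plain mean of the metric, and treating top_n < 1 as "keep everything" are all assumptions made for the sketch.

```python
from collections import defaultdict
from statistics import mean


def top_n_rows(rows, top_n, to_plot="PerformanceBin", facets=None):
    """Keep only rows whose predictor ranks in the top_n by mean `to_plot`.

    Hypothetical helper: ranking by the plain mean is an assumption.
    """
    if top_n < 1:
        return list(rows)  # assumption: no subsetting requested
    facets = facets or []
    # Collect the metric values per (facet values, predictor).
    scores = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in facets)
        scores[(key, row["PredictorName"])].append(row[to_plot])
    # Rank predictors separately within each facet group.
    per_group = defaultdict(list)
    for (key, predictor), values in scores.items():
        per_group[key].append((mean(values), predictor))
    keep = set()
    for key, ranked in per_group.items():
        ranked.sort(reverse=True)
        keep.update((key, predictor) for _, predictor in ranked[:top_n])
    return [
        row for row in rows
        if (tuple(row[f] for f in facets), row["PredictorName"]) in keep
    ]


rows = [
    {"PredictorName": "Age", "Channel": "Web", "PerformanceBin": 0.70},
    {"PredictorName": "Income", "Channel": "Web", "PerformanceBin": 0.60},
    {"PredictorName": "Age", "Channel": "Email", "PerformanceBin": 0.58},
    {"PredictorName": "Income", "Channel": "Email", "PerformanceBin": 0.65},
]
overall = top_n_rows(rows, top_n=1)                          # one winner overall
per_channel = top_n_rows(rows, top_n=1, facets=["Channel"])  # one winner per Channel
```

Without facets a single predictor is kept across all rows; with facets=["Channel"] each channel keeps its own top predictor, which is the "separate top predictors for each facet" behaviour.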

_subset_data(table: str, required_columns: set, query: str | Dict[str, list] = None, multi_snapshot: bool = False, last: bool = False, facets: str | list = None, active_only: bool = False, include_cols: list | None = None) polars.DataFrame | List[str]

Retrieves and subsets the data and performs some assertion checks

Parameters:
  • table (str) – Which table to retrieve from the ADMDatamart object (modelData, predictorData or combinedData)

  • required_columns (set) – Which columns we want to use for the visualisation. Asserts those columns are in the data, and returns only those columns for efficiency. By default, the context keys are added as required columns.

  • query (Union[pl.Expr, str, Dict[str, list]], default = None) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • last (bool, default = False) – Whether to subset on just the last known value for each model ID/predictor/bin

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

  • active_only (bool, default = False) – Whether to subset on just the active predictors

  • include_cols (Optional[list]) – Extra columns to include in the subsetting

  • multi_snapshot (bool)

Returns:

The subsetted dataframe and the generated facet column names

Return type:

Union[pl.DataFrame, List[str]]

_generateFacets(df: pdstools.utils.types.any_frame, facets: str | List[str] = None) list

Generates a list of facets based on the given dataframe and facet columns.

Given a string with column names joined by a forward slash (/), the function generates that combined column, adds it to the dataframe, and returns the new dataframe together with the generated column’s name.

Parameters:
  • df (pl.DataFrame | pl.LazyFrame) – The input dataframe for which the facets are to be generated.

  • facets (Union[str, list], default = None) – By which columns to facet the plots. If string, facets it by just that one column. If list, facets it by every element of the list. If a string contains a /, it will combine those columns as one facet.

Returns:

  • DataFrame – The input dataframe with additional facet columns added.

  • Union[str, list], default = None – The generated facets

Return type:

list

Examples

>>> df, facets = _generateFacets(df, "Configuration")
    Creates a plot for each Configuration
>>> df, facets = _generateFacets(df, ["Channel", "Direction"])
    Creates a plot for each Channel and for each Direction as separate facets
>>> df, facets = _generateFacets(df, "Channel/Configuration")
    Creates a plot for each combination of Channel and Configuration
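
The three input forms can be illustrated with a small plain-Python sketch on row dictionaries. It is not the library code: the name generate_facets, the empty list returned for facets=None, and joining the combined values with "/" are assumptions made for illustration.

```python
def generate_facets(rows, facets=None):
    """Return (rows, facet_names), adding one combined column per 'A/B' facet.

    Hypothetical stand-in for the behaviour described above.
    """
    if facets is None:
        return rows, []            # assumption: no facets requested
    if isinstance(facets, str):
        facets = [facets]          # a single column name becomes a one-item list
    out = [dict(row) for row in rows]  # copy so the input rows stay untouched
    for facet in facets:
        if "/" in facet:
            parts = facet.split("/")
            for row in out:
                # Build the combined column from its component columns.
                row[facet] = "/".join(str(row[p]) for p in parts)
    return out, list(facets)


rows = [
    {"Channel": "Web", "Configuration": "Sales"},
    {"Channel": "Email", "Configuration": "Retention"},
]
single, single_names = generate_facets(rows, "Configuration")
several, several_names = generate_facets(rows, ["Channel", "Configuration"])
combined, combined_names = generate_facets(rows, "Channel/Configuration")
```

Only the last call adds a column: each row gains a "Channel/Configuration" entry such as "Web/Sales", and that generated name is what is returned as the facet.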
static facettedPlot(facets: list | None, plotFunc: Any, partition: bool = False, *args, **kwargs)

Takes care of facetting the plots.

If partition is True, generates a new dataframe for each plot. If partition is False, simply passes the facet as the facet argument.

In effect, this means that partition = False gives a ‘plotly-native’ facet, while partition = True gives a distinct plot for every facet.

Parameters:
  • facets (Optional[list]) – If no facet is supplied, we just return the plot. Otherwise, we loop through each facet and create the plot

  • plotFunc (Any) – The original function to create the plot. The plot is simply passed through to this function, along with all arguments

  • partition (bool, default=False) – If True, generates a new dataframe for each plot. If False, simply passes the facet as the facet argument

  • *args – Any additional arguments, depending on the plotFunc

Keyword Arguments:
  • order (dict) – The order of categories, for each facet

  • **kwargs – Any additional keyword arguments, depending on the plotFunc
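
The difference between the two partition settings can be sketched as follows. This is a simplified plain-Python illustration under stated assumptions: the function facetted_plot, its rows argument, and the stand-in plotting function describe are hypothetical, and the real method works on dataframes and plotting classes rather than lists and strings.

```python
def facetted_plot(rows, facets, plot_func, partition=False, **kwargs):
    """Dispatch to plot_func once per facet, or once per facet value."""
    if not facets:
        # No facet supplied: just return the single plot.
        return plot_func(rows, facet=None, **kwargs)
    figures = []
    for facet in facets:
        if partition:
            # One call per distinct facet value, each on its own subset.
            for value in sorted({row[facet] for row in rows}):
                subset = [row for row in rows if row[facet] == value]
                figures.append(plot_func(subset, facet=None, **kwargs))
        else:
            # One call per facet column; the plot function facets internally.
            figures.append(plot_func(rows, facet=facet, **kwargs))
    return figures[0] if len(figures) == 1 else figures


def describe(rows, facet=None):
    """Stand-in for a real plotting function: returns a summary string."""
    return f"{len(rows)} rows, facet={facet}"


rows = [{"Channel": "Web"}, {"Channel": "Web"}, {"Channel": "Email"}]
native = facetted_plot(rows, ["Channel"], describe, partition=False)
split = facetted_plot(rows, ["Channel"], describe, partition=True)
```

With partition=False the plotting function receives all rows once and is told which column to facet by; with partition=True it is called once per channel on the matching subset.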

plotPerformanceSuccessRateBubbleChart(last: bool = True, add_bottom_left_text: bool = True, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget

Creates bubble chart similar to ADM OOTB.

Parameters:
  • last (bool, default = True) – Whether to only look at the last snapshot (recommended)

  • add_bottom_left_text (bool, default = True) – Whether to display how many models are in the bottom left of the chart, in other words, how many have no performance and no success rate

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • round (int, default = 5) – To how many digits to round the hover data

  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotOverTime(metric: str = 'Performance', by: str = 'ModelID', every: int = '1d', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, mode: str = 'Diff', **kwargs) plotly.graph_objs.FigureWidget

Plots a given metric over time

Parameters:
  • metric (str, default = Performance) – The metric to plot over time. One of the following: {ResponseCount, Performance, SuccessRate, Positives, weighted_performance}

  • by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}

  • every (int, default = 1d) – How often to consider the metrics

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

  • mode (str, default = Diff) – The plotting mode. Should be one of the following:
      - ‘Diff’: Plot differences over a specified period.
      - ‘Cumulative’: Plot the values as is, as a time series.

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
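
The two modes can be illustrated on a plain list of snapshot values. This is a minimal sketch, assuming that ‘Diff’ means the change between consecutive periods and ‘Cumulative’ means the values unchanged; the function name over_time is hypothetical, and the grouping by the every and by parameters is left out.

```python
def over_time(values, mode="Diff"):
    """Return the series to plot for one model under the given mode."""
    if mode == "Cumulative":
        # Plot the snapshot values as they are.
        return list(values)
    if mode == "Diff":
        # Plot the change between each pair of consecutive snapshots.
        return [later - earlier for earlier, later in zip(values, values[1:])]
    raise ValueError(f"Unknown mode: {mode!r}")


positives = [100, 130, 170, 175]  # cumulative Positives per snapshot
as_is = over_time(positives, mode="Cumulative")
changes = over_time(positives, mode="Diff")
```

For a counter-like metric such as Positives, the differences show how much was added in each period, whereas the cumulative view shows the running total.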

plotPropositionSuccessRates(metric: str = 'SuccessRate', by: str = 'Name', show_error: bool = True, top_n=0, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget

Plots all latest proposition success rates

Parameters:
  • metric (str, default = SuccessRate) – Can be changed to plot a different metric

  • by (str, default = Name) – What variable to group the data by. One of {ModelID, Name}

  • show_error (bool, default = True) – Whether to show error bars in the bar plots

  • top_n (int, default = 0) – The number of rows to include in the pivoted DataFrame. If set to 0, all rows are included.

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotScoreDistribution(by: str = 'ModelID', *, show_zero_responses: bool = False, modelids: List | None = None, query: polars.Expr | str | Dict[str, list] | None = None, show_each=False, **kwargs) plotly.graph_objs.FigureWidget

Plots the score distribution, similar to OOTB

Parameters:
  • by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}

Keyword Arguments:
  • show_zero_responses (bool, default = False) – Whether to include bins with no responses at all

  • modelids (Optional[List], default = None) – Models to plot for. If multiple ids are given, returns a list of Plots for each model

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • show_each (bool) – Whether to show each file when multiple facets are used

  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorBinning(predictors: list = None, modelids: list = None, show_each=False, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget

Plots the binning of given predictors

Parameters:
  • predictors (list, default = None) – An optional list of predictors to plot the bins for. Useful for plotting one or more variables over multiple models

  • modelids (list, default = None) – An optional list of model ids to plot the predictors for

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorPerformance(top_n: int = 0, active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget

Plots a bar chart of the performance of the predictors

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset. Picks the top n predictors with the highest weighted average Performance across models and then sorts the predictors according to the median value.

Parameters:
  • top_n (int, default = 0) – How many of the top predictors to show in the plot

  • active_only (bool, default = False) – Whether to only plot active predictors

  • to_plot (str, default = Performance) – Metric to compare predictors

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorCategoryPerformance(active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget

Plots a bar chart of the performance of the predictor categories

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.

Parameters:
  • active_only (bool, default = False) – Whether to only plot active predictors

  • to_plot (str, default = Performance) – Metric to compare predictor categories

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • separate (bool) – If set to True, the dataset is subsetted using the facet column, creating separate plots

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorContribution(by: str = 'Configuration', query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget

Plots the contribution of each predictor across a group

Parameters:
  • by (str, default = Configuration) – The column to group the bars with

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

Keyword Arguments:
  • predictorCategorization (pl.Expr) – An optional override for the predictor categorization function

  • plotting_engine (str) – This chart is only supported in plotly

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.

plotPredictorPerformanceHeatmap(top_n: int = 20, by='Name', active_only: bool = False, query: polars.Expr | str | Dict[str, list] | None = None, facets: list = None, **kwargs) plotly.graph_objs.FigureWidget

Plots heatmap of the performance of the predictors

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.

Parameters:
  • top_n (int, default = 20) – How many of the top predictors to show in the plot

  • by (str, default = Name) – The column to use on the x axis of the heatmap

  • active_only (bool, default = False) – Whether to only plot active predictors

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – ‘plotly’ or a custom plot class

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotResponseGain(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) plotly.graph_objs.FigureWidget

Plots the cumulative response per model

Parameters:
  • by (str, default = Channel) – The column by which to calculate response gain. Default is Channel, to see the response/gain chart per channel

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – This chart is only supported in plotly

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
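
As background for reading this chart, a generic cumulative gain curve can be computed as in the sketch below. It is an assumption that the plot follows this standard construction (models sorted by response count, cumulative share of responses against share of models); the function name response_gain is hypothetical and the grouping by the by column is omitted.

```python
def response_gain(response_counts):
    """Return (share of models, cumulative share of responses) points."""
    ordered = sorted(response_counts, reverse=True)  # largest models first
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(ordered, start=1):
        running += count
        points.append((i / len(ordered), running / total))
    return points


curve = response_gain([10, 60, 30])  # response counts of three models
```

A curve that rises steeply at the start indicates that a small share of the models accounts for most of the responses.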

plotModelsByPositives(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) plotly.graph_objs.FigureWidget

Plots the percentage of models vs the number of positive responses

Parameters:
  • by (str, default = Channel) – The column to calculate the model percentage by

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • plotting_engine (str) – This chart is only supported in plotly

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.

plotTreeMap(color_var: str = 'performance_weighted', by: str = 'ModelID', value_in_text: bool = True, midpoint: float | None = None, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget

Plots a treemap to view performance over multiple context keys

Parameters:
  • color_var (str, default = performance_weighted) – The column to set as the color of the squares. One of: {responsecount, responsecount_log, positives, positives_log, percentage_without_responses, performance_weighted, successrate}

  • by (str, default = ModelID) – The column to use as the size of the squares

  • value_in_text (bool, default = True) – Whether to print the values of the squares in the squares

  • midpoint (Optional[float]) – A parameter to exert more control over the color distribution. Set near 0 to give lower values a ‘higher’ color. Set near 1 to give higher values a ‘lower’ color. Necessary for, for example, Success Rate, where rates lie very far apart. If not supplied in such cases, there is no difference in the color between low values such as 0.001 and 0.1, so midpoint should be set low

  • query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

  • facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:
  • colorscale (list) – Give a list of hex values to override the default colors. Should consist of three colors: ‘low’, ‘neutral’ and ‘high’

  • plotting_engine (str) – This chart is only supported in plotly

  • return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
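
The interplay of midpoint and colorscale can be pictured with a small sketch. Plotly accepts a continuous color scale as a list of [position, color] pairs with positions between 0 and 1; whether pdstools builds its scale exactly like this is an assumption, and the function name and hex colors below are purely illustrative.

```python
def three_point_colorscale(colors, midpoint=0.5):
    """Place the 'neutral' color at `midpoint` on a 0..1 color axis."""
    low, neutral, high = colors  # expects exactly three colors
    if not 0 < midpoint < 1:
        raise ValueError("midpoint must lie strictly between 0 and 1")
    return [[0.0, low], [midpoint, neutral], [1.0, high]]


# Illustrative colors only; a low midpoint stretches the upper color range
# over most of the axis, so small values are easier to tell apart.
scale = three_point_colorscale(["#d91c29", "#f5f5f5", "#0b6e4f"], midpoint=0.1)
```

With midpoint=0.1 the neutral color is already reached at 10% of the value range, so values near the bottom of the range get visibly different colors.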

plotPredictorCount(facets: str | list, query: polars.Expr | str | Dict[str, list] | None = None, by: str = 'Type', **kwargs)
Parameters:
  • facets (Union[str, list])

  • query (Optional[Union[polars.Expr, str, Dict[str, list]]])

  • by (str)

plotBinningLift(binning, col_facet=None, row_facet=None, custom_data=['PredictorName', 'BinSymbol'], return_df=False) polars.DataFrame | plotly.graph_objects.Figure
Return type:

Union[polars.DataFrame, plotly.graph_objects.Figure]