`pdstools.plots.plot_base`¶

Module Contents¶

Classes¶

Plots

Base plotting class

Functions¶

plotBinningLift(→ Union[polars.DataFrame, ...)

class Plots¶

Base plotting class

hasModels¶

A flag indicating whether the object has model data.

Type:: bool

hasPredictorBinning¶

A flag indicating whether the object has predictor data.

Type:: bool

hasCombined¶

A flag indicating whether the object has combined data.

Type:: bool

AvailableVisualisations¶

A dataframe with available visualizations and whether they require model data, predictor data, or multiple snapshots.

Type:: pl.DataFrame

import_strategy¶

Whether to import the file fully into memory or to scan the file lazily. When the data fits into memory, ‘eager’ is typically more efficient. When it does not fit, the lazy methods typically still allow you to use the data.

Type:: str

property AvailableVisualisations¶

property ApplicableVisualisations¶

plotApplicable()¶

static top_n(df: polars.DataFrame, top_n: int, to_plot: str = 'PerformanceBin', facets: list | None = None)¶

Subsets DataFrame to contain only top_n predictors.

Parameters:

df (pl.DataFrame) – Table to subset
top_n (int) – Number of top predictors
to_plot (str) – Metric to use for comparing predictors
facets (list) – Facets over which to subset. The top_n predictors are selected separately for each facet

Returns:

Subsetted dataframe

Return type:

pl.DataFrame
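
As a rough illustration of the per-facet subsetting described above, here is a pure-Python sketch. It is not the library's implementation: the helper name `top_n_per_facet`, the list-of-dicts row layout, the `PredictorName` key, and ranking by a plain mean are all assumptions made for this example, whereas the real method works on a polars DataFrame.

```python
from collections import defaultdict
from statistics import mean


def top_n_per_facet(rows, top_n, to_plot="PerformanceBin", facets=None):
    """Keep the rows of the top_n predictors (by mean of `to_plot`) in each facet group.

    Hypothetical helper for illustration only; `PredictorName` is an assumed key.
    """
    facets = facets or []
    # Collect the metric values per (facet values, predictor) pair.
    scores = defaultdict(list)
    for row in rows:
        group = tuple(row[f] for f in facets)
        scores[(group, row["PredictorName"])].append(row[to_plot])
    # Rank the predictors inside each facet group by their mean metric.
    ranked_by_group = defaultdict(list)
    for (group, predictor), values in scores.items():
        ranked_by_group[group].append((mean(values), predictor))
    keep = set()
    for group, ranked in ranked_by_group.items():
        ranked.sort(reverse=True)
        keep.update((group, predictor) for _, predictor in ranked[:top_n])
    # Return the original rows, in their original order, that survived the cut.
    return [
        row
        for row in rows
        if (tuple(row[f] for f in facets), row["PredictorName"]) in keep
    ]


rows = [
    {"Channel": "Web", "PredictorName": "Age", "PerformanceBin": 0.70},
    {"Channel": "Web", "PredictorName": "Income", "PerformanceBin": 0.55},
    {"Channel": "Email", "PredictorName": "Age", "PerformanceBin": 0.52},
    {"Channel": "Email", "PredictorName": "Income", "PerformanceBin": 0.66},
]
subset = top_n_per_facet(rows, top_n=1, facets=["Channel"])
```

With these rows, each channel keeps a different top predictor, which is the point of ranking per facet rather than globally.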

_subset_data(table: str, required_columns: set, query: str | Dict[str, list] = None, multi_snapshot: bool = False, last: bool = False, facets: str | list = None, active_only: bool = False, include_cols: list | None = None) → polars.DataFrame | List[str]¶

Retrieves and subsets the data and performs some assertion checks

Parameters:

table (str) – Which table to retrieve from the ADMDatamart object (modelData, predictorData or combinedData)
required_columns (set) – Which columns we want to use for the visualisation. Asserts that those columns are in the data, and returns only those columns for efficiency. By default, the context keys are added as required columns.
query (Union[pl.Expr, str, Dict[str, list]], default = None) – Please refer to pdstools.adm.ADMDatamart._apply_query()
last (bool, default = False) – Whether to subset on just the last known value for each model ID/predictor/bin
facets (Union[str, list], default = None) – Please refer to _generateFacets()
active_only (bool, default = False) – Whether to subset on just the active predictors
include_cols (Optional[list]) – Extra columns to include in the subsetting
multi_snapshot (bool)

Returns:

The subsetted dataframe, together with the generated facet column name

Return type:

Union[pl.DataFrame, List[str]]
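
To make the `last` option concrete, the following sketch keeps only the most recent row per model. It is an illustration under stated assumptions, not the method's actual code: the helper name `last_snapshot`, the `ModelID` and `SnapshotTime` keys, and the use of ISO date strings are chosen for this example only.

```python
def last_snapshot(rows, key="ModelID", time_col="SnapshotTime"):
    """Return one row per `key`: the row with the greatest `time_col` value.

    Hypothetical helper; the column names are assumptions for this sketch.
    """
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # ISO-formatted date strings compare correctly as plain strings.
        if current is None or row[time_col] > current[time_col]:
            latest[row[key]] = row
    return list(latest.values())


snapshots = [
    {"ModelID": "A", "SnapshotTime": "2023-01-01", "Performance": 0.61},
    {"ModelID": "B", "SnapshotTime": "2023-01-01", "Performance": 0.55},
    {"ModelID": "A", "SnapshotTime": "2023-01-02", "Performance": 0.64},
]
latest_rows = last_snapshot(snapshots)
```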

_generateFacets(df: pdstools.utils.types.any_frame, facets: str | List[str] = None) → list¶

Generates a list of facets based on the given dataframe and facet columns.

Given a string with column names combined with a forward slash (/), the function generates that combined column, adds it to the dataframe, and returns the new dataframe together with the generated column’s name

Parameters:

df (pl.DataFrame | pl.LazyFrame) – The input dataframe for which the facets are to be generated.
facets (Union[str, list], default = None) – By which columns to facet the plots. If string, facets it by just that one column. If list, facets it by every element of the list. If a string contains a /, it will combine those columns as one facet.

Returns:

DataFrame – The input dataframe with additional facet columns added.
Union[str, list], default = None – The generated facets

Return type:

list

Examples

>>> df, facets = _generateFacets(df, "Configuration")
    Creates a plot for each Configuration
>>> df, facets = _generateFacets(df, ["Channel", "Direction"])
    Creates a plot for each Channel and for each Direction as separate facets
>>> df, facets = _generateFacets(df, "Channel/Configuration")
    Creates a plot for each combination of Channel and Configuration
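
The three cases in the examples above can be sketched in plain Python as follows. This is only an approximation of the documented behaviour: the function name `generate_facets_sketch`, the list-of-dicts input, and returning an empty list when no facets are given are assumptions, and the real function operates on polars frames.

```python
def generate_facets_sketch(rows, facets=None):
    """Normalise a facet spec and add combined 'A/B' columns where requested.

    Hypothetical stand-in for illustration; not the pdstools implementation.
    """
    if facets is None:
        return rows, []  # assumption: no facets requested means nothing to add
    if isinstance(facets, str):
        facets = [facets]  # a single column name becomes a one-element list
    new_rows = [dict(row) for row in rows]  # copy so the input stays untouched
    for facet in facets:
        if "/" in facet:
            parts = facet.split("/")
            # Build one combined column from the listed columns.
            for row in new_rows:
                row[facet] = "/".join(str(row[p]) for p in parts)
    return new_rows, list(facets)


rows = [{"Channel": "Web", "Configuration": "OmniAdaptiveModel"}]
new_rows, names = generate_facets_sketch(rows, "Channel/Configuration")
```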

static facettedPlot(facets: list | None, plotFunc: Any, partition: bool = False, *args, **kwargs)¶

Takes care of facetting the plots.

If partition is True, generates a new dataframe for each plot. If partition is False, simply gives the facet as the facet argument.

In effect, this means that partition = False gives a ‘plotly-native’ facet, while partition = True gives a distinct plot for every facet.

Parameters:

facets (Optional[list]) – If no facet is supplied, we just return the plot. Otherwise, we loop through each facet and create the plot.
plotFunc (Any) – The original function to create the plot. The plot is simply passed through to this function, along with all arguments.
partition (bool, default=False) – If True, generates a new dataframe for each plot. If False, simply gives the facet as the facet argument.
*args – Any additional arguments, depending on the plotFunc

Keyword Arguments:

order (dict) – The order of categories, for each facet
**kwargs – Any additional keyword arguments, depending on the plotFunc
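
The difference between the two partition settings can be sketched as below. This is a simplified illustration, not the method's source: `facetted_plot_sketch` and `describe` are hypothetical names, the data is a list of dicts, and the stand-in plot function returns a tuple instead of a figure.

```python
def facetted_plot_sketch(rows, facets, plot_func, partition=False, **kwargs):
    """Dispatch to `plot_func` once per facet, or once per facet value when partitioning."""
    if not facets:
        # No facets: a single, unfacetted plot.
        return plot_func(rows, facet=None, **kwargs)
    figures = []
    for facet in facets:
        if partition:
            # One separate plot per distinct value, each on its own data subset.
            for value in dict.fromkeys(row[facet] for row in rows):
                part = [row for row in rows if row[facet] == value]
                figures.append(plot_func(part, facet=None, **kwargs))
        else:
            # Hand the facet column to the plotting function and let it facet.
            figures.append(plot_func(rows, facet=facet, **kwargs))
    return figures[0] if len(figures) == 1 else figures


def describe(rows, facet=None):
    """Stand-in for a plotting function: reports what it was asked to draw."""
    return (len(rows), facet)


rows = [{"Channel": "Web"}, {"Channel": "Email"}, {"Channel": "Web"}]
native = facetted_plot_sketch(rows, ["Channel"], describe)
separate = facetted_plot_sketch(rows, ["Channel"], describe, partition=True)
```

Here `native` is one call that receives all rows plus the facet name, while `separate` is a list with one result per channel.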

plotPerformanceSuccessRateBubbleChart(last: bool = True, add_bottom_left_text: bool = True, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Creates bubble chart similar to ADM OOTB.

Parameters:

last (bool, default = True) – Whether to only look at the last snapshot (recommended)
add_bottom_left_text (bool, default = True) – Whether to display how many models are in the bottom left of the chart, in other words, how many have no performance and no success rate
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

round (int, default = 5) – To how many digits to round the hover data
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotOverTime(metric: str = 'Performance', by: str = 'ModelID', every: int = '1d', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, mode: str = 'Diff', **kwargs) → plotly.graph_objs.FigureWidget¶

Plots a given metric over time

Parameters:

metric (str, default = Performance) – The metric to plot over time. One of the following: {ResponseCount, Performance, SuccessRate, Positives, weighted_performance}
by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}
every (int, default = 1d) – How often to consider the metrics
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
mode (str, default = Diff) – The plotting mode. One of ‘Diff’, which plots differences over a specified period, or ‘Cumulative’, which plots the time series of the values as is.

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
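
The two `mode` values documented for this method can be illustrated with a small sketch. It shows only the arithmetic implied by the descriptions, under the assumption that the input is a list of metric values ordered by snapshot; the function name `metric_over_time` is hypothetical and this is not the library's code.

```python
def metric_over_time(values, mode="Diff"):
    """Return the series to plot for a list of per-snapshot metric values.

    'Cumulative' keeps the values as is; 'Diff' takes differences between
    consecutive snapshots. Hypothetical helper for illustration only.
    """
    if mode == "Cumulative":
        return list(values)
    if mode == "Diff":
        return [later - earlier for earlier, later in zip(values, values[1:])]
    raise ValueError(f"Unknown mode: {mode!r}")


response_counts = [100, 150, 230]  # e.g. ResponseCount on three snapshots
as_is = metric_over_time(response_counts, mode="Cumulative")
per_period = metric_over_time(response_counts, mode="Diff")
```

Note that the differenced series is one element shorter than the input, since the first snapshot has nothing to be compared against.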

plotPropositionSuccessRates(metric: str = 'SuccessRate', by: str = 'Name', show_error: bool = True, top_n=0, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots all latest proposition success rates

Parameters:

metric (str, default = SuccessRate) – Can be changed to plot a different metric
by (str, default = Name) – What variable to group the data by. One of {ModelID, Name}
show_error (bool, default = True) – Whether to show error bars in the bar plots
top_n (int, default = 0) – The number of rows to include in the pivoted DataFrame. If set to 0, all rows are included.
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotScoreDistribution(by: str = 'ModelID', *, show_zero_responses: bool = False, modelids: List | None = None, query: polars.Expr | str | Dict[str, list] | None = None, show_each=False, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots the score distribution, similar to OOTB

Parameters:

by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}
show_zero_responses (bool)
modelids (Optional[List])
query (Optional[Union[polars.Expr, str, Dict[str, list]]])

Keyword Arguments:

show_zero_responses (bool, default = False) – Whether to include bins with no responses at all
modelids (Optional[List], default = None) – Models to plot for. If multiple IDs are given, returns a list of plots, one for each model
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
show_each (bool) – Whether to show each file when multiple facets are used
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorBinning(predictors: list = None, modelids: list = None, show_each=False, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots the binning of given predictors

Parameters:

predictors (list, default = None) – An optional list of predictors to plot the bins for. Useful for plotting one or more variables over multiple models
modelids (list, default = None) – An optional list of model ids to plot the predictors for
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorPerformance(top_n: int = 0, active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots a bar chart of the performance of the predictors

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset. Picks the top n predictors with the highest weighted average Performance across models and then sorts the predictors according to the median value.

Parameters:

top_n (int, default = 0) – How many of the top predictors to show in the plot
active_only (bool, default = False) – Whether to only plot active predictors
to_plot (str, default = Performance) – Metric to compare predictors
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorCategoryPerformance(active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots a bar chart of the performance of the predictor categories

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.

Parameters:

active_only (bool, default = False) – Whether to only plot active predictors
to_plot (str, default = Performance) – Metric to compare predictor categories
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
separate (bool) – If set to True, the dataset is subsetted using the facet column, creating separate plots
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotPredictorContribution(by: str = 'Configuration', query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots the contribution of each predictor across a group

Parameters:

by (str, default = Configuration) – The column to group the bars with
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()

Keyword Arguments:

predictorCategorization (pl.Expr) – An optional override for the predictor categorization function
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.

plotPredictorPerformanceHeatmap(top_n: int = 20, by='Name', active_only: bool = False, query: polars.Expr | str | Dict[str, list] | None = None, facets: list = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots heatmap of the performance of the predictors

By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.

Parameters:

top_n (int, default = 20) – How many of the top predictors to show in the plot
by (str, default = Name) – The column to use at the x axis of the heatmap
active_only (bool, default = False) – Whether to only plot active predictors
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

Notes

See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.

plotResponseGain(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots the cumulative response per model

Parameters:

by (str, default = Channel) – The column by which to calculate response gain. Default is Channel, to see the response/gain chart per channel
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget
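
For readers unfamiliar with gain charts, the sketch below shows one common way to compute a cumulative response curve: order the models by response count and accumulate their share of all responses. It is a generic illustration, assuming a plain list of per-model response counts; the name `response_gain` is hypothetical and the library's own calculation may differ.

```python
def response_gain(response_counts):
    """Return (share of models, cumulative share of responses) points.

    Models are ordered from most to fewest responses. Generic gains-curve
    sketch; not taken from the pdstools source.
    """
    ordered = sorted(response_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return []  # no responses at all, so there is no curve to draw
    points = []
    running = 0
    for position, count in enumerate(ordered, start=1):
        running += count
        points.append((position / len(ordered), running / total))
    return points


gain_points = response_gain([10, 60, 30])
```

In this toy input, the single largest model already accounts for 60% of all responses, which is the kind of concentration such a chart makes visible.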

plotModelsByPositives(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots the percentage of models vs the number of positive responses

Parameters:

by (str, default = Channel) – The column to calculate the model percentage by
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget

plotTreeMap(color_var: str = 'performance_weighted', by: str = 'ModelID', value_in_text: bool = True, midpoint: float | None = None, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) → plotly.graph_objs.FigureWidget¶

Plots a treemap to view performance over multiple context keys

Parameters:

color_var (str, default = performance_weighted) – The column to set as the color of the squares. One of: {responsecount, responsecount_log, positives, positives_log, percentage_without_responses, performance_weighted, successrate}
by (str, default = ModelID) – The column to use as the size of the squares
value_in_text (bool, default = True) – Whether to print the values of the squares in the squares
midpoint (Optional[float]) – A parameter to exert more control over the color distribution. Set near 0 to give lower values a ‘higher’ color. Set near 1 to give higher values a ‘lower’ color. Necessary, for example, for Success Rate, where rates lie very far apart. If not supplied in such cases, there is no difference in color between low values such as 0.001 and 0.1, so midpoint should be set low
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()

Keyword Arguments:

colorscale (list) – A list of hex values to override the default colors. Should consist of three colors: ‘low’, ‘neutral’ and ‘high’
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots

Return type:

plotly.graph_objs.FigureWidget
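
One way to picture how midpoint and a three-color colorscale interact is the sketch below, which places the ‘neutral’ color at the midpoint position of a Plotly-style continuous color scale (a list of [position, color] pairs with positions between 0 and 1). Whether the library applies midpoint in exactly this way is an assumption; the function name `three_color_scale` and the hex values are made up for the example.

```python
def three_color_scale(colors, midpoint=0.5):
    """Build a [position, color] scale with the neutral color at `midpoint`.

    `colors` must hold three entries: low, neutral and high.
    Hypothetical helper; not taken from the pdstools source.
    """
    if len(colors) != 3:
        raise ValueError("Expected exactly three colors: low, neutral, high")
    if not 0 < midpoint < 1:
        raise ValueError("midpoint must lie strictly between 0 and 1")
    low, neutral, high = colors
    return [[0.0, low], [midpoint, neutral], [1.0, high]]


# A low midpoint spends most of the color range on the upper values, so
# small values such as 0.001 and 0.1 end up with visibly different colors.
scale = three_color_scale(["#d91c29", "#f76923", "#20aa50"], midpoint=0.1)
```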

plotPredictorCount(facets: str | list, query: polars.Expr | str | Dict[str, list] | None = None, by: str = 'Type', **kwargs)¶

Parameters:

facets (Union[str, list])
query (Optional[Union[polars.Expr, str, Dict[str, list]]])
by (str)

plotBinningLift(binning, col_facet=None, row_facet=None, custom_data=['PredictorName', 'BinSymbol'], return_df=False) → polars.DataFrame | plotly.graph_objects.Figure¶

Return type:: Union[polars.DataFrame, plotly.graph_objects.Figure]
