pdstools.plots.plot_base¶
Module Contents¶
Classes¶
Plots | Base plotting class
Functions¶
|
- class Plots¶
Base plotting class
- hasModels¶
A flag indicating whether the object has model data.
- Type:
bool
- hasPredictorBinning¶
A flag indicating whether the object has predictor data.
- Type:
bool
- hasCombined¶
A flag indicating whether the object has combined data.
- Type:
bool
- AvailableVisualisations¶
A dataframe with available visualizations and whether they require model data, predictor data, or multiple snapshots.
- Type:
pl.DataFrame
- import_strategy¶
Whether to import the file fully into memory or to scan the file lazily. When the data fits into memory, ‘eager’ is typically more efficient. When it does not fit, the lazy methods typically still allow you to use the data.
- Type:
str
- property AvailableVisualisations¶
- property ApplicableVisualisations¶
- plotApplicable()¶
- static top_n(df: polars.DataFrame, top_n: int, to_plot: str = 'PerformanceBin', facets: list | None = None)¶
Subsets DataFrame to contain only top_n predictors.
- Parameters:
df (pl.DataFrame) – Table to subset
top_n (int) – Number of top predictors
to_plot (str) – Metric to use for comparing predictors
facets (list) – Subsets the top_n predictors over facets, selecting separate top predictors for each facet
- Returns:
Subsetted dataframe
- Return type:
pl.DataFrame
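The per-facet selection described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `top_n_rows`, the use of a simple mean as the ranking statistic, and the treatment of a non-positive `top_n` as "keep everything" are all assumptions. The column names `PredictorName` and `PerformanceBin` are taken from this page.

```python
from collections import defaultdict


def top_n_rows(rows, top_n, to_plot="PerformanceBin", facets=None):
    """Hypothetical helper: keep rows whose predictor is in the top `top_n`.

    Predictors are ranked by the mean of `to_plot` (an assumption).
    With `facets`, the ranking is done separately per facet combination.
    """
    if top_n < 1:
        return list(rows)  # assumption: non-positive top_n keeps all rows
    facets = facets or []

    # Collect metric values per (facet key, predictor).
    values = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in facets)
        values[(key, row["PredictorName"])].append(row[to_plot])

    # Score every predictor within its facet key.
    ranked = defaultdict(list)
    for (key, predictor), vals in values.items():
        ranked[key].append((sum(vals) / len(vals), predictor))

    # Keep the best `top_n` predictors for each facet key.
    keep = set()
    for key, scored in ranked.items():
        scored.sort(reverse=True)
        keep.update((key, predictor) for _, predictor in scored[:top_n])

    return [
        row
        for row in rows
        if (tuple(row[f] for f in facets), row["PredictorName"]) in keep
    ]


rows = [
    {"Channel": "Web", "PredictorName": "Age", "PerformanceBin": 0.70},
    {"Channel": "Web", "PredictorName": "Income", "PerformanceBin": 0.55},
    {"Channel": "Email", "PredictorName": "Age", "PerformanceBin": 0.52},
    {"Channel": "Email", "PredictorName": "Income", "PerformanceBin": 0.64},
]
per_channel = top_n_rows(rows, top_n=1, facets=["Channel"])
overall = top_n_rows(rows, top_n=1)
```

With `facets=["Channel"]`, each channel keeps its own best predictor. Without facets, a single predictor is selected across all rows.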
- _subset_data(table: str, required_columns: set, query: str | Dict[str, list] = None, multi_snapshot: bool = False, last: bool = False, facets: str | list = None, active_only: bool = False, include_cols: list | None = None) polars.DataFrame | List[str] ¶
Retrieves and subsets the data and performs some assertion checks
- Parameters:
table (str) – Which table to retrieve from the ADMDatamart object (modelData, predictorData or combinedData)
required_columns (set) – Which columns we want to use for the visualisation. Asserts those columns are in the data, and returns only those columns for efficiency. By default, the context keys are added as required columns.
query (Union[pl.Expr, str, Dict[str, list]], default = None) – Please refer to pdstools.adm.ADMDatamart._apply_query()
last (bool, default = False) – Whether to subset on just the last known value for each model ID/predictor/bin
facets (Union[str, list], default = None) – Please refer to _generateFacets()
active_only (bool, default = False) – Whether to subset on just the active predictors
include_cols (Optional[list]) – Extra columns to include in the subsetting
multi_snapshot (bool)
- Returns:
The subsetted dataframe
The generated facet column name
- Return type:
Union[pl.DataFrame, List[str]]
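The two checks described above (asserting the required columns exist and optionally keeping only the last known value) can be sketched without any dependencies. This is only an illustration on a list of dicts: the helper name `subset_rows` and its `key` and `time_col` parameters are hypothetical, and `SnapshotTime` is assumed to be the column that orders the snapshots.

```python
def subset_rows(rows, required_columns, last=False, key="ModelID", time_col="SnapshotTime"):
    """Hypothetical stand-in for the checks performed by _subset_data."""
    # Assert that every required column is present in the data.
    available = set().union(*(r.keys() for r in rows)) if rows else set()
    missing = set(required_columns) - available
    assert not missing, f"Missing required columns: {sorted(missing)}"

    if last:
        # Keep only the most recent row per key (assumes time_col sorts correctly).
        latest = {}
        for row in rows:
            current = latest.get(row[key])
            if current is None or row[time_col] > current[time_col]:
                latest[row[key]] = row
        rows = list(latest.values())

    # Return only the required columns (plus the key) for efficiency.
    columns = sorted(set(required_columns) | {key})
    return [{c: row[c] for c in columns} for row in rows]


snapshots = [
    {"ModelID": "m1", "SnapshotTime": "2024-01-01", "Performance": 0.6, "Extra": 1},
    {"ModelID": "m1", "SnapshotTime": "2024-01-02", "Performance": 0.7, "Extra": 2},
    {"ModelID": "m2", "SnapshotTime": "2024-01-01", "Performance": 0.5, "Extra": 3},
]
latest_only = subset_rows(snapshots, {"Performance"}, last=True)
```

Requesting a column that is not in the data fails early with an `AssertionError` instead of failing later inside a plotting call.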
- _generateFacets(df: pdstools.utils.types.any_frame, facets: str | List[str] = None) list ¶
Generates a list of facets based on the given dataframe and facet columns.
Given a string with column names joined by a forward slash (/), the function generates that combined column, adds it to the dataframe, and returns the new dataframe together with the generated column’s name
- Parameters:
df (pl.DataFrame | pl.LazyFrame) – The input dataframe for which the facets are to be generated.
facets (Union[str, list], default = None) – By which columns to facet the plots. If string, facets it by just that one column. If list, facets it by every element of the list. If a string contains a /, it will combine those columns as one facet.
- Returns:
DataFrame – The input dataframe with additional facet columns added.
list – The generated facets
- Return type:
list
Examples
>>> df, facets = _generateFacets(df, "Configuration")
Creates a plot for each Configuration
>>> df, facets = _generateFacets(df, ["Channel", "Direction"])
Creates a plot for each Channel and for each Direction as separate facets
>>> df, facets = _generateFacets(df, "Channel/Configuration")
Creates a plot for each combination of Channel and Configuration
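The three cases in the examples can be mimicked on plain rows to show what "combining columns as one facet" means. This is a sketch under assumptions, not the actual method: `generate_facets` is a hypothetical name, it works on a list of dicts rather than a polars frame, and joining the combined values with "/" is a guess at the output format.

```python
def generate_facets(rows, facets):
    """Hypothetical stand-in for _generateFacets on a list of dicts."""
    # A single string is treated as a one-element list of facets.
    if isinstance(facets, str):
        facets = [facets]

    out = [dict(row) for row in rows]  # copy, so the input is left untouched
    for facet in facets:
        if "/" in facet:
            # "A/B" means: build one combined column from columns A and B.
            parts = facet.split("/")
            for row in out:
                row[facet] = "/".join(str(row[p]) for p in parts)
    return out, list(facets)


data = [
    {"Channel": "Web", "Configuration": "Sales"},
    {"Channel": "Email", "Configuration": "Retention"},
]
combined_rows, combined_facets = generate_facets(data, "Channel/Configuration")
_, separate_facets = generate_facets(data, ["Channel", "Configuration"])
```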
- static facettedPlot(facets: list | None, plotFunc: Any, partition: bool = False, *args, **kwargs)¶
Takes care of facetting the plots.
If partition is True, generates a new dataframe for each plot. If partition is False, simply gives the facet as the facet argument.
In effect, this means that partition = False gives a ‘plotly-native’ facet, while partition = True gives a distinct plot for every facet.
- Parameters:
facets (Optional[list]) – If no facet is supplied, we just return the plot. Otherwise, we loop through each facet and create the plot.
plotFunc (Any) – The original function to create the plot. The plot is simply passed through to this function, along with all arguments.
partition (bool, default=False) – If True, generates a new dataframe for each plot. If False, simply gives the facet as the facet argument.
*args – Any additional arguments, depending on the plotFunc
- Keyword Arguments:
order (dict) – The order of categories, for each facet
**kwargs – Any additional keyword arguments, depending on the plotFunc
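The difference between the two partition settings can be illustrated with a dummy plotting function. The sketch below is an assumption-laden simplification: `facetted_plot` and `describe` are hypothetical names, the data is passed explicitly as a list of dicts, and the real method's handling of ordering and return values may differ.

```python
def facetted_plot(facets, plot_func, rows, partition=False, **kwargs):
    """Hypothetical sketch of the dispatch performed by facettedPlot."""
    # No facet supplied: just return the single plot.
    if not facets or facets == [None]:
        return plot_func(rows, facet=None, **kwargs)

    figures = []
    for facet in facets:
        if partition:
            # One distinct plot per facet value, each built from its own subset.
            for value in sorted({row[facet] for row in rows}):
                subset = [row for row in rows if row[facet] == value]
                figures.append(
                    plot_func(subset, facet=None, title=f"{facet}: {value}", **kwargs)
                )
        else:
            # Hand the facet to the plot function, which facets natively.
            figures.append(plot_func(rows, facet=facet, **kwargs))
    return figures[0] if len(figures) == 1 else figures


def describe(rows, facet=None, title=None):
    """Dummy plot function that reports what it was asked to draw."""
    return {"n_rows": len(rows), "facet": facet, "title": title}


models = [
    {"Channel": "Web", "Performance": 0.61},
    {"Channel": "Web", "Performance": 0.58},
    {"Channel": "Email", "Performance": 0.55},
]
native = facetted_plot(["Channel"], describe, models)
split = facetted_plot(["Channel"], describe, models, partition=True)
```

Here `native` is a single description that carries the facet name, while `split` holds one description per channel, each built from only that channel's rows.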
- plotPerformanceSuccessRateBubbleChart(last: bool = True, add_bottom_left_text: bool = True, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Creates bubble chart similar to ADM OOTB.
- Parameters:
last (bool, default = True) – Whether to only look at the last snapshot (recommended)
add_bottom_left_text (bool, default = True) – Whether to display how many models are in the bottom left of the chart, in other words, how many have no performance and no success rate
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
round (int, default = 5) – To how many digits to round the hover data
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotOverTime(metric: str = 'Performance', by: str = 'ModelID', every: int = '1d', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, mode: str = 'Diff', **kwargs) plotly.graph_objs.FigureWidget ¶
Plots a given metric over time
- Parameters:
metric (str, default = Performance) – The metric to plot over time. One of the following: {ResponseCount, Performance, SuccessRate, Positives, weighted_performance}
by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}
every (str, default = '1d') – How often to consider the metrics. The signature annotates this as int, but the default is the interval string '1d'.
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
mode (str, default = Diff) – The plotting mode. Should be one of the following:
- ‘Diff’: Plot differences over a specified period.
- ‘Cumulative’: Plot a time series of the values as is.
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
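The two values of mode can be illustrated on a single series of snapshot values. This is a minimal sketch, assuming that ‘Diff’ means the change between consecutive snapshots and that the first snapshot, which has no predecessor, is dropped. `over_time_values` is a hypothetical helper, not part of pdstools.

```python
def over_time_values(values, mode="Diff"):
    """Hypothetical helper: turn per-snapshot values into the plotted series."""
    if mode == "Cumulative":
        # Plot the values as they are recorded.
        return list(values)
    if mode == "Diff":
        # Plot the change between each snapshot and the one before it.
        return [later - earlier for earlier, later in zip(values, values[1:])]
    raise ValueError(f"Unknown mode: {mode!r}")


response_counts = [100, 150, 225]  # one value per snapshot, oldest first
as_recorded = over_time_values(response_counts, mode="Cumulative")
per_period = over_time_values(response_counts, mode="Diff")
```

For metrics that only ever grow, such as ResponseCount, the differenced series shows how much was added in each period rather than the running total.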
- plotPropositionSuccessRates(metric: str = 'SuccessRate', by: str = 'Name', show_error: bool = True, top_n=0, query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots all latest proposition success rates
- Parameters:
metric (str, default = SuccessRate) – Can be changed to plot a different metric
by (str, default = Name) – What variable to group the data by. One of {ModelID, Name}
show_error (bool, default = True) – Whether to show error bars in the bar plots
top_n (int, default = 0) – The number of rows to include in the pivoted DataFrame. If set to 0, all rows are included.
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotScoreDistribution(by: str = 'ModelID', *, show_zero_responses: bool = False, modelids: List | None = None, query: polars.Expr | str | Dict[str, list] | None = None, show_each=False, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots the score distribution, similar to OOTB
- Parameters:
by (str, default = ModelID) – What variable to group the data by. One of {ModelID, Name}
- Keyword Arguments:
show_zero_responses (bool, default = False) – Whether to include bins with no responses at all
modelids (Optional[List], default = None) – Models to plot for. If multiple ids are given, returns a list of plots, one for each model
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
show_each (bool) – Whether to show each file when multiple facets are used
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotPredictorBinning(predictors: list = None, modelids: list = None, show_each=False, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots the binning of given predictors
- Parameters:
predictors (list, default = None) – An optional list of predictors to plot the bins for. Useful for plotting one or more variables over multiple models.
modelids (list, default = None) – An optional list of model ids to plot the predictors for
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotPredictorPerformance(top_n: int = 0, active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots a bar chart of the performance of the predictors
By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset. Picks the top n predictors with the highest weighted average Performance across models and then sorts the predictors according to the median value.
- Parameters:
top_n (int, default = 0) – How many of the top predictors to show in the plot
active_only (bool, default = False) – Whether to only plot active predictors
to_plot (str, default = Performance) – Metric to compare predictors
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
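The selection and ordering rule described for this plot (take the top n by weighted average Performance, then order by median) can be sketched as follows. This is an illustration only: `rank_predictors` is a hypothetical helper, and weighting by `ResponseCount` is an assumption, since this page does not say which weight is used.

```python
from statistics import median


def rank_predictors(rows, top_n=0):
    """Hypothetical sketch: pick top predictors, then order them by median."""
    # Group the per-model rows by predictor.
    by_predictor = {}
    for row in rows:
        by_predictor.setdefault(row["PredictorName"], []).append(row)

    def weighted_average(group):
        # Assumption: performance is weighted by the number of responses.
        total = sum(r["ResponseCount"] for r in group)
        if total == 0:
            return 0.0
        return sum(r["Performance"] * r["ResponseCount"] for r in group) / total

    # Step 1: keep the top_n predictors by weighted average (0 keeps all).
    names = sorted(by_predictor, key=lambda n: weighted_average(by_predictor[n]), reverse=True)
    if top_n > 0:
        names = names[:top_n]

    # Step 2: order the kept predictors by their median performance.
    return sorted(
        names,
        key=lambda n: median(r["Performance"] for r in by_predictor[n]),
        reverse=True,
    )


predictor_rows = [
    {"PredictorName": "Age", "Performance": 0.60, "ResponseCount": 900},
    {"PredictorName": "Age", "Performance": 0.80, "ResponseCount": 100},
    {"PredictorName": "Income", "Performance": 0.65, "ResponseCount": 500},
    {"PredictorName": "Income", "Performance": 0.67, "ResponseCount": 500},
    {"PredictorName": "Tenure", "Performance": 0.55, "ResponseCount": 100},
    {"PredictorName": "Tenure", "Performance": 0.50, "ResponseCount": 100},
]
top_two = rank_predictors(predictor_rows, top_n=2)
everything = rank_predictors(predictor_rows)
```

In this data, Income has the higher weighted average but Age has the higher median, so both survive the cut and Age is listed first.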
- plotPredictorCategoryPerformance(active_only: bool = False, to_plot='Performance', query: polars.Expr | str | Dict[str, list] | None = None, facets: str | list = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots a bar chart of the performance of the predictor categories
By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.
- Parameters:
active_only (bool, default = False) – Whether to only plot active predictors
to_plot (str, default = Performance) – Metric to compare predictor categories
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
separate (bool) – If set to True, the dataset is subsetted using the facet column, creating separate plots
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotPredictorContribution(by: str = 'Configuration', query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots the contribution of each predictor across a group
- Parameters:
by (str, default = Configuration) – The column to group the bars with
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
- Keyword Arguments:
predictorCategorization (pl.Expr) – An optional override for the predictor categorization function
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
- plotPredictorPerformanceHeatmap(top_n: int = 20, by='Name', active_only: bool = False, query: polars.Expr | str | Dict[str, list] | None = None, facets: list = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots heatmap of the performance of the predictors
By default, this plot shows the performance over all models. Use the querying functionality to drill down into a more specific subset.
- Parameters:
top_n (int, default = 20) – How many of the top predictors to show in the plot
by (str, default = Name) – The column to use at the x axis of the heatmap
active_only (bool, default = False) – Whether to only plot active predictors
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – ‘plotly’ or a custom plot class
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py) to see further parameters for this plot.
- plotResponseGain(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots the cumulative response per model
- Parameters:
by (str, default = Channel) – The column by which to calculate response gain. Default is Channel, to see the response/gain chart per channel.
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
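A cumulative response (gain) curve of this kind can be computed along the following lines. This is a sketch of the general technique rather than the library's code: `response_gain` is a hypothetical helper, and ordering models from most to fewest responses is an assumption about how the curve is built.

```python
def response_gain(response_counts):
    """Hypothetical helper: (fraction of models, fraction of responses) points."""
    # Assumption: models are ordered from most to fewest responses.
    counts = sorted(response_counts, reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        return []

    points = []
    running = 0
    for index, count in enumerate(counts, start=1):
        running += count
        # x: share of models included so far, y: share of responses they cover.
        points.append((index / len(counts), running / total))
    return points


gain_points = response_gain([10, 60, 30])
```

A curve that rises steeply at the start indicates that a small share of the models accounts for most of the responses.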
- plotModelsByPositives(by: str = 'Channel', query: polars.Expr | str | Dict[str, list] | None = None, facets=None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots the percentage of models vs the number of positive responses
- Parameters:
by (str, default = Channel) – The column to calculate the model percentage by
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
- plotTreeMap(color_var: str = 'performance_weighted', by: str = 'ModelID', value_in_text: bool = True, midpoint: float | None = None, query: polars.Expr | str | Dict[str, list] | None = None, **kwargs) plotly.graph_objs.FigureWidget ¶
Plots a treemap to view performance over multiple context keys
- Parameters:
color_var (str, default = performance_weighted) – The column to set as the color of the squares. One of: {responsecount, responsecount_log, positives, positives_log, percentage_without_responses, performance_weighted, successrate}
by (str, default = ModelID) – The column to use as the size of the squares
value_in_text (bool, default = True) – Whether to print the values of the squares in the squares
midpoint (Optional[float]) – A parameter to exert more control over the color distribution. Set near 0 to give lower values a ‘higher’ color. Set near 1 to give higher values a ‘lower’ color. Necessary for, for example, Success Rate, where rates lie very far apart. If not supplied in such cases, there is no difference in the color between low values such as 0.001 and 0.1, so midpoint should be set low.
query (Optional[Union[pl.Expr, str, Dict[str, list]]]) – Please refer to pdstools.adm.ADMDatamart._apply_query()
facets (Union[str, list], default = None) – Please refer to _generateFacets()
- Keyword Arguments:
colorscale (list) – Give a list of hex values to override the default colors. Should consist of three colors: ‘low’, ‘neutral’ and ‘high’.
plotting_engine (str) – This chart is only supported in plotly
return_df (bool) – If set to True, returns the dataframe instead of the plot. Can be useful for debugging or replicating the plots.
- Return type:
plotly.graph_objs.FigureWidget
Notes
See the docs for the plotly plots (plots_plotly.py). Plotly has an additional post_plot function defining some more actions, such as writing to html automatically or displaying figures while facetting.
- plotPredictorCount(facets: str | list, query: polars.Expr | str | Dict[str, list] | None = None, by: str = 'Type', **kwargs)¶
- Parameters:
facets (Union[str, list])
query (Optional[Union[polars.Expr, str, Dict[str, list]]])
by (str)
- plotBinningLift(binning, col_facet=None, row_facet=None, custom_data=['PredictorName', 'BinSymbol'], return_df=False) polars.DataFrame | plotly.graph_objects.Figure ¶
- Return type:
Union[polars.DataFrame, plotly.graph_objects.Figure]