pdstools.adm.Plots._predictors¶
Predictor-level performance, contribution, heatmap, and count plots.
Classes¶
_PredictorPlotsMixin – Common attribute surface used by every plot mixin.
Module Contents¶
- class _PredictorPlotsMixin¶
Bases: pdstools.adm.Plots._base._PlotsBase
Common attribute surface used by every plot mixin.
- _boxplot_pre_aggregated(df: polars.LazyFrame, *, y_col: str, metric_col: str, metric_weight_col: str | None = None, legend_col: str | None = None, color_discrete_map: dict[str, str] | None = None, return_df: bool = False)¶
- predictor_performance(*, metric: str = 'Performance', top_n: int | None = None, active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)¶
Plots a box plot of the performance of the predictors.
Use the query argument to drill down to a more specific subset. If top_n is given, the top predictors are chosen by their weighted average performance across models and ordered by their median performance.
- Parameters:
metric (str, optional) – The metric to plot, by default “Performance”. This is mostly for future-proofing, for when FeatureImportance becomes more widely used.
top_n (Optional[int], optional) – The top n predictors to plot, by default None
active_only (bool, optional) – Whether to only consider active predictor performance, by default False
query (Optional[QUERY], optional) – The query to apply to the data, by default None
return_df (bool, optional) – Whether to return a dataframe instead of a plot, by default False
- Returns:
Plotly box plot figure, LazyFrame if return_df=True, or None if no data
- Return type:
Figure | pl.LazyFrame | None
See also
pdstools.adm.ADMDatamart.apply_predictor_categorization – how to override the out-of-the-box predictor categorization
Examples
>>> # Default: all predictors ranked by performance
>>> fig = dm.plot.predictor_performance()
>>> # Top-15 active predictors only
>>> fig = dm.plot.predictor_performance(top_n=15, active_only=True)
>>> # Filter to a specific channel and return the raw data
>>> df = dm.plot.predictor_performance(
...     query={"Channel": "Web"},
...     return_df=True,
... )
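The top_n selection described above can be illustrated with a small, self-contained sketch. This is not the library's implementation; the data, column layout, and helper name (`top_predictors`) are invented to show the two-step logic only: select by weighted mean, then order by median.

```python
# Hypothetical sketch of top_n selection: pick the top predictors by
# weighted mean performance across models, then order the selection by
# median performance for plotting. Rows and names are invented.
from statistics import median

# (predictor, performance, response_count) observations across models
rows = [
    ("Age",    0.72, 100), ("Age",    0.68, 300),
    ("Income", 0.81,  50), ("Income", 0.79, 150),
    ("Clicks", 0.55, 500), ("Clicks", 0.57, 100),
]

def top_predictors(rows, n):
    by_pred = {}
    for name, perf, weight in rows:
        by_pred.setdefault(name, []).append((perf, weight))
    # Step 1: weighted mean performance per predictor selects the top n
    weighted = {
        name: sum(p * w for p, w in vals) / sum(w for _, w in vals)
        for name, vals in by_pred.items()
    }
    top = sorted(weighted, key=weighted.get, reverse=True)[:n]
    # Step 2: order the selected predictors by median performance
    return sorted(
        top,
        key=lambda name: median(p for p, _ in by_pred[name]),
        reverse=True,
    )

print(top_predictors(rows, 2))  # -> ['Income', 'Age']
```

With `n=2`, "Clicks" drops out despite its huge response count because its weighted mean performance is lowest; the survivors are then re-ordered by median.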
- predictor_category_performance(*, metric: str = 'Performance', active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)¶
Plot the predictor category performance.
- Parameters:
metric (str, optional) – The metric to plot, by default “Performance”
active_only (bool, optional) – Whether to only analyze active predictors, by default False
query (Optional[QUERY], optional) – An optional query to apply, by default None
return_df (bool, optional) – An optional flag to get the dataframe instead, by default False
- Returns:
A Plotly figure
- Return type:
px.Figure
See also
pdstools.adm.ADMDatamart.apply_predictor_categorization – how to override the out-of-the-box predictor categorization
Examples
>>> # Default: performance box plot per predictor category
>>> fig = dm.plot.predictor_category_performance()
>>> # Active predictors only, filtered to a specific channel
>>> fig = dm.plot.predictor_category_performance(
...     active_only=True,
...     query={"Channel": "Web"},
... )
>>> # Return underlying data for further analysis
>>> df = dm.plot.predictor_category_performance(return_df=True)
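A common convention for predictor categorization, which apply_predictor_categorization lets you override, is to take the part of the predictor name before the first dot as its category. The sketch below is illustrative only: the fallback label "Primary" and the helper name `categorize` are assumptions, not the library's guaranteed behavior.

```python
# Hypothetical sketch of a prefix-based predictor categorization:
# "Customer.Age" -> "Customer". Names without a dot fall back to an
# assumed default category label.
def categorize(predictor_name: str) -> str:
    if "." in predictor_name:
        return predictor_name.split(".", 1)[0]
    return "Primary"  # assumed fallback label for un-prefixed names

print(categorize("Customer.Age"))  # -> Customer
print(categorize("IH.Web.Clicks"))  # -> IH
```

Overriding the categorization changes how the box plot above groups predictors, so a custom mapping can split or merge categories as needed.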
- predictor_contribution(*, by: str = 'Configuration', query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)¶
Plots the predictor contribution for each configuration.
- Parameters:
by (str, optional) – The column to group the contribution by, by default “Configuration”
query (Optional[QUERY], optional) – An optional query to apply, by default None
return_df (bool, optional) – Whether to return a dataframe instead of a plot, by default False
- Returns:
A plotly figure
- Return type:
px.Figure
See also
pdstools.adm.ADMDatamart.apply_predictor_categorization – how to override the out-of-the-box predictor categorization
Examples
>>> # Default: contribution per Configuration
>>> fig = dm.plot.predictor_contribution()
>>> # Contribution grouped by Channel
>>> fig = dm.plot.predictor_contribution(by="Channel")
>>> # Return the contribution data for further processing
>>> df = dm.plot.predictor_contribution(return_df=True)
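One way to read "contribution" is as each category's share of the total predictor performance within a group. The sketch below is a self-contained illustration under that assumption; the category names and values are invented and this is not the library's exact computation.

```python
# Hypothetical sketch: contribution as each category's percentage share
# of total predictor performance within one configuration.
perf_by_category = {"Customer": 0.30, "Account": 0.20, "IH": 0.50}

total = sum(perf_by_category.values())
contribution = {cat: 100 * v / total for cat, v in perf_by_category.items()}

print(contribution["IH"])  # -> 50.0
```

The shares sum to 100%, which is what makes the stacked contribution plot comparable across configurations.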
- predictor_performance_heatmap(*, top_predictors: int = 20, top_groups: int | None = None, by: str = 'Name', active_only: bool = False, query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)¶
Generate a heatmap showing predictor performance across different groups.
- Parameters:
top_predictors (int, optional) – Number of top-performing predictors to include, by default 20
top_groups (Optional[int], optional) – Number of top groups to include, by default None (all groups)
by (str, optional) – Column to group by for the heatmap, by default “Name”
active_only (bool, optional) – Whether to only include active predictors, by default False
query (Optional[QUERY], optional) – Optional query to filter the data, by default None
return_df (bool, optional) – Whether to return a dataframe instead of a plot, by default False
- Returns:
Plotly heatmap figure or DataFrame if return_df=True
- Return type:
Union[Figure, pl.LazyFrame]
Examples
>>> # Default: top-20 predictors vs proposition (Name)
>>> fig = dm.plot.predictor_performance_heatmap()
>>> # Top-10 predictors across top-5 configurations, active only
>>> fig = dm.plot.predictor_performance_heatmap(
...     top_predictors=10,
...     top_groups=5,
...     by="Configuration",
...     active_only=True,
... )
>>> # Return the pivot data for further processing
>>> df = dm.plot.predictor_performance_heatmap(return_df=True)
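The heatmap is essentially a pivot of mean performance per (predictor, group) cell. The sketch below shows that shape with invented data and an invented helper name (`pivot_mean`); it is not the library's implementation.

```python
# Hypothetical sketch of the heatmap's underlying pivot: average
# performance per (predictor, group) cell. Rows are invented.
rows = [
    ("Age",    "Web", 0.70), ("Age",    "Email", 0.74),
    ("Income", "Web", 0.80), ("Income", "Email", 0.78),
]

def pivot_mean(rows):
    cells = {}
    for pred, group, perf in rows:
        cells.setdefault((pred, group), []).append(perf)
    # Each heatmap cell is the mean of the observations that fall in it
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}

cells = pivot_mean(rows)
print(cells[("Income", "Web")])  # -> 0.8
```

With return_df=True the method returns this kind of pivot-ready data instead of the rendered figure, so you can post-process or re-plot it yourself.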
- predictor_count(*, by: str | list[str] = ['EntryType', 'Type'], query: pdstools.utils.types.QUERY | None = None, return_df: bool = False)¶
Generate a box plot showing the distribution of predictor counts by type.
- Parameters:
by (str | list[str], optional) – Column(s) to group the predictor counts by, by default [“EntryType”, “Type”]
query (Optional[QUERY], optional) – Optional query to filter the data, by default None
return_df (bool, optional) – Whether to return a dataframe instead of a plot, by default False
- Returns:
Plotly box plot figure or DataFrame if return_df=True
- Return type:
Union[Figure, pl.LazyFrame]
Examples
>>> # Default: predictor count distribution by EntryType and Type
>>> fig = dm.plot.predictor_count()
>>> # Distribution by EntryType only
>>> fig = dm.plot.predictor_count(by="EntryType")
>>> # Return the raw counts
>>> df = dm.plot.predictor_count(return_df=True)
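The data behind this box plot is a per-model tally of predictors within each grouping column. A minimal sketch of that tally, with invented model IDs and predictor types:

```python
# Hypothetical sketch of the counts behind predictor_count: number of
# predictors per (model, predictor type) combination. Data is invented.
from collections import Counter

# (model_id, predictor_type) rows, one per predictor on a model
rows = [
    ("M1", "numeric"), ("M1", "numeric"), ("M1", "symbolic"),
    ("M2", "numeric"),
]

counts = Counter(rows)  # predictor count per (model, type) pair

print(counts[("M1", "numeric")])  # -> 2
```

The box plot then shows the distribution of these per-model counts within each group, which is why a single model with unusually many predictors shows up as an outlier.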