pdstools.infinity.resources.prediction_studio.local_model_utils

Attributes

Exceptions

ONNXModelCreationError

Exception for errors during ONNX conversion and save.

ONNXModelValidationError

Exception for errors during ONNX validation.

Classes

OutcomeType

Create a collection of name/value pairs.

Predictor

A single predictor (feature) in an ONNX model.

Output


Metadata


PMMLModel


H2OModel


ONNXModel


Module Contents

logger
PEGA_METADATA = 'pegaMetadata'
class OutcomeType(*args, **kwds)

Bases: enum.Enum

Create a collection of name/value pairs.

Example enumeration:

>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3

Access them by:

  • attribute access:

>>> Color.RED
<Color.RED: 1>
  • value lookup:

>>> Color(1)
<Color.RED: 1>
  • name lookup:

>>> Color['RED']
<Color.RED: 1>

Enumerations can be iterated over, and know how many members they have:

>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.

BINARY = 'binary'
CATEGORICAL = 'categorical'
CONTINUOUS = 'continuous'
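
As a brief illustration of how the three members behave, the sketch below redefines a local stand-in enum with the same names and values instead of importing the real class. The stand-in is an assumption made only so the snippet runs without pdstools installed.

```python
from enum import Enum


class OutcomeType(Enum):
    """Local stand-in mirroring the members documented above."""

    BINARY = "binary"
    CATEGORICAL = "categorical"
    CONTINUOUS = "continuous"


# Value lookup turns a raw string (for example, one read from JSON) into a member.
parsed = OutcomeType("binary")

# Name lookup returns the same singleton member as attribute access.
same = OutcomeType["BINARY"]

# Iteration preserves definition order.
values = [member.value for member in OutcomeType]
```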
class Predictor(/, **data: Any)

Bases: pydantic.BaseModel

A single predictor (feature) in an ONNX model.

Automatic name derivation: when pega_property is supplied (e.g. ".Customer.Age") and name is omitted, the predictor name is set to the leaf segment of the property path ("Age"). Pega Prediction Studio auto‑maps predictors by matching name against its data‑model properties, so the derived name lets the field map correctly on upload without manual work.

If name is provided explicitly it is always used as‑is.
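
The derivation rule can be sketched in a few lines. The function below is a hypothetical stand-in written for illustration (derive_name is not part of the documented API), and it assumes the leaf is simply the text after the last dot.

```python
def derive_name(name=None, pega_property=None):
    """Hypothetical helper mimicking the documented name-derivation rule."""
    if name is not None:
        # An explicit name always wins and is used as-is.
        return name
    if pega_property:
        # Take the leaf segment of a path such as ".Customer.Age".
        return pega_property.rsplit(".", 1)[-1]
    return None


derived = derive_name(pega_property=".Customer.Age")
explicit = derive_name(name="CustomerAge", pega_property=".Customer.Age")
```

Here derived is the leaf segment and explicit keeps the name that was passed in.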

See also

Metadata.build_predictor_list

Batch‑build predictors from a list of Pega property paths or plain feature names.

Parameters:

data (Any)

name: str | None = None
index: int | None = None
input_name: str | None = None
data_type: str = None
pega_property: str | None = None
classmethod _derive_name_from_pega_property(data)

Auto‑derive name from the leaf of pega_property.

Runs before individual field validators so that every downstream validator already sees a resolved name.

validate_input_name(v)
validate_name(v)
validate_index(v)
validate_data_type(v)
class Output(/, **data: Any)

Bases: pydantic.BaseModel

Parameters:

data (Any)

possible_values: list[str | int | float] = None
label_name: str | None = None
score_name: str | None = None
min_value: float | None = None
max_value: float | None = None
validate_label_name(v)
class Metadata(/, **data: Any)

Bases: pydantic.BaseModel

Parameters:

data (Any)

type: OutcomeType | None = None
predictor_list: list[Predictor] = None
output: Output | None = None
modeling_technique: str | None = None
internal: bool | None = None
file_source: str | None = None
objective: str | None = None
rule_set: str | None = None
rule_set_version: str | None = None
predict_method_uses_name_value_pair: bool | None = None
model_version: str | None = None
created_by: str | None = None
created_date: str | None = None
last_modified_date: str | None = None
training_dataset: str | None = None
experiment_id: str | None = None
parent_model_id: str | None = None
baseline_auc: float | None = None
baseline_accuracy: float | None = None
performance_threshold: float | None = None
validate_type(v, values)
validate_output(v, values)
to_json() → str
Return type:

str

classmethod from_json(json_str: str) → Metadata
Parameters:

json_str (str)

Return type:

Metadata

static build_predictor_list(features: list[str], input_name: str = 'features', *, data_types: list[str] | dict[str, str] | None = None) → list[Predictor]

Build a predictor list from feature names or Pega property paths.

This is the recommended one‑liner for constructing predictors. Each entry in features can be:

  • A Pega property path starting with "." (e.g. ".Customer.Age"). The leaf segment becomes the predictor name ("Age") and the full path is stored as pega_property so Pega Prediction Studio auto‑maps the field on upload.

  • A plain feature name (e.g. "Age"). Used as‑is for name; pega_property is left unset.

Indices are assigned automatically (1‑based) in the order the features appear.

Parameters:
  • features (list[str]) – Ordered list of feature identifiers. The order must match the column order in the ONNX input tensor. Accepts any mix of plain names and Pega property paths.

  • input_name (str) – Name of the ONNX input node. Default "features".

  • data_types (list[str] | dict[str, str] | None) –

    Optional type annotations for features. Accepts either:

    • A list of "Numeric" / "Symbolic" values, one per feature (must match features length).

    • A dict mapping feature names (the leaf segment for Pega paths) to "Numeric" or "Symbolic". Only the features present in the dict are overridden; the rest default to "Numeric".

    When None every feature defaults to "Numeric".

Return type:

list[Predictor]

Examples

>>> # Using Pega property paths (recommended for auto-mapping):
>>> Metadata.build_predictor_list(
...     [".Customer.Age", ".Customer.Tenure", ".Customer.MonthlyCharges"],
... )
>>> # Using plain feature names:
>>> Metadata.build_predictor_list(["Age", "Tenure", "MonthlyCharges"])
>>> # Sparse overrides via dict:
>>> Metadata.build_predictor_list(
...     [".Customer.Age", ".Customer.ContractType"],
...     data_types={"ContractType": "Symbolic"},
... )
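
To make the rules above concrete, here is a minimal stand-in that follows the documented behaviour but returns plain dicts instead of Predictor objects. The function name build_predictor_dicts and the error raised on a length mismatch are assumptions for illustration, not the library's implementation.

```python
def build_predictor_dicts(features, input_name="features", data_types=None):
    """Hypothetical stand-in returning plain dicts instead of Predictor objects."""
    if isinstance(data_types, list) and len(data_types) != len(features):
        raise ValueError("data_types list must have one entry per feature")

    predictors = []
    for position, feature in enumerate(features):
        is_pega_path = feature.startswith(".")
        # Pega paths contribute their leaf segment; plain names are used as-is.
        name = feature.rsplit(".", 1)[-1] if is_pega_path else feature

        if isinstance(data_types, list):
            data_type = data_types[position]
        elif isinstance(data_types, dict):
            # Sparse override: anything not listed falls back to "Numeric".
            data_type = data_types.get(name, "Numeric")
        else:
            data_type = "Numeric"

        predictors.append({
            "name": name,
            "index": position + 1,  # 1-based, in input order
            "input_name": input_name,
            "data_type": data_type,
            "pega_property": feature if is_pega_path else None,
        })
    return predictors


result = build_predictor_dicts(
    [".Customer.Age", "Tenure", ".Customer.ContractType"],
    data_types={"ContractType": "Symbolic"},
)
```

Mixing a plain name with Pega paths, as in this call, shows that only the path entries carry a pega_property value.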
static _convert_keys(data: dict, conversion_func) → dict
Parameters:

data (dict)

Return type:

dict

static _to_snake_case(string: str) → str
Parameters:

string (str)

Return type:

str

static _to_camel_case(string: str) → str
Parameters:

string (str)

Return type:

str
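
No descriptions accompany these three helpers. Judging only by their names, they appear to convert dictionary keys between snake_case and camelCase. The following is a sketch of that general technique under that assumption; the function names are hypothetical and the code is not the library's implementation.

```python
import re


def to_camel_case(string):
    """snake_case -> camelCase (hypothetical helper)."""
    head, *rest = string.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_snake_case(string):
    """camelCase -> snake_case (hypothetical helper)."""
    # Insert "_" before every uppercase letter except one at the very start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", string).lower()


def convert_keys(data, conversion_func):
    """Recursively apply conversion_func to every dict key."""
    if isinstance(data, dict):
        return {
            conversion_func(key): convert_keys(value, conversion_func)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [convert_keys(item, conversion_func) for item in data]
    return data


camel = convert_keys({"predictor_list": [{"input_name": "features"}]}, to_camel_case)
snake = convert_keys(camel, to_snake_case)
```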

class _ONNXMetadataEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: json.JSONEncoder

Extensible JSON <https://json.org> encoder for Python data structures.

Supports the following objects and types by default:

Python          JSON
dict            object
list, tuple     array
str             string
int, float      number
True            true
False           false
None            null

To extend this to recognize other objects, subclass and implement a .default() method with another method that returns a serializable object for o if possible, otherwise it should call the superclass implementation (to raise TypeError).

default(o)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return super().default(o)
static _strip_empty(data)

Recursively remove keys whose value is [].
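
A minimal sketch of this kind of recursive pruning follows. The function name strip_empty is hypothetical, and the choice to leave containers that only become empty after pruning is an assumption, since the description does not say how they are handled.

```python
def strip_empty(data):
    """Hypothetical helper: recursively drop dict keys whose value is []."""
    if isinstance(data, dict):
        return {
            key: strip_empty(value)
            for key, value in data.items()
            if value != []
        }
    if isinstance(data, list):
        return [strip_empty(item) for item in data]
    return data


cleaned = strip_empty({"a": [], "b": {"c": [], "d": 1}, "e": [{"f": []}]})
```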

exception ONNXModelCreationError

Bases: Exception

Exception for errors during ONNX conversion and save.

exception ONNXModelValidationError

Bases: pdstools.infinity.resources.prediction_studio.base.ModelValidationError

Exception for errors during ONNX validation.

class PMMLModel(file_path: str)

Bases: pdstools.infinity.resources.prediction_studio.base.LocalModel

Parameters:

file_path (str)

file_path: str
get_file_path() → str

Returns the file path of the model.

Returns:

The file path of the model.

Return type:

str

class H2OModel(file_path: str)

Bases: pdstools.infinity.resources.prediction_studio.base.LocalModel

Parameters:

file_path (str)

file_path: str
get_file_path() → str

Returns the file path of the model.

Returns:

The file path of the model.

Return type:

str

class ONNXModel(model: onnx.ModelProto)

Bases: pdstools.infinity.resources.prediction_studio.base.LocalModel

Parameters:

model (onnx.ModelProto)

_model: onnx.ModelProto
classmethod from_onnx_proto(model: onnx.ModelProto) → ONNXModel

Creates an ONNXModel object from an onnx ModelProto.

Parameters:

model (ModelProto) – An onnx ModelProto object

Returns:

An instance of the ONNXModel class.

Return type:

ONNXModel

Raises:
classmethod from_sklearn_pipeline(model: sklearn.pipeline.Pipeline, initial_types: list) → ONNXModel

Creates an ONNXModel object from a scikit-learn Pipeline.

Parameters:
  • model (Pipeline) – A sklearn Pipeline object

  • initial_types (list) – A list of initial types for the pipeline’s input variables.

Returns:

An instance of the ONNXModel class.

Return type:

ONNXModel

Raises:
classmethod from_pytorch(model, dummy_input, *, input_names: list[str] | None = None, output_names: list[str] | None = None, opset_version: int = 17, fixed_batch_size: bool = True) → ONNXModel

Create an ONNXModel from a PyTorch nn.Module.

The model is exported via torch.onnx.export with static shapes. When fixed_batch_size is True (the default), any remaining dynamic dimensions are replaced with a batch size of 1 so that the resulting ONNX file is accepted by Pega Prediction Studio.

Parameters:
  • model – A PyTorch nn.Module (putting it in eval mode first is recommended).

  • dummy_input – Example input tensor(s) matching the model’s forward signature.

  • input_names (list[str] | None) – Optional list of ONNX input node names. Defaults to ["input"].

  • output_names (list[str] | None) – Optional list of ONNX output node names. Defaults to ["output"].

  • opset_version (int) – ONNX opset version. Default 17.

  • fixed_batch_size (bool) – Replace dynamic dimensions with batch size 1.

Return type:

ONNXModel

Raises:

ONNXModelCreationError – If PyTorch is not installed or the export fails.

static _fix_dynamic_shapes(proto: onnx.ModelProto, batch_size: int = 1) → None

Replace dynamic (symbolic) dimensions with a fixed batch_size.

Pega Prediction Studio rejects ONNX models whose input or output nodes contain symbolic dimension parameters (e.g. "batch"). This helper iterates every dimension on every input/output and replaces symbolic dims with the given integer value.

Parameters:
  • proto (onnx.ModelProto) – An onnx.ModelProto; modified in place.

  • batch_size (int) – The integer value to substitute. Default 1.

Return type:

None
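
The substitution itself is simple. The sketch below models a tensor shape as a plain Python list in which fixed dimensions are integers and symbolic or unknown dimensions are strings or None. That representation and the name fix_dynamic_shape are assumptions made for illustration; the real helper works on an onnx.ModelProto and modifies it in place rather than returning a copy.

```python
def fix_dynamic_shape(shape, batch_size=1):
    """Return a copy of shape with every non-integer dimension replaced."""
    return [dim if isinstance(dim, int) else batch_size for dim in shape]


fixed = fix_dynamic_shape(["batch", 4])
fixed_unknown = fix_dynamic_shape([None, 3], batch_size=2)
```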

get_metadata() → Metadata | None

Return the embedded Metadata or None if absent.

Return type:

Metadata | None

add_metadata(metadata: Metadata) → ONNXModel

Adds metadata to the ONNX model.

Parameters:

metadata (Metadata) – The metadata to be added.

Returns:

The ONNXModel object with the added metadata.

Return type:

ONNXModel

Raises:

ImportError – If the optional dependencies for ONNX Metadata addition are not installed.

validate() → bool

Validates an ONNX model.

Raises:
  • ImportError – If the optional dependencies for ONNX Validation are not installed.

  • ONNXModelValidationError – If the model is invalid or if the validation process fails.

Return type:

bool

run(test_data: dict)

Run the prediction using the provided test data.

Parameters:

test_data (dict) –

The test data to be used for prediction. It is a dictionary where each key is a column name from the dataset, and each value is a NumPy array representing the column data as a vector. For example:

{
    'column1': array([[value1], [value2], [value3]]),
    'column2': array([[value4], [value5], [value6]]),
    'column3': array([[value7], [value8], [value9]]),
}

Returns:

The prediction result.

Return type:

Any
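
One way to assemble a dictionary in that layout from row-oriented records is sketched below. The column names, the sample values and the float32 dtype are assumptions for illustration; the dtype a given model expects depends on how it was exported.

```python
import numpy as np

# Hypothetical row-oriented input records.
rows = [
    {"Age": 42.0, "Tenure": 12.0},
    {"Age": 35.0, "Tenure": 3.0},
    {"Age": 58.0, "Tenure": 40.0},
]

# One (n_rows, 1) column vector per column name.
test_data = {
    column: np.array([[row[column]] for row in rows], dtype=np.float32)
    for column in rows[0]
}
```

The resulting test_data can then be passed to run(); each value is a two-dimensional array with one row per record.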

save(onnx_file_path: str)

Saves the ONNX model to the specified file path.

Parameters:

onnx_file_path (str) – The file path where the ONNX model should be saved.

Raises:

ImportError – If the optional dependencies for ONNX Conversion are not installed.

__check_for_valid_input_node_structure(error_stream, session, metadata) → bool

Checks if the input node structure of the ONNX model is valid.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • session (onnxruntime.InferenceSession) – The ONNX runtime session containing the model.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if the input node structure is valid, False otherwise.

Return type:

bool

__validate_input_nodes(error_stream, model_input_info, metadata) → bool

Validates the input nodes of the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if the input nodes are valid, False otherwise.

Return type:

bool

__check_for_valid_output_node_structure(error_stream, session, metadata) → bool

Checks if the output node structure of the ONNX model is valid.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • session (onnxruntime.InferenceSession) – The ONNX runtime session containing the model.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if the output node structure is valid, False otherwise.

Return type:

bool

__validate_tensor_output_node_structure(error_stream, node_name, value_info) → bool

Validates the tensor output node structure of the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • node_name (str) – The name of the output node.

  • value_info (Any) – The value information of the output node.

Returns:

True if the output node is of type Tensor, False otherwise.

Return type:

bool

__validate_label_output_node_exist(error_stream, model_output_info, metadata) → bool

Validates the existence of the label output node in the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_output_info (dict) – A dictionary containing information about the model’s output nodes.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if the label output node exists, False otherwise.

Return type:

bool

__validate_input_node_dimensions(error_stream, model_input_info) → bool

Validates the dimensions of the input nodes in the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

Returns:

True if all input nodes have valid dimensions, False otherwise.

Return type:

bool

__validate_input_node_shapes(error_stream, model_input_info) → bool

Validates the shapes of the input nodes in the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

Returns:

True if all input nodes have valid shapes, False otherwise.

Return type:

bool

__validate_predictor_mappings(error_stream, model_input_info, metadata) → bool

Validates the predictor mappings in the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if all predictor mappings are valid, False otherwise.

Return type:

bool

__validate_predictor_index_mappings(error_stream, metadata) → bool

Validates the predictor index mappings in the ONNX model’s metadata.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • metadata (Metadata) – The metadata associated with the model.

Returns:

True if the predictor index mappings are valid, False otherwise.

Return type:

bool

__create_predictor_map(metadata)

Creates a mapping of input names to their corresponding predictors.

Parameters:

metadata (Metadata) – The metadata associated with the model, containing the predictor list.

Returns:

A dictionary where the keys are input names and the values are lists of predictors.

Return type:

dict
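
The grouping described here can be sketched with a defaultdict. The example uses plain dicts in place of Predictor objects, and the name create_predictor_map is a hypothetical stand-in for the private method.

```python
from collections import defaultdict


def create_predictor_map(predictor_list):
    """Hypothetical helper: group predictors by their input_name."""
    predictor_map = defaultdict(list)
    for predictor in predictor_list:
        predictor_map[predictor["input_name"]].append(predictor)
    return dict(predictor_map)


predictors = [
    {"name": "Age", "index": 1, "input_name": "features"},
    {"name": "Tenure", "index": 2, "input_name": "features"},
    {"name": "Contract", "index": 1, "input_name": "categories"},
]
grouped = create_predictor_map(predictors)
```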

__validate_input_node_sizes(error_stream, model_input_info, metadata) → bool

Validates the sizes of the input nodes in the ONNX model.

Parameters:
  • error_stream (Any) – The stream to which error messages are written.

  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

  • metadata (Metadata) – The metadata associated with the model, containing the predictor list.

Returns:

True if all input nodes have valid sizes, False otherwise.

Return type:

bool

__get_missing_predictors(model_input_info, predictor_input_names) → str

Identifies the missing predictors in the ONNX model’s input nodes.

Parameters:
  • model_input_info (dict) – A dictionary containing information about the model’s input nodes.

  • predictor_input_names (list) – A list of predictor input names.

Returns:

A comma-separated string of missing predictor names.

Return type:

str
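
A possible reading of this check is sketched below: collect the predictor input names that have no matching key in model_input_info and join them with commas. The direction of the comparison, the de-duplication and the ", " separator are assumptions, and get_missing_predictors is a hypothetical stand-in for the private method.

```python
def get_missing_predictors(model_input_info, predictor_input_names):
    """Hypothetical helper: list predictor inputs the model does not expose."""
    # dict.fromkeys de-duplicates while keeping first-seen order.
    unique_names = dict.fromkeys(predictor_input_names)
    missing = [name for name in unique_names if name not in model_input_info]
    return ", ".join(missing)


report = get_missing_predictors(
    {"features": {}},
    ["features", "extra", "extra", "other"],
)
```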