Getting Started for Data Scientists

Installation

Quick Start

uv pip install pdstools

Instructions

Pega Data Scientist Tools (pdstools) is a public Python library published on PyPI. You can install it like any other Python library, using your package manager of choice.

Choose your preferred installation method:

We have a strong preference for uv as it’s fast, reliable, and handles Python versions automatically.

Why uv? uv automatically manages Python versions, creates isolated environments, and is significantly faster than traditional pip workflows.

Step 1: Install uv

If you haven’t yet, install uv from https://github.com/astral-sh/uv. We recommend the standalone installer, because installations made with it can be updated using uv self update.

Step 2: Create a virtual environment

Navigate to your desired directory and run:

uv venv

Step 3: Install pdstools

uv pip install pdstools

For optional dependencies:

uv pip install 'pdstools[api]'

For Jupyter notebooks:

uv pip install ipykernel nbformat

Note: If Python is not installed, or no compatible version is available, uv will automatically install a compatible version for you.
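To confirm that your code actually runs inside the environment created in Step 2, you can compare the interpreter's prefix paths. The following is a minimal sketch using only the standard library. The helper name in_virtual_environment is hypothetical and not part of pdstools or uv.

```python
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the interpreter runs inside a virtual environment.

    Inside a venv-style environment, sys.prefix points at the environment
    directory while sys.base_prefix points at the base Python installation.
    """
    return prefix != base_prefix


# Report on the interpreter that is executing this script.
print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_environment())
```

If this reports False, the script is probably being run by a system-wide Python rather than the one in the environment where pdstools was installed.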

Optional dependencies

We intentionally limit the number of large, heavy core dependencies. This keeps the initial installation very fast, but you may at some point run into import errors and need to install additional dependency groups.

To install extra dependencies, list the dependency group in square brackets after the package name. For example, to install the optional dependencies required for using the API features of pdstools:

uv pip install 'pdstools[api]'

For an overview of all optional dependencies and the dependency groups that install them, run the following code:

from pdstools.utils.show_versions import dependency_great_table

dependency_great_table()
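For a quick yes/no check on a few modules, for example after an import error, a standard-library sketch like the one below may be enough. The helper missing_modules is hypothetical, and the module names in the usage line are placeholders chosen for illustration. Replace them with the names reported in your import error.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        # Only pass top-level names; dotted names can raise if the parent
        # package is missing.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Placeholder names for illustration only.
print("Missing:", missing_modules(["polars", "placeholder_module_for_demo"]))
```

Any name that is listed as missing can then be looked up in the dependency overview to find the group that provides it.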

Python compatibility

Even though uv can install a Python version for you, sometimes you have no choice in which versions are available. For this reason, we try to support as wide a range of Python versions as we can. The oldest Python version we support is determined by our core dependencies, particularly Polars. As of 2024, Polars supports Python 3.9 and higher, and so do we.
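If you are restricted to a fixed interpreter, you can check it against this minimum before installing. The sketch below takes the 3.9 threshold from the paragraph above, so it reflects the state as of 2024 and may be out of date. The names MIN_PYTHON and python_is_supported are hypothetical.

```python
import sys

# Minimum version as stated in the text above (as of 2024); adjust if it changes.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the major.minor version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
status = "meets the minimum" if python_is_supported() else "is older than the minimum"
print(f"Python {current} {status} of {MIN_PYTHON[0]}.{MIN_PYTHON[1]}")
```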

Checking the Installation

With pdstools[adm] installed, you can check that the installation works.

If you want to run the code from a Python notebook, install these additional packages first:

uv pip install ipykernel nbformat

To create a ‘bubble chart’ of sample ADM data that plots action Success Rates against Model Performance, similar to the one in Prediction Studio, run:

from pdstools import cdh_sample

cdh_sample().plot.bubble_chart()
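If the import fails, it can help to confirm whether pdstools is visible to the interpreter you are running, and in which version. This is a minimal sketch that relies only on the standard library's importlib.metadata. The helper installed_version is hypothetical and not part of pdstools.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("pdstools")
if found is None:
    print("pdstools is not installed in this environment")
else:
    print(f"pdstools {found} is installed")
```

A result of "not installed" usually means the notebook or script is using a different environment from the one where uv pip install was run.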

Next Steps

To run these analyses on your own data, please refer to the ADMDatamart class documentation. For an example of how to use it, see the Example ADM Analysis.

To run the Stand-Alone Application, please refer to the Command Line Interface documentation or the ADM Health Check article.

For information on how to use the Infinity DX client, please refer to the Infinity class documentation or the Prediction Studio API Explainer article.

pdstools also supports analysis of several other types of Pega data. Please see the Examples in the documentation.