Getting Started for Data Scientists
Installation
Quick Start
uv pip install pdstools
Instructions
Pega Data Scientist Tools (pdstools) is a public Python library published on PyPI. You can install it like any other Python library, using your package manager of choice.
Choose your preferred installation method: uv, pip with a virtual environment, or a global pip install.
We strongly prefer uv, as it’s fast, reliable, and handles Python versions automatically.
Why uv? uv automatically manages Python versions, creates isolated environments, and is significantly faster than traditional pip workflows.
Step 1: Install uv
If you haven’t yet, install uv from https://github.com/astral-sh/uv. We recommend the standalone installer, because uv can then update itself with uv self update.
Step 2: Create a virtual environment
Navigate to your desired directory and run:
uv venv
Step 3: Install pdstools
uv pip install pdstools
For optional dependencies:
uv pip install 'pdstools[api]'
For Jupyter notebooks:
uv pip install ipykernel nbformat
Note: If you have no Python installation, or none with a compatible version, uv will automatically install a compatible version for you.
This is the traditional Python approach using pip with virtual environments.
Step 1: Create a virtual environment
Navigate to your desired directory and run:
python -m venv .venv
Step 2: Activate the virtual environment
On macOS/Linux:
source .venv/bin/activate
On Windows:
.venv\Scripts\activate
Step 3: Upgrade pip (recommended)
python -m pip install --upgrade pip
Step 4: Install pdstools
pip install pdstools
For optional dependencies:
pip install 'pdstools[api]'
For Jupyter notebooks:
pip install ipykernel nbformat
Remember: Always activate your virtual environment before working with pdstools:
source .venv/bin/activate (macOS/Linux)
.venv\Scripts\activate (Windows)
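A quick way to confirm that the environment is active is to ask the interpreter itself. The sketch below is not part of pdstools; the helper name in_virtualenv is made up for illustration. It relies only on the standard-library behaviour that sys.prefix differs from sys.base_prefix inside a venv-style environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; packages would install globally.")
```

Run it with the same python command you intend to use for pdstools, since a different interpreter on your PATH may report a different result.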
⚠️ Warning: Installing packages globally can lead to dependency conflicts. We strongly recommend using virtual environments (see the uv and pip + venv methods above).
Step 1: Upgrade pip (recommended)
python -m pip install --upgrade pip
Step 2: Install pdstools globally
pip install pdstools
For optional dependencies:
pip install 'pdstools[api]'
For Jupyter notebooks:
pip install ipykernel nbformat
Consider using virtual environments: Global installations can cause conflicts with other Python projects. Consider switching to the “uv” or “pip + venv” methods for better project isolation.
Optional dependencies
We intentionally limit the number of large core dependencies. This keeps the initial installation fast, but you may occasionally run into import errors, in which case you need to install an additional dependency group.
To install extra dependencies, you can put them in square brackets after a package name. For example, to install the optional dependencies required for using the API features of pdstools:
With uv:
uv pip install 'pdstools[api]'
With pip (activate your virtual environment first, if you use one):
pip install 'pdstools[api]'
For an overview of all optional dependencies and the dependency groups they will be installed for, run the following code:
from pdstools.utils.show_versions import dependency_great_table
dependency_great_table()
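If you would rather check up front than wait for an import error, the standard library can report whether a module is importable without actually importing it. This is a minimal sketch, not a pdstools feature: missing_modules is a hypothetical helper, and the module names in the usage line are only illustrative. Which packages each dependency group installs is what dependency_great_table() reports.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the subset of top-level module names that cannot be imported."""
    # find_spec returns None when no top-level module of that name is found.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative names; substitute the modules your workflow needs.
    print(missing_modules(["polars", "pdstools"]))
```

An empty list means every named module can be found in the current environment. The helper is meant for top-level module names only.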
Python compatibility
Although uv can install a Python version for you, you may not always be free to choose which version you use. We therefore try to support as wide a range of Python versions as we can. The oldest Python version we support is determined by our core dependencies, particularly Polars. As of 2024, Polars supports Python 3.9 and higher, and so do we.
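The version requirement can also be checked from within Python. This is a small sketch, assuming the 3.9 minimum mentioned in the text still holds for your pdstools release; the constant and function names are illustrative only.

```python
import sys

# Minimum version taken from the text above; newer releases may differ.
MIN_PYTHON = (3, 9)


def python_is_supported(version=None, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```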
Checking the Installation
With pdstools[adm] installed, you can verify that the installation works.
If you want to run code from a python notebook, install the additional packages first:
With uv:
uv pip install ipykernel nbformat
With pip (activate your virtual environment first, if you use one):
pip install ipykernel nbformat
To create a ‘bubble chart’ on sample ADM data that plots action Success Rates vs Model Performance, similar to the one in Prediction Studio:
from pdstools import cdh_sample
cdh_sample().plot.bubble_chart()
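To confirm which version was installed without opening a plot, the standard-library importlib.metadata module can read the installed distribution's metadata. This is a sketch under the assumption that the distribution name is pdstools, as used in the install commands; installed_version is a hypothetical helper, not a pdstools function.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pdstools")
    if found is None:
        print("pdstools is not installed in this environment.")
    else:
        print(f"pdstools {found} is installed.")
```

If this reports that pdstools is missing even though you installed it, the most likely cause is that a different environment is active than the one you installed into.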
Next Steps
To run these analyses over your own data, please refer to the ADMDatamart class documentation. For an example of how to use it, refer to the Example ADM Analysis.
To run the Stand-Alone Application, please refer to the Command Line Interface documentation or the ADM Health Check article.
For information on how to use the Infinity DX client, please refer to the Infinity class documentation or the Prediction Studio API Explainer article.
pdstools also supports analysis of several other types of Pega data. Please see the Examples in the documentation.