Command Line Interface
Command line utility to run pdstools apps.
usage: pdstools [-h] [--version] [--data-path DATA_PATH] [--sample SAMPLE]
                [--filter FILTER] [--temp-dir TEMP_DIR]
                [{health_check,decision_analyzer,impact_analyzer,hc,da,ia}]
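As an illustration of how the synopsis above fits together, here is a minimal Python sketch that assembles an argument list from the documented flags. The helper `build_pdstools_command` is hypothetical and not part of pdstools, and the file path and filter values are made-up placeholders. It assumes that `--filter` is passed once per expression.

```python
import shlex


def build_pdstools_command(app, data_path=None, sample=None, filters=(), temp_dir=None):
    """Hypothetical helper (not a pdstools API): build an argv list for the CLI."""
    argv = ["pdstools"]
    if data_path is not None:
        argv += ["--data-path", str(data_path)]
    if sample is not None:
        argv += ["--sample", str(sample)]
    # One --filter flag per expression; the documentation says they are ANDed.
    for expr in filters:
        argv += ["--filter", expr]
    if temp_dir is not None:
        argv += ["--temp-dir", str(temp_dir)]
    # The app name (or its alias) is the single positional argument.
    argv.append(app)
    return argv


# Placeholder path and filter values, for illustration only.
cmd = build_pdstools_command(
    "da",
    data_path="data/export.parquet",
    sample="10%",
    filters=["Channel=Web,Mobile", "Decision Time=2024-01-01..2024-01-31"],
)

# shlex.join quotes arguments that contain spaces, so the result can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` directly, which avoids shell quoting issues for column names that contain spaces.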
Positional Arguments
- app
Possible choices: health_check, decision_analyzer, impact_analyzer, hc, da, ia
The app to run: “health_check” (alias: hc) | “decision_analyzer” (alias: da) | “impact_analyzer” (alias: ia)
Named Arguments
- --version
Show the program’s version number and exit.
- --data-path
Path to a data file or directory to load on startup. Supports parquet, csv, json, arrow, zip, and partitioned folders. Exposed to the app as the PDSTOOLS_DATA_PATH env var.
- --sample
Pre-ingestion interaction sampling for large datasets. Specify an absolute count (e.g. ‘100000’, ‘100k’, ‘1M’) or a percentage (e.g. ‘10%’). All rows for each sampled interaction are kept. Exposed to the app as the PDSTOOLS_SAMPLE_LIMIT env var. To sample programmatically without the app, see pdstools.decision_analyzer.utils.sample_interactions() and pdstools.decision_analyzer.utils.prepare_and_save().
- --filter
Pre-ingestion row filter for extracting specific data from large files. Syntax options: ‘Column=value1,value2,…’ (categorical, exact match), ‘Column>=N’ / ‘Column<=N’ / ‘Column>N’ / ‘Column<N’ (numeric), ‘Column=YYYY-MM-DD..YYYY-MM-DD’ (date range, inclusive). Column names use display names (e.g. ‘Channel’, ‘Decision Time’, ‘ModelPositives’). Multiple --filter flags are ANDed together. Can be combined with --sample (filter is applied first).
- --temp-dir
Directory for temporary files such as the sampled data parquet. Defaults to the current working directory. Exposed to the app as the PDSTOOLS_TEMP_DIR env var.
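To make the --sample and --filter value syntax concrete, the following Python sketch shows one way such strings could be interpreted. It is not the pdstools implementation: `parse_sample_spec` and `parse_filter` are hypothetical names, the returned structures are assumptions chosen for illustration, and the filter values are made up. The sketch treats numeric comparison values as floats and does not handle categorical values that contain `<` or `>`.

```python
import re
from datetime import date


def parse_sample_spec(spec):
    """Hypothetical: '10%' -> ("fraction", 0.1); '100000', '100k', '1M' -> ("count", n)."""
    text = spec.strip()
    if not text:
        raise ValueError("empty sample specification")
    if text.endswith("%"):
        return ("fraction", float(text[:-1]) / 100.0)
    multipliers = {"k": 1_000, "m": 1_000_000}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return ("count", int(float(text[:-1]) * multipliers[suffix]))
    return ("count", int(text))


# Two-character operators are listed first so '>=' is not read as '>' followed by '='.
_COMPARISON = re.compile(r"^(?P<column>.+?)(?P<op>>=|<=|>|<)(?P<value>.+)$")
_DATE_RANGE = re.compile(r"^(\d{4}-\d{2}-\d{2})\.\.(\d{4}-\d{2}-\d{2})$")


def parse_filter(expr):
    """Hypothetical: turn one filter expression into a column/op/value dict."""
    match = _COMPARISON.match(expr)
    if match:
        # Numeric comparison such as 'ModelPositives>=10'.
        return {
            "column": match["column"].strip(),
            "op": match["op"],
            "value": float(match["value"]),
        }
    column, sep, rhs = expr.partition("=")
    if not sep:
        raise ValueError(f"unrecognised filter expression: {expr!r}")
    dates = _DATE_RANGE.match(rhs.strip())
    if dates:
        # Inclusive date range such as 'Decision Time=2024-01-01..2024-01-31'.
        bounds = (date.fromisoformat(dates[1]), date.fromisoformat(dates[2]))
        return {"column": column.strip(), "op": "between", "value": bounds}
    # Categorical exact match against one or more comma-separated values.
    values = [v.strip() for v in rhs.split(",")]
    return {"column": column.strip(), "op": "in", "value": values}


print(parse_sample_spec("100k"))
print(parse_filter("Channel=Web,Mobile"))
print(parse_filter("ModelPositives>=10"))
```

Because multiple --filter flags are ANDed, a caller of this sketch would parse each expression separately and require every resulting condition to hold.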