pdstools.decision_analyzer.data_read_utils
Functions
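| read_nested_zip_files(file_buffer) | Reads a zip file buffer (uploaded from Streamlit) that contains .zip files, which are in fact gzipped ndjson files. |
| read_gzipped_data(data) | Reads gzipped ndjson data from a BytesIO object and returns a Polars DataFrame. |
| read_gzips_with_zip_extension(path) | Iterates over all files with a .zip extension in the given directory, treats them as gzipped ndjson files, reads, and concatenates them into a single Polars DataFrame. |
| read_data(path) | |
| get_da_data_path() | |
| validate_columns(df, extract_type) | |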
Module Contents
- read_nested_zip_files(file_buffer) → polars.DataFrame
Reads a zip file buffer (uploaded from Streamlit) whose members have a .zip extension but are in fact gzipped ndjson files. Extracts, reads, and concatenates them into a single Polars DataFrame.
- Parameters:
file_buffer (UploadedFile) – The uploaded zip file buffer from Streamlit.
- Returns:
A concatenated Polars DataFrame containing the data from all gzipped ndjson files.
- Return type:
pl.DataFrame
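The nested layout described above can be illustrated with a minimal, standard-library-only sketch. It is not the pdstools implementation: `read_nested_rows` is a hypothetical helper that returns a list of dicts rather than a Polars DataFrame, and the member names `part1.zip` and `part2.zip` are invented for the example.

```python
import gzip
import io
import json
import zipfile


def read_nested_rows(file_buffer) -> list:
    """Hypothetical stand-in: collect ndjson rows from an outer zip archive
    whose '.zip' members are really gzip-compressed ndjson."""
    rows = []
    with zipfile.ZipFile(file_buffer) as outer:
        for name in outer.namelist():
            if not name.endswith(".zip"):
                continue  # skip members that do not follow the naming pattern
            payload = outer.read(name)  # raw bytes of the inner "zip"
            text = gzip.decompress(payload).decode("utf-8")
            rows.extend(json.loads(line) for line in text.splitlines() if line.strip())
    return rows


# Build an in-memory outer archive that mimics the described upload.
outer_buffer = io.BytesIO()
with zipfile.ZipFile(outer_buffer, "w") as archive:
    archive.writestr("part1.zip", gzip.compress(b'{"id": 1}\n'))
    archive.writestr("part2.zip", gzip.compress(b'{"id": 2}\n{"id": 3}\n'))
outer_buffer.seek(0)

nested_rows = read_nested_rows(outer_buffer)
```

Any file-like object that supports seeking can stand in for the uploaded buffer here, which is why a plain `io.BytesIO` is used.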
- read_gzipped_data(data: io.BytesIO) → polars.DataFrame | None
Reads gzipped ndjson data from a BytesIO object and returns a Polars DataFrame.
- Parameters:
data (BytesIO) – The gzipped ndjson data.
- Returns:
The Polars DataFrame containing the data, or None if reading fails.
- Return type:
Optional[pl.DataFrame]
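As a rough illustration of the input this function expects and of the "None if reading fails" contract, the sketch below decodes gzipped ndjson with the standard library only. `read_gzipped_ndjson_rows` is a hypothetical stand-in that returns a list of dicts, not a Polars DataFrame, and the set of exceptions it catches is an assumption rather than the behavior of the real function.

```python
import gzip
import io
import json
import zlib
from typing import Optional


def read_gzipped_ndjson_rows(data: io.BytesIO) -> Optional[list]:
    """Hypothetical stand-in: decode gzipped ndjson into a list of dicts,
    returning None when the payload cannot be read."""
    try:
        text = gzip.decompress(data.getvalue()).decode("utf-8")
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    except (OSError, EOFError, zlib.error, ValueError):
        # OSError covers gzip.BadGzipFile; ValueError covers JSON and UTF-8 decode errors.
        return None


# One JSON object per line, compressed with gzip.
valid_payload = io.BytesIO(gzip.compress(b'{"a": 1}\n{"a": 2}\n'))
valid_rows = read_gzipped_ndjson_rows(valid_payload)

# Bytes that are not gzip data at all.
invalid_rows = read_gzipped_ndjson_rows(io.BytesIO(b"not gzip"))
```

Callers of a function with this contract need to check for `None` before using the result.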
- read_gzips_with_zip_extension(path: str) → polars.DataFrame
Iterates over all files with a .zip extension in the given directory, treats them as gzipped ndjson files, reads, and concatenates them into a single Polars DataFrame.
- Parameters:
path (str) – The path to the directory containing the .zip files.
- Returns:
A concatenated Polars DataFrame containing the data from all gzipped ndjson files.
- Return type:
pl.DataFrame
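The directory scan described above could look roughly like the following standard-library sketch. `read_rows_from_directory` is a hypothetical helper that returns a list of dicts instead of a Polars DataFrame; the file names and the sorted iteration order are assumptions made for the example.

```python
import gzip
import json
import tempfile
from pathlib import Path


def read_rows_from_directory(path: str) -> list:
    """Hypothetical stand-in: read every '*.zip' file in a directory as
    gzip-compressed ndjson and concatenate the rows."""
    rows = []
    for file_path in sorted(Path(path).glob("*.zip")):
        # Despite the extension, each file is opened with gzip, not zipfile.
        with gzip.open(file_path, "rt", encoding="utf-8") as handle:
            rows.extend(json.loads(line) for line in handle if line.strip())
    return rows


with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "a.zip").write_bytes(gzip.compress(b'{"n": 1}\n'))
    (Path(tmp_dir) / "b.zip").write_bytes(gzip.compress(b'{"n": 2}\n'))
    (Path(tmp_dir) / "notes.txt").write_text("ignored")  # wrong extension, skipped
    directory_rows = read_rows_from_directory(tmp_dir)
```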
- read_data(path)
- get_da_data_path()
- validate_columns(df: polars.LazyFrame, extract_type: Dict[str, pdstools.decision_analyzer.table_definition.TableConfig])
- Parameters:
df (polars.LazyFrame)
extract_type (Dict[str, pdstools.decision_analyzer.table_definition.TableConfig])