pdstools.utils.report_utils
Attributes
Functions
get_output_filename | Generate the output filename based on the report parameters.
copy_quarto_file | Copy the report Quarto file to the temporary directory.
_write_params_files | Write parameters to YAML files for Quarto processing.
run_quarto | Run the Quarto command to generate the report.
_set_command_options | Set the options for the Quarto command.
copy_report_resources | Copy report resources from the reports directory to specified destinations.
generate_zipped_report | Generate a zipped archive of a directory.
| Get command output in an OS-agnostic way.
| Extract version number from version string.
get_quarto_with_version | Get Quarto executable path and version.
get_pandoc_with_version | Get Pandoc executable path and version.
deserialize_query | Deserialize a query that was previously serialized with serialize_query.
remove_duplicate_html_scripts | Remove duplicate script tags from HTML to reduce file size.
Module Contents
- logger
- get_output_filename(name: str | None, report_type: str, model_id: str | None = None, output_type: str = 'html') -> str
Generate the output filename based on the report parameters.
- copy_quarto_file(qmd_file: str, temp_dir: pathlib.Path) -> None
Copy the report Quarto file to the temporary directory.
- Parameters:
qmd_file (str) – Name of the Quarto markdown file to copy
temp_dir (Path) – Destination directory to copy files to
- Return type:
None
- _write_params_files(temp_dir: pathlib.Path, params: Dict | None = None, project: Dict = {'type': 'default'}, analysis: Dict | None = None) -> None
Write parameters to YAML files for Quarto processing.
- Parameters:
temp_dir (Path) – Directory where YAML files will be written
params (dict, optional) – Parameters to write to params.yml, by default None
project (dict, optional) – Project configuration to write to _quarto.yml, by default {“type”: “default”}
analysis (dict, optional) – Analysis configuration to write to _quarto.yml, by default None
- Return type:
None
- run_quarto(qmd_file: str | None = None, output_filename: str | None = None, output_type: str | None = 'html', params: Dict | None = None, project: Dict = {'type': 'default'}, analysis: Dict | None = None, temp_dir: pathlib.Path = Path('.'), verbose: bool = False, *, remove_duplicate_html_scripts: bool) -> int
Run the Quarto command to generate the report.
- Parameters:
qmd_file (str, optional) – Path to the Quarto markdown file to render, by default None
output_filename (str, optional) – Name of the output file, by default None
output_type (str, optional) – Type of output format (html, pdf, etc.), by default “html”
params (dict, optional) – Parameters to pass to Quarto execution, by default None
project (dict, optional) – Project configuration settings, by default {“type”: “default”}
analysis (dict, optional) – Analysis configuration settings, by default None
temp_dir (Path, optional) – Temporary directory for processing, by default Path(“.”)
verbose (bool, optional) – Whether to print detailed execution logs, by default False
remove_duplicate_html_scripts (bool) – Whether to remove duplicate HTML script tags from the output
- Returns:
Return code from the Quarto process (0 for success)
- Return type:
int
- Raises:
subprocess.SubprocessError – If the Quarto command fails to execute
FileNotFoundError – If required files are not found
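The return-code contract described above (0 for success) is something a caller can check directly. The following is a minimal sketch of that pattern, not the pdstools implementation: `run_command` is a hypothetical helper, and the Python interpreter is used as a stand-in for the Quarto executable so the example runs anywhere.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def run_command(cmd: List[str], cwd: Path = Path(".")) -> int:
    """Hypothetical helper: run a command and return its exit code."""
    # check=False so a non-zero exit code is returned rather than raised
    completed = subprocess.run(
        cmd, cwd=cwd, capture_output=True, text=True, check=False
    )
    return completed.returncode


# Stand-in commands; a real call would launch the Quarto executable instead.
ok = run_command([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_command([sys.executable, "-c", "raise SystemExit(3)"])
```

A caller would typically treat any non-zero value as a failed render and inspect the captured output before retrying.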
- _set_command_options(output_type: str | None = None, output_filename: str | None = None, execute_params: bool = False) -> List[str]
Set the options for the Quarto command.
- Parameters:
output_type (Optional[str])
output_filename (Optional[str])
execute_params (bool)
- Returns:
List of command line options for Quarto
- Return type:
List[str]
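To illustrate how three such arguments might translate into command line options, here is a hedged sketch. `build_render_options` is a hypothetical function, and the mapping is an assumption rather than the pdstools code. `--to`, `--output` and `--execute-params` are options of `quarto render`; the `params.yml` file name follows the _write_params_files description.

```python
from typing import List, Optional


def build_render_options(
    output_type: Optional[str] = None,
    output_filename: Optional[str] = None,
    execute_params: bool = False,
) -> List[str]:
    """Hypothetical sketch: assemble options for `quarto render`."""
    options: List[str] = []
    if output_type is not None:
        options += ["--to", output_type]
    if output_filename is not None:
        options += ["--output", output_filename]
    if execute_params:
        # Assumes the parameters were written to params.yml beforehand
        options += ["--execute-params", "params.yml"]
    return options


opts = build_render_options("html", "report.html", execute_params=True)
```

Returning a list rather than a single string keeps each option a separate argument, which avoids shell quoting issues when the list is passed to a subprocess call.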
- copy_report_resources(resource_dict: list[tuple[str, str]])
Copy report resources from the reports directory to specified destinations.
- generate_zipped_report(output_filename: str, folder_to_zip: str)
Generate a zipped archive of a directory.
This is a general-purpose utility that compresses any directory into a zip archive. Despite its name, it is not limited to report output and works with any directory structure.
- Parameters:
output_filename (str)
folder_to_zip (str)
- Return type:
None
- Raises:
FileNotFoundError – If the folder to zip does not exist or is not a directory
Examples
>>> generate_zipped_report("my_archive.zip", "/path/to/directory")
>>> generate_zipped_report("report_2023", "/tmp/report_output")
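The behaviour documented for generate_zipped_report (zip any directory, raise FileNotFoundError when the folder is missing) can be approximated with the standard library. This is a sketch under assumptions, not the pdstools source: `zip_directory` is a hypothetical name, and the handling of a trailing `.zip` suffix is a guess at how both example calls could be supported.

```python
import shutil
import tempfile
from pathlib import Path


def zip_directory(output_filename: str, folder_to_zip: str) -> Path:
    """Hypothetical sketch: compress a directory into a zip archive."""
    folder = Path(folder_to_zip)
    if not folder.is_dir():
        raise FileNotFoundError(f"Not a directory: {folder_to_zip}")
    # make_archive appends ".zip" itself, so drop an existing ".zip" suffix
    base_name = output_filename
    if base_name.lower().endswith(".zip"):
        base_name = base_name[:-4]
    return Path(shutil.make_archive(base_name, "zip", root_dir=folder))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "report_output"
    source.mkdir()
    (source / "index.html").write_text("<html></html>")
    archive = zip_directory(str(Path(tmp) / "my_archive.zip"), str(source))
    archive_name = archive.name
    archive_exists = archive.exists()
```

Writing the archive outside the folder being zipped, as the usage above does, avoids the archive trying to include itself.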
- get_quarto_with_version(verbose: bool = True) -> Tuple[pathlib.Path, str]
Get Quarto executable path and version.
- Parameters:
verbose (bool)
- Return type:
Tuple[pathlib.Path, str]
- get_pandoc_with_version(verbose: bool = True) -> Tuple[pathlib.Path, str]
Get Pandoc executable path and version.
- Parameters:
verbose (bool)
- Return type:
Tuple[pathlib.Path, str]
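Both version lookups return an executable path together with a version string, which suggests two steps: locate the executable, then extract the version number from its output. The sketch below shows one way to do each step with the standard library. `extract_version` and `find_executable` are hypothetical names, and the regular expression is an assumption about what a version string looks like.

```python
import re
import shutil
from pathlib import Path
from typing import Optional


def extract_version(version_string: str) -> str:
    """Hypothetical sketch: pull the first dotted number out of a string."""
    match = re.search(r"\d+(?:\.\d+)+", version_string)
    if match is None:
        raise ValueError(f"No version number found in {version_string!r}")
    return match.group(0)


def find_executable(name: str) -> Optional[Path]:
    """Return the executable's path if it is on PATH, else None."""
    found = shutil.which(name)
    return Path(found) if found else None


version = extract_version("pandoc 3.1.11.1")
```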
- quarto_print(text)
- quarto_callout_info(info)
- quarto_callout_important(info)
- quarto_callout_no_prediction_data_warning(extra='')
- quarto_callout_no_predictor_data_warning(extra='')
- polars_col_exists(df, col)
- polars_subset_to_existing_cols(all_columns, cols)
- table_standard_formatting(source_table, title=None, subtitle=None, rowname_col=None, groupname_col=None, cdh_guidelines=CDHGuidelines(), highlight_limits: Dict[str, str | List[str]] = {}, highlight_lists: Dict[str, List[str]] = {}, highlight_configurations: List[str] = [], rag_styler: Callable = rag_background_styler)
- table_style_predictor_count(gt, flds, cdh_guidelines=CDHGuidelines(), rag_styler=rag_textcolor_styler)
- n_unique_values(dm, all_dm_cols, fld)
- max_by_hierarchy(dm, all_dm_cols, fld, grouping)
- avg_by_hierarchy(dm, all_dm_cols, fld, grouping)
- sample_values(dm, all_dm_cols, fld, n=6)
- serialize_query(query: pdstools.utils.types.QUERY | None) -> Dict | None
- Parameters:
query (Optional[pdstools.utils.types.QUERY])
- Return type:
Optional[Dict]
- deserialize_query(serialized_query: Dict | None) -> pdstools.utils.types.QUERY | None
Deserialize a query that was previously serialized with serialize_query.
- Parameters:
serialized_query (Optional[Dict]) – A serialized query dictionary created by serialize_query
- Returns:
The deserialized query
- Return type:
Optional[QUERY]
- remove_duplicate_html_scripts(html_content: str, verbose: bool = False) -> str
Remove duplicate script tags from HTML to reduce file size.
Specifically targets large JavaScript libraries (like Plotly.js) that get embedded multiple times in HTML reports, while preserving all unique plot data and initialization scripts.
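The idea described above, dropping repeated copies of a large embedded library while keeping the small per-plot scripts, can be sketched with a regular expression. This is an illustration only, not the pdstools implementation: `drop_repeated_scripts` and its `min_size` threshold are hypothetical, and exact string equality is assumed as the test for a duplicate.

```python
import re


def drop_repeated_scripts(html_content: str, min_size: int = 1000) -> str:
    """Hypothetical sketch: keep only the first copy of each large script."""
    seen = set()

    def keep_first(match: re.Match) -> str:
        block = match.group(0)
        if len(block) < min_size:
            return block  # small scripts (plot data, init calls) are left alone
        if block in seen:
            return ""  # an identical large block was already emitted
        seen.add(block)
        return block

    pattern = re.compile(
        r"<script\b[^>]*>.*?</script>", re.DOTALL | re.IGNORECASE
    )
    return pattern.sub(keep_first, html_content)


library = "<script>" + "var lib = 1;" * 200 + "</script>"
plot_a = "<script>draw('a')</script>"
plot_b = "<script>draw('b')</script>"
page = library + plot_a + library + plot_b
cleaned = drop_repeated_scripts(page)
```

In this usage the second copy of the large block is removed while both small plot scripts survive in their original order.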