pdstools.utils.cdh_utils._namespacing
=====================================

.. py:module:: pdstools.utils.cdh_utils._namespacing

.. autoapi-nested-parse::

   Pega-style field-name normalisation and predictor categorisation.


Functions
---------

.. autoapisummary::

   pdstools.utils.cdh_utils._namespacing._capitalize
   pdstools.utils.cdh_utils._namespacing.default_predictor_categorization


Module Contents
---------------

.. py:function:: _capitalize(fields: str | collections.abc.Iterable[str], extra_endwords: collections.abc.Iterable[str] | None = None) -> list[str]

   Applies automatic capitalization, aligned with the R counterpart.

   :param fields: A single name or an iterable of names
   :type fields: str | Iterable[str]

   :returns: **fields** -- The input names, each properly capitalized
   :rtype: list

   .. rubric:: Notes

   The ``capitalize_endwords`` list contains atomic word parts that are
   commonly found in Pega field names. Compound words (like "ResponseCount")
   don't need to be listed separately because the algorithm processes words
   by length, allowing shorter components ("Response", "Count") to handle them.


.. py:function:: default_predictor_categorization(x: str | polars.Expr = pl.col('PredictorName')) -> polars.Expr

   Determines the 'category' of a predictor.

   It is possible to supply a custom function. Such a function can accept an
   optional column as input and should return a Polars expression. The most
   straightforward way to implement this is with
   ``pl.when().then().otherwise()``, which can be chained.

   By default, this function returns "Primary" whenever there is no '.'
   anywhere in the name string; otherwise it returns the part of the name
   before the first period.

   :param x: The column to parse
   :type x: str | pl.Expr, default = pl.col('PredictorName')