Rule Analysis

Functions

parse_conditions

iguanas.rule_analysis.parse_conditions(expr: str) → dict

Parse a boolean expression string into a nested dict tree.

Parameters: expr (str) – Boolean rule expression using the & (AND) and | (OR) operators, e.g. '(X["a"] > 1) & (X["b"] < 5)'.
Returns: Nested dict with keys "op" ("&" or "|"), "left", and "right". Leaf nodes are plain strings.
Return type: dict
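Examples

The following is a minimal, self-contained sketch of how a tree with the documented shape could be produced. It is not the library's implementation: the names strip_outer_parens, split_top_level, and sketch_parse_conditions are hypothetical, and several behaviours are assumptions (leaves are returned without their wrapping parentheses, chains such as A & B & C nest to the right, | is split before & to mirror Python's operator precedence, and operators inside string literals are not handled).

```python
from __future__ import annotations


def strip_outer_parens(expr: str) -> str:
    """Remove parentheses that wrap the whole expression, e.g. '((a))' -> 'a'."""
    expr = expr.strip()
    while expr.startswith("(") and expr.endswith(")"):
        depth = 0
        for i, ch in enumerate(expr):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    break
        if i != len(expr) - 1:
            # The first "(" closes before the end, so it does not wrap everything.
            break
        expr = expr[1:-1].strip()
    return expr


def split_top_level(expr: str, op: str) -> list[str]:
    """Split expr on op wherever op sits outside all parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == op and depth == 0:
            parts.append(expr[start:i].strip())
            start = i + 1
    parts.append(expr[start:].strip())
    return parts


def sketch_parse_conditions(expr: str) -> dict | str:
    """Hypothetical stand-in that returns the documented op/left/right shape."""
    inner = strip_outer_parens(expr)
    for op in ("|", "&"):  # "|" binds more loosely than "&", so split on it first
        parts = split_top_level(inner, op)
        if len(parts) > 1:
            return {
                "op": op,
                "left": sketch_parse_conditions(parts[0]),
                # Assumption: the remaining operands nest to the right.
                "right": sketch_parse_conditions(f" {op} ".join(parts[1:])),
            }
    return inner  # leaf: a plain string


tree = sketch_parse_conditions('(X["a"] > 1) & (X["b"] < 5)')
# tree == {"op": "&", "left": 'X["a"] > 1', "right": 'X["b"] < 5'}
```

Under these assumptions, a nested rule such as 'A | (B & C)' yields a dict whose "right" value is itself a dict with "op" equal to "&".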

parse_levels

iguanas.rule_analysis.parse_levels(expr: str) → list[dict]

Parse a boolean expression level by level using breadth-first search (BFS).

Assigns a hierarchical dot-notation index to each sub-expression so the original expression can be rebuilt bottom-up.

Parameters: expr (str) – Boolean rule expression using the & (AND) and | (OR) operators.
Returns: BFS-ordered list of level entries. Each entry is a dict with a single key (the operator "&" or "|"), whose value is a list of (index, sub_expr) tuples. Indices use dot notation reflecting position in the tree (e.g. "1.0" is the first child of the item indexed "1" in the parent level).
Return type: list[dict]

Examples

>>> parse_levels('(A > 1) | ((B <= 5) & (C < 3)) | (D >= 0)')
[
    {'|': [('0', '(A > 1)'), ('1', '(B <= 5) & (C < 3)'), ('2', '(D >= 0)')]},
    {'&': [('1.0', '(B <= 5)'), ('1.1', '(C < 3)')]},
]
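
To make the BFS indexing concrete, here is a hedged sketch that reproduces the output shown above for that one expression. It is an illustration only, not the library's code: the helper names (_strip_outer, _split_top, sketch_parse_levels) are hypothetical, and it assumes that compound children are stored without their wrapping parentheses while leaves keep theirs, as the documented example suggests.

```python
from collections import deque


def _strip_outer(expr):
    """Remove parentheses that wrap the whole expression."""
    expr = expr.strip()
    while expr.startswith("(") and expr.endswith(")"):
        depth = 0
        for i, ch in enumerate(expr):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    break
        if i != len(expr) - 1:
            break  # the first "(" closes early, so nothing wraps the whole string
        expr = expr[1:-1].strip()
    return expr


def _split_top(expr):
    """Return (op, parts) for the loosest top-level operator, or (None, [expr])."""
    for op in ("|", "&"):
        parts, depth, start = [], 0, 0
        for i, ch in enumerate(expr):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == op and depth == 0:
                parts.append(expr[start:i].strip())
                start = i + 1
        parts.append(expr[start:].strip())
        if len(parts) > 1:
            return op, parts
    return None, [expr]


def sketch_parse_levels(expr):
    """Hypothetical BFS walk that emits one level entry per compound node."""
    levels = []
    queue = deque([("", expr)])  # (index of this node, its expression text)
    while queue:
        prefix, current = queue.popleft()
        op, parts = _split_top(_strip_outer(current))
        if op is None:
            continue  # a leaf has no level of its own
        entries = []
        for pos, part in enumerate(parts):
            index = f"{prefix}.{pos}" if prefix else str(pos)
            inner = _strip_outer(part)
            if _split_top(inner)[0] is not None:
                entries.append((index, inner))  # compound child: visit it later
                queue.append((index, inner))
            else:
                entries.append((index, part))  # leaf child: keep text as written
        levels.append({op: entries})
    return levels


levels = sketch_parse_levels('(A > 1) | ((B <= 5) & (C < 3)) | (D >= 0)')
```

Because the queue is first-in, first-out, every entry for one depth is emitted before any entry for the next depth, which is what makes the list BFS-ordered.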

rebuild_from_levels

iguanas.rule_analysis.rebuild_from_levels(levels: list[dict]) → str

Rebuild the original boolean expression from parse_levels output.

Processes levels bottom-up: the deepest compound sub-expressions are collapsed first, then their rebuilt strings replace the placeholder in the parent level.

Parameters: levels (list[dict]) – Output of parse_levels().
Returns: Reconstructed boolean expression string.
Return type: str
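
Examples

The bottom-up pass can be sketched as follows, using the level structure documented for parse_levels as input. This is an assumption-laden illustration rather than the library's implementation: sketch_rebuild_from_levels is a hypothetical name, each level is assumed to hold exactly one operator key, and rebuilt compound children are assumed to be wrapped in parentheses when substituted into their parent.

```python
def sketch_rebuild_from_levels(levels):
    """Hypothetical bottom-up rebuild of an expression from BFS level entries."""
    rebuilt = {}  # parent index -> expression text rebuilt from its children
    for level in reversed(levels):  # deepest levels come last in BFS order
        ((op, entries),) = level.items()  # assumes a single operator key per level
        # All entries in a level share a parent: "1.0" -> "1", "0" -> "" (root).
        parent = entries[0][0].rpartition(".")[0]
        parts = [
            f"({rebuilt[index]})" if index in rebuilt else sub_expr
            for index, sub_expr in entries
        ]
        rebuilt[parent] = f" {op} ".join(parts)
    return rebuilt.get("", "")


example_levels = [
    {'|': [('0', '(A > 1)'), ('1', '(B <= 5) & (C < 3)'), ('2', '(D >= 0)')]},
    {'&': [('1.0', '(B <= 5)'), ('1.1', '(C < 3)')]},
]
expression = sketch_rebuild_from_levels(example_levels)
# expression == '(A > 1) | ((B <= 5) & (C < 3)) | (D >= 0)'
```

Iterating in reverse guarantees that a child's rebuilt string exists before its parent level is joined, since BFS order always lists a parent level before the levels of its children.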

generate_rule_performance_report

iguanas.rule_analysis.generate_rule_performance_report(rules: str | list[str], X: polars.DataFrame, y: polars.Series, weights: polars.Series | None = None) → polars.DataFrame

Parse each rule in rules into its components (BFS levels), evaluate every component on X, compute metrics, and return a DataFrame with one row per component across all rules.

The rule_index column uses dot notation with the rule’s position in the list prepended as the root level, e.g. for the 2nd rule: "2.0", "2.1", "2.1.0", "2.1.1", …

Parameters:

rules (str | list[str]) – List of boolean rule expression strings (using & / | operators). A single string is also accepted and treated as a one-element list.
X (pl.DataFrame) – Feature DataFrame on which to evaluate each component.
y (pl.Series) – Boolean target series.
weights (pl.Series | None, default=None) – Optional sample weights passed to compute_metrics.

Returns:

One row per component, with the columns rule_index and rule plus all metric columns from compute_metrics.

Return type:

pl.DataFrame
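
Examples

The rule_index scheme described above can be illustrated without running the report itself. The sketch below only shows how a rule's position might be prepended to the dot-notation indices of its components; sketch_rule_indices is a hypothetical helper, and both the row order and the 1-based position (inferred from the "2nd rule" example) are assumptions.

```python
def sketch_rule_indices(levels, rule_position):
    """Prefix every component index with the rule's position in the rules list."""
    return [
        f"{rule_position}.{index}"
        for level in levels
        for entries in level.values()
        for index, _sub_expr in entries
    ]


# Level entries in the format documented for parse_levels.
second_rule_levels = [
    {'|': [('0', '(A > 1)'), ('1', '(B <= 5) & (C < 3)'), ('2', '(D >= 0)')]},
    {'&': [('1.0', '(B <= 5)'), ('1.1', '(C < 3)')]},
]
indices = sketch_rule_indices(second_rule_levels, rule_position=2)
# indices == ['2.0', '2.1', '2.2', '2.1.0', '2.1.1']
```

Each resulting index identifies one component row, so "2.1.0" reads as the first child of component 1 of the 2nd rule.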