Metrics

Functions

compute_single_metric

iguanas.metrics.compute_single_metric(y_pred: polars.Series, y: polars.Series, metric: str, weights: polars.Series | None = None) → float

Compute a single performance metric for one boolean prediction series.

Faster than compute_metrics when only one scalar is needed, because it skips computing all 25+ derived metrics. Used internally by combine_rules_beam_search during candidate evaluation.

Parameters:

y_pred (pl.Series) – Boolean prediction series.
y (pl.Series) – Boolean target series.
metric (str) – Metric name: “precision”, “recall”, “accuracy”, or an F-beta score (f<number>).
weights (pl.Series | None, default=None) – Optional sample weights. When provided, all counts use weighted sums.

Returns:

The requested metric value.

Return type:

float
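The way a single metric can be derived from two boolean series, with optional weights, can be sketched in plain Python. This is an illustrative sketch, not the iguanas implementation: the function name single_metric_sketch is hypothetical, plain lists stand in for pl.Series, and returning 0.0 when a denominator is zero is an assumption about edge-case handling.

```python
from typing import Optional, Sequence


def single_metric_sketch(
    y_pred: Sequence[bool],
    y: Sequence[bool],
    metric: str,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Hypothetical helper: compute one metric from boolean predictions."""
    # Without weights, every observation contributes 1 to its cell.
    w = [1.0] * len(y) if weights is None else list(weights)

    # Confusion-matrix cells as (weighted) sums.
    tp = sum(wi for p, t, wi in zip(y_pred, y, w) if p and t)
    fp = sum(wi for p, t, wi in zip(y_pred, y, w) if p and not t)
    fn = sum(wi for p, t, wi in zip(y_pred, y, w) if not p and t)
    tn = sum(wi for p, t, wi in zip(y_pred, y, w) if not p and not t)

    # Assumption: a zero denominator yields 0.0 rather than an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    if metric == "precision":
        return precision
    if metric == "recall":
        return recall
    if metric == "accuracy":
        total = tp + fp + fn + tn
        return (tp + tn) / total if total else 0.0
    if metric.startswith("f"):
        # "f0.5" -> beta = 0.5; float() raises ValueError on a malformed name.
        beta = float(metric[1:])
        denom = beta**2 * precision + recall
        return (1 + beta**2) * precision * recall / denom if denom else 0.0
    raise ValueError(f"Unknown metric: {metric!r}")
```

Only the requested value is returned, which mirrors the reason given above for preferring compute_single_metric over compute_metrics when one scalar is enough.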

compute_metrics

iguanas.metrics.compute_metrics(R: polars.Series | polars.DataFrame, y: polars.Series, weights: polars.Series | None = None, betas: list[float] | None = None) → polars.DataFrame

Compute comprehensive performance metrics for all rule columns.

Calculates the confusion matrix, precision, recall, F-beta scores, and TPVE metrics for each rule. When weights is provided, weighted versions of all metrics are computed as well.

Parameters:

R (pl.Series | pl.DataFrame) – Boolean rule predictions. Each DataFrame column (or the Series itself, when a single Series is passed) is a rule that evaluates to True/False for each observation.
y (pl.Series) – Boolean target series indicating true labels (True for positive class). Will be cast to Boolean if not already.
weights (pl.Series | None, default=None) – Optional numeric series for weighted metrics computation. If provided, both count-based and weighted versions of all metrics are computed.
betas (list[float] | None, default=None) – F-beta values to compute. When None, the values [0.25, 0.5, 1, 1.5, 2] are used. Each value b produces a column named f{b} (and f{b}_weight when weights is provided).

Returns:

DataFrame with one row per rule containing:

rule: Rule name (column name from R)
TP, FP, TN, FN: Confusion matrix counts
precision, recall, accuracy: Standard classification metrics
flagged(%): Percentage of all observations flagged as positive
good_flagged(%): Percentage of negatives flagged as positive
f{b} for each b in betas: F-beta scores
num_rules: Number of individual rules (1 for single rules)

If weights is provided, additional columns with “_weight” suffix:

TP_weight, FP_weight, TN_weight, FN_weight: Weighted confusion matrix
total_weight, precision_weight, recall_weight, accuracy_weight: Weighted versions
f{b}_weight for each b in betas: Weighted F-beta scores

Return type:

pl.DataFrame
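How the derived columns relate to the confusion-matrix counts can be sketched for a single rule. This is a hedged illustration based on the column descriptions above, not the iguanas code: metrics_row_sketch is a hypothetical helper, the zero-denominator fallback of 0.0 is an assumption, and the exact formatting of the f{b} column names in the library may differ from the Python f-string used here.

```python
def metrics_row_sketch(tp, fp, tn, fn, betas=(0.25, 0.5, 1, 1.5, 2)):
    """Hypothetical helper: build one row of per-rule metrics from counts."""
    total = tp + fp + tn + fn
    flagged = tp + fp      # everything the rule marked as positive
    positives = tp + fn    # observations whose label is True
    negatives = fp + tn    # observations whose label is False

    # Assumption: a zero denominator yields 0.0 rather than an error.
    precision = tp / flagged if flagged else 0.0
    recall = tp / positives if positives else 0.0

    row = {
        "TP": tp, "FP": fp, "TN": tn, "FN": fn,
        "precision": precision,
        "recall": recall,
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of all observations that the rule flags.
        "flagged(%)": 100 * flagged / total if total else 0.0,
        # Share of true negatives' population that the rule flags anyway.
        "good_flagged(%)": 100 * fp / negatives if negatives else 0.0,
    }

    # One F-beta column per requested beta.
    for b in betas:
        denom = b**2 * precision + recall
        row[f"f{b}"] = (1 + b**2) * precision * recall / denom if denom else 0.0
    return row


row = metrics_row_sketch(tp=8, fp=2, tn=88, fn=2)
```

The weighted columns follow the same arithmetic with TP_weight, FP_weight, TN_weight and FN_weight in place of the counts.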

Examples

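>>> import polars as pl
>>> from iguanas.metrics import compute_metrics
>>>
>>> # R (boolean rule columns), y (boolean target) and transaction_amounts
>>> # (numeric weights) are assumed to be defined already.
>>>
>>> # Count-based metrics only
>>> metrics_df = compute_metrics(R, y, weights=None)
>>>
>>> # Both count and weighted metrics
>>> metrics_df = compute_metrics(R, y, weights=transaction_amounts)
>>>
>>> # Sort by TPVE3 to find best rules
>>> top_rules = metrics_df.sort("TPVE3", descending=True).head(10)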
