Rule Classifier
Classes
RuleClassifier
- class iguanas.rule_classifier.RuleClassifier
Bases:
pydantic.main.BaseModel, sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
Rule-based classifier that selects the single best rule.
The best rule is selected through the following steps:
Rule generation: candidate rules are extracted from XGBoost decision trees trained across a sweep of scale_pos_weight values.
Performance filtering: rules that fail any condition in metric_thresholds are discarded.
Ranking: the surviving rules are sorted by ranking_metric (descending) and the top-ranked rule is stored in _best_rule_.
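The filtering and ranking steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the candidate rules, their metric values, and the helper names passes_thresholds and select_best_rule are hypothetical, and returning None when no rule survives is an assumption. Only the threshold dict format ("name", "operator", "value", combined with AND) and the descending sort on ranking_metric come from this page.

```python
import operator

# Map the documented operator strings to Python comparison functions.
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
    "!=": operator.ne,
}


def passes_thresholds(metrics, metric_thresholds):
    """Return True only if every threshold condition holds (AND)."""
    return all(
        OPERATORS[t["operator"]](metrics[t["name"]], t["value"])
        for t in metric_thresholds
    )


def select_best_rule(candidates, metric_thresholds, ranking_metric):
    """Drop rules that fail any threshold, then keep the top-ranked survivor."""
    survivors = [
        c for c in candidates if passes_thresholds(c["metrics"], metric_thresholds)
    ]
    if not survivors:
        return None  # assumption: the real class may handle this case differently
    # Taking the maximum is equivalent to sorting descending and keeping the first.
    return max(survivors, key=lambda c: c["metrics"][ranking_metric])


# Hypothetical candidate rules with made-up metric values.
candidates = [
    {"rule": "amount > 100", "metrics": {"precision": 0.90, "recall": 0.20, "accuracy": 0.80}},
    {"rule": "amount > 50", "metrics": {"precision": 0.70, "recall": 0.60, "accuracy": 0.85}},
    {"rule": "age < 25", "metrics": {"precision": 0.40, "recall": 0.90, "accuracy": 0.95}},
]
metric_thresholds = [
    {"name": "precision", "operator": ">=", "value": 0.6},
    {"name": "recall", "operator": ">", "value": 0.1},
]

best = select_best_rule(candidates, metric_thresholds, ranking_metric="accuracy")
print(best["rule"])
```

In this sketch the rule with the highest accuracy is discarded because it fails the precision threshold, so filtering happens before ranking, as in the steps above.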
- Parameters:
estimator (XGBClassifier) – XGBoost classifier used for rule generation.
scale_pos_weights (list[float] | np.ndarray, default=np.array([1.0])) – Array of scale_pos_weight values swept during rule generation.
sample_weights_df (pl.DataFrame | None, default=None) – DataFrame of sample weights used for rule generation.
ranking_metric (str, default="accuracy") – Metric used to rank candidate rules. The single highest-scoring rule is kept. Must be a column produced by compute_metrics.
metric_thresholds (list[dict[str, Any]] | None, default=None) – List of threshold dicts used to filter candidate rules. Each dict must have keys "name" (metric column), "operator" (one of ">=", ">", "<=", "<", "==", "!="), and "value" (numeric threshold). All conditions are combined with AND. If None, the default threshold of apply_and_filter_by_performance is used.
- fit(X: polars.DataFrame, y: polars.Series) → iguanas.rule_classifier.RuleClassifier
Generate, filter, and select the single best rule from training data.
- Parameters:
X (pl.DataFrame) – Feature DataFrame. Only numeric columns are used for rule generation.
y (pl.Series) – Binary target series.
- Returns:
Fitted classifier instance (self).
- Return type:
RuleClassifier
- predict(X: polars.DataFrame) → polars.Series
Predict binary labels using the single best rule.
- Parameters:
X (pl.DataFrame) – Feature DataFrame with the same columns seen during fit.
- Returns:
Boolean series named “prediction”.
- Return type:
pl.Series
- predict_proba(X: polars.DataFrame) → polars.Series
Predict probability using the single best rule.
Rule fires → 1.0
Rule does not fire → 0.0
- Parameters:
X (pl.DataFrame) – Feature DataFrame with the same columns seen during fit.
- Returns:
Float64 series named “proba” with values in {0.0, 1.0}.
- Return type:
pl.Series
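The relationship between predict and predict_proba can be illustrated with a small plain-Python sketch. The rule (rule_fires) and the rows are hypothetical, and lists stand in for the polars Series the methods actually return. Only the mapping itself (rule fires → 1.0, rule does not fire → 0.0) is taken from this page.

```python
def rule_fires(row):
    """Hypothetical rule: fires when amount > 50 and age < 30."""
    return row["amount"] > 50 and row["age"] < 30


# Made-up rows standing in for the rows of a feature DataFrame.
rows = [
    {"amount": 80, "age": 25},
    {"amount": 20, "age": 25},
    {"amount": 80, "age": 40},
]

# Boolean predictions, analogous to the "prediction" series.
predictions = [rule_fires(row) for row in rows]

# Hard 0.0 / 1.0 scores, analogous to the "proba" series.
proba = [1.0 if fired else 0.0 for fired in predictions]

print(predictions)
print(proba)
```

Because a single rule either fires or does not, the resulting scores are not calibrated probabilities; they carry the same information as the boolean predictions.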
- fit_predict(X: polars.DataFrame, y: polars.Series) → polars.Series
Fit classifier and return binary predictions on the same data.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → iguanas.rule_classifier.RuleClassifier
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type: