Data Cleaning

These transformers can be used to reduce the number of columns during the feature selection step.

Base data_cleaning transformer

_BaseDataCleaning

Base data cleaning transformer.

Off-line data cleaning

DropHighCardinality

Drop the categorical columns with a high cardinality (a large number of distinct values).
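The idea can be sketched with plain pandas. This is an illustration only, not this library's implementation: the function name `drop_high_cardinality`, the `max_categories` parameter, and the choice to treat every non-numeric column as categorical are all assumptions.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def drop_high_cardinality(df: pd.DataFrame, max_categories: int) -> pd.DataFrame:
    """Hypothetical helper: drop non-numeric columns with too many distinct values."""
    # Assumption: any non-numeric column counts as categorical.
    categorical = [c for c in df.columns if not is_numeric_dtype(df[c])]
    # A column is dropped when its distinct-value count exceeds the threshold.
    to_drop = [c for c in categorical if df[c].nunique() > max_categories]
    return df.drop(columns=to_drop)


df_high = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "color": ["red", "red", "blue", "blue"],
        "x": [1, 2, 3, 4],
    }
)
cleaned_high = drop_high_cardinality(df_high, max_categories=2)
```

Here `id` has four distinct values and is removed, while `color` (two distinct values) and the numeric column `x` are kept.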

DropHighNaNRatio

Drop the columns with a high ratio of NaN values.
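A minimal pandas sketch of this filter follows. The function name `drop_high_nan_ratio` and the `max_ratio` parameter are hypothetical and are not taken from this library.

```python
import numpy as np
import pandas as pd


def drop_high_nan_ratio(df: pd.DataFrame, max_ratio: float) -> pd.DataFrame:
    """Hypothetical helper: keep only columns whose NaN ratio is at most max_ratio."""
    # The mean of the boolean NaN mask gives the fraction of missing values per column.
    nan_ratio = df.isna().mean()
    return df.loc[:, nan_ratio <= max_ratio]


df_nan = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, np.nan],  # 75% missing
        "b": [1.0, 2.0, np.nan, 4.0],        # 25% missing
        "c": [1, 2, 3, 4],                   # no missing values
    }
)
cleaned_nan = drop_high_nan_ratio(df_nan, max_ratio=0.5)
```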

DropLowCardinality

Drop the categorical columns with a low cardinality (few distinct values).
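A typical use is removing constant columns, which carry no information. The sketch below uses plain pandas; `drop_low_cardinality` and `min_categories` are assumed names, and non-numeric columns are assumed to be the categorical ones.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def drop_low_cardinality(df: pd.DataFrame, min_categories: int) -> pd.DataFrame:
    """Hypothetical helper: drop non-numeric columns with too few distinct values."""
    # Assumption: any non-numeric column counts as categorical.
    categorical = [c for c in df.columns if not is_numeric_dtype(df[c])]
    to_drop = [c for c in categorical if df[c].nunique() < min_categories]
    return df.drop(columns=to_drop)


df_low = pd.DataFrame(
    {
        "country": ["FR", "FR", "FR"],      # constant column
        "city": ["Paris", "Lyon", "Nice"],
        "x": [1, 2, 3],
    }
)
cleaned_low = drop_low_cardinality(df_low, min_categories=2)
```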

Real-time data cleaning

ConvertColumnDatatype

Set the datatype of the selected columns to a given datatype.
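In pandas terms, this amounts to casting a subset of columns. The sketch below is an assumption-based illustration; `convert_column_datatype` and its parameters are hypothetical names, not this library's API.

```python
import pandas as pd


def convert_column_datatype(df: pd.DataFrame, columns: list, datatype) -> pd.DataFrame:
    """Hypothetical helper: cast the selected columns to one datatype."""
    # astype returns a new DataFrame; the columns not listed are left unchanged.
    return df.astype({column: datatype for column in columns})


df_types = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "y"]})
converted = convert_column_datatype(df_types, columns=["a", "b"], datatype="float64")
```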

DropColumns

Drop the columns given by the user.

DropDatatypeColumns

Drop the columns of a given datatype.
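One way to express this with plain pandas is `DataFrame.select_dtypes` with the `exclude` argument. The wrapper name `drop_datatype_columns` below is hypothetical.

```python
import pandas as pd


def drop_datatype_columns(df: pd.DataFrame, datatype) -> pd.DataFrame:
    """Hypothetical helper: drop every column whose dtype matches `datatype`."""
    return df.select_dtypes(exclude=datatype)


df_mixed = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5], "c": ["x", "y"]})
without_floats = drop_datatype_columns(df_mixed, "float64")
```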

KeepColumns

Keep only the columns given by the user; drop all the others.
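With plain pandas, this is a column selection. The sketch below is illustrative only; `keep_columns` is a hypothetical name.

```python
import pandas as pd


def keep_columns(df: pd.DataFrame, columns_to_keep: list) -> pd.DataFrame:
    """Hypothetical helper: return only the requested columns."""
    # Raises KeyError if a requested column is missing from the DataFrame.
    return df.loc[:, columns_to_keep]


df_wide = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
kept = keep_columns(df_wide, ["a", "c"])
```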

Replace

Replace the categorical values with the ones given by the user.
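A user-supplied replacement can be pictured as one mapping per column. The sketch below uses plain pandas; the function name `replace_values` and the nested-dictionary format of `to_replace` are assumptions rather than this library's documented interface.

```python
import pandas as pd


def replace_values(df: pd.DataFrame, to_replace: dict) -> pd.DataFrame:
    """Hypothetical helper: apply a {column: {old: new}} mapping."""
    out = df.copy()
    for column, mapping in to_replace.items():
        # Values that do not appear in the mapping are left untouched.
        out[column] = out[column].replace(mapping)
    return out


df_colors = pd.DataFrame({"color": ["red", "blue", "green"], "x": [1, 2, 3]})
replaced = replace_values(df_colors, {"color": {"red": "warm", "blue": "cold"}})
```

In this example `green` has no entry in the mapping, so it is kept as is.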