gators.converter.ToNumpy

class gators.converter.ToNumpy

Convert a dataframe and a series to NumPy arrays.

Examples

Imports and initialization:

>>> from gators.converter import ToNumpy
>>> obj = ToNumpy()

The transform method accepts:

  • dask dataframes:

>>> import dask.dataframe as dd
>>> import pandas as pd
>>> X = dd.from_pandas(pd.DataFrame({
... 'A': [0.0, 3.0, 6.0],
... 'B': [1.0, 4.0, 7.0],
... 'C': [2.0, 5.0, 8.0]}), npartitions=1)
>>> y = dd.from_pandas(pd.Series([0, 0, 1], name='TARGET'), npartitions=1)

  • koalas dataframes:

>>> import databricks.koalas as ks
>>> X = ks.DataFrame({
... 'A': [0.0, 3.0, 6.0],
... 'B': [1.0, 4.0, 7.0],
... 'C': [2.0, 5.0, 8.0]})
>>> y = ks.Series([0, 0, 1], name='TARGET')

  • and pandas dataframes:

>>> import pandas as pd
>>> X = pd.DataFrame({
... 'A': [0.0, 3.0, 6.0],
... 'B': [1.0, 4.0, 7.0],
... 'C': [2.0, 5.0, 8.0]})
>>> y = pd.Series([0, 0, 1], name='TARGET')

The result is a 2D NumPy array for X and a 1D NumPy array for y.

>>> X, y = obj.transform(X, y)
>>> X
array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])
>>> y
array([0, 0, 1])
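For the pandas case, the same shapes can be reproduced without gators. The sketch below assumes the conversion is equivalent to calling pandas' own `to_numpy()` on each input; the names `X_np` and `y_np` are illustrative, and the actual ToNumpy implementation may differ, especially for dask and koalas inputs.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({
    'A': [0.0, 3.0, 6.0],
    'B': [1.0, 4.0, 7.0],
    'C': [2.0, 5.0, 8.0]})
y = pd.Series([0, 0, 1], name='TARGET')

# Assumption: plain pandas conversion, used as a stand-in for
# ToNumpy().transform(X, y); this is not the gators implementation.
X_np = X.to_numpy()  # one row per sample, one column per feature
y_np = y.to_numpy()  # one entry per sample

print(X_np.shape)  # (3, 3)
print(y_np.shape)  # (3,)
```

Column names and the series name are not carried over, so any later step has to rely on column order alone.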
transform(X: Union[pd.DataFrame, ks.DataFrame, dd.DataFrame], y: Union[pd.Series, ks.Series, dd.Series]) → Tuple[numpy.ndarray, numpy.ndarray]

Convert the dataframe X and the series y to NumPy arrays.

Parameters
X : DataFrame

Input dataframe.

y : Union[pd.Series, ks.Series, dd.Series]

Target values.

Returns
X : np.ndarray

Array.

y : np.ndarray

Target values.

static check_dataframe(X: Union[pd.DataFrame, ks.DataFrame, dd.DataFrame])

Validate dataframe.

Parameters
X : DataFrame

Input dataframe.
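This page does not list what check_dataframe verifies. As a rough illustration of this kind of validation, the sketch below defines a hypothetical `validate_dataframe` helper that only tests the input type, and only for pandas. It is an assumption for illustration, not the gators implementation, which may apply further checks.

```python
import pandas as pd


def validate_dataframe(X):
    """Hypothetical stand-in for check_dataframe (pandas only)."""
    # Assumption: the only rule enforced here is the input type.
    if not isinstance(X, pd.DataFrame):
        raise TypeError("`X` should be a pandas dataframe.")


# A dataframe passes silently.
validate_dataframe(pd.DataFrame({'A': [0.0, 3.0]}))

# A nested list is rejected.
try:
    validate_dataframe([[0.0, 3.0]])
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```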

static check_target(X: Union[pd.DataFrame, ks.DataFrame, dd.DataFrame], y: Union[pd.Series, ks.Series, dd.Series])

Validate target.

Parameters
X : DataFrame

Dataframe.

y : Series

Target values.
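Because check_target takes both X and y, it can plausibly compare the two. The sketch below defines a hypothetical `validate_target` helper that assumes two rules: y must be a pandas series, and it must have as many entries as X has rows. Both rules are assumptions for illustration; the checks gators actually performs are not stated on this page.

```python
import pandas as pd


def validate_target(X, y):
    """Hypothetical stand-in for check_target (pandas only)."""
    # Assumed rule 1: the target is a series.
    if not isinstance(y, pd.Series):
        raise TypeError("`y` should be a pandas series.")
    # Assumed rule 2: one target value per dataframe row.
    if len(X) != len(y):
        raise ValueError("The lengths of `X` and `y` should match.")


X = pd.DataFrame({'A': [0.0, 3.0, 6.0]})

# A series with matching length passes silently.
validate_target(X, pd.Series([0, 0, 1], name='TARGET'))

# A plain list and a series that is too short are both rejected.
errors = []
for bad_y in ([0, 0, 1], pd.Series([0, 1], name='TARGET')):
    try:
        validate_target(X, bad_y)
    except (TypeError, ValueError) as exc:
        errors.append(type(exc).__name__)

print(errors)
```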