wetsuit.transforms module

Transformers module.

class wetsuit.transforms.H2oFrameTransformer(features: List[Union[str, int]], response: Union[str, int])

Bases: BaseEstimator, TransformerMixin

Transformer class for H2OFrames.

__init__(features: List[Union[str, int]], response: Union[str, int])

Instantiate transformer.

Parameters:
  • features (List[Union[str, int]]) – A list of column names or indices indicating the predictor variables.

  • response (Union[str, int]) – A column name or index indicating the response variable.

fit(X, y) → H2oFrameTransformer

Fit transformer to create H2OFrames.

Parameters:
  • X (Array-like of shape (n_samples, n_features)) – The input samples.

  • y (Array-like of shape (n_samples,) or (n_samples, n_outputs)) – Target values (None for unsupervised transformations).

Returns:

H2oFrameTransformer – The fitted transformer.

transform(X, y) → Tuple[H2OFrame, H2OFrame]

Get transformed H2OFrames.

Parameters:
  • X (Array-like of shape (n_samples, n_features)) – The input samples.

  • y (Array-like of shape (n_samples,) or (n_samples, n_outputs)) – Target values (None for unsupervised transformations).

Returns:

Tuple[h2o.H2OFrame, h2o.H2OFrame] – A tuple of X, y each represented as an H2OFrame.

fit_transform(X, y=None, **fit_params) → Tuple[H2OFrame, H2OFrame]

Fit the transformer, then get the transformed H2OFrames.

Parameters:
  • X (Array-like of shape (n_samples, n_features)) – The input samples.

  • y (Array-like of shape (n_samples,) or (n_samples, n_outputs)) – Target values (None for unsupervised transformations).

Returns:

Tuple[h2o.H2OFrame, h2o.H2OFrame] – A tuple of X, y each represented as an H2OFrame.

inverse_transform(X: H2OFrame, y: H2OFrame) → Tuple[DataFrame, DataFrame]

Convert H2OFrames back to pandas DataFrames.

Parameters:
  • X (h2o.H2OFrame) – H2OFrame representation of original X data.

  • y (h2o.H2OFrame) – H2OFrame representation of original y data.

Returns:

Tuple[pd.DataFrame, pd.DataFrame] – A tuple of X, y each represented as a pandas DataFrame.
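
The following is a minimal sketch of how fit, transform and inverse_transform might be chained, based only on the signatures documented above. The helper names h2o_round_trip and example_with_h2o, the column names and the sample data are hypothetical. It is also an assumption that X is passed as a pandas DataFrame and y as a pandas Series, and that an H2O cluster has to be started with h2o.init() first.

```python
from typing import Any, Tuple


def h2o_round_trip(transformer: Any, X: Any, y: Any) -> Tuple[Any, Any]:
    """Fit, convert X and y to H2OFrames, then convert them back.

    Relies only on the documented contract: fit(X, y) returns the
    transformer, transform(X, y) returns an (X, y) tuple, and
    inverse_transform(X, y) returns an (X, y) tuple.
    """
    X_h2o, y_h2o = transformer.fit(X, y).transform(X, y)
    return transformer.inverse_transform(X_h2o, y_h2o)


def example_with_h2o() -> None:
    """Not called here: needs h2o, pandas, wetsuit and a reachable H2O cluster."""
    import h2o
    import pandas as pd
    from wetsuit.transforms import H2oFrameTransformer

    h2o.init()  # assumption: a cluster must be running before frames are built

    # Hypothetical data and column names, for illustration only.
    X = pd.DataFrame({"age": [31, 45, 28], "income": [40.0, 72.5, 55.1]})
    y = pd.Series([0, 1, 0], name="churned")

    transformer = H2oFrameTransformer(features=["age", "income"], response="churned")
    X_df, y_df = h2o_round_trip(transformer, X, y)
    print(type(X_df), type(y_df))
```

Passing the transformer in as an argument keeps the helper independent of a live cluster, so it can be exercised with any object that offers the same three methods.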

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.
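
As an illustration of the returned mapping, here is a simplified stand-in that mimics the scikit-learn convention of reporting constructor arguments by name. StandInTransformer and the column names are hypothetical, and this is a sketch of the convention rather than scikit-learn's own implementation.

```python
import inspect
from typing import Any, Dict, List, Union


class StandInTransformer:
    """Hypothetical stand-in with the same constructor arguments as H2oFrameTransformer."""

    def __init__(self, features: List[Union[str, int]], response: Union[str, int]):
        # Convention: store each constructor argument unchanged under its own name.
        self.features = features
        self.response = response

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        # Simplified: read the constructor's argument names and look each one
        # up on the instance. `deep` is ignored because this stand-in holds
        # no nested estimators.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}


params = StandInTransformer(features=["age", "income"], response="churned").get_params()
print(params)  # {'features': ['age', 'income'], 'response': 'churned'}
```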

set_output(*, transform=None)

Set output container.

See the scikit-learn gallery example “Introducing the set_output API” for how to use the API.

Parameters:

transform ({“default”, “pandas”}, default=None) – Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • None: Transform configuration is unchanged

Returns:

self (estimator instance) – Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.
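
The <component>__<parameter> naming described above can be sketched as a small helper that separates an estimator's own parameters from those addressed to a nested component. The function split_nested_params and the step name to_h2o are hypothetical. This illustrates the naming convention only and is not scikit-learn's implementation.

```python
from typing import Any, Dict, Tuple


def split_nested_params(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Split parameters into own parameters and per-component parameters."""
    own: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper levels
        # (a__b__c) stay intact for the component to resolve itself.
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"response": "churned", "to_h2o__features": ["age", "income"]}
)
print(own)     # {'response': 'churned'}
print(nested)  # {'to_h2o': {'features': ['age', 'income']}}
```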