plaid.pipelines.sklearn_block_wrappers

Wrapped scikit-learn transformers and regressors for PLAID Dataset compatibility.

Provides adapters to use scikit-learn estimators within the PLAID feature/block system:

  • WrappedSklearnTransformer: wraps a TransformerMixin

  • WrappedSklearnRegressor: wraps a RegressorMixin

Attributes

Classes

WrappedSklearnTransformer

Adapter for using a scikit-learn transformer on PLAID Datasets.

WrappedSklearnRegressor

Adapter for using a scikit-learn regressor on PLAID Datasets.

Functions

get_2Darray_from_homogeneous_identifiers(...)

Returns a 2D array from a Dataset and a list of feature identifiers.

Module Contents

Self
get_2Darray_from_homogeneous_identifiers(dataset: plaid.Dataset, features_identifiers: list[plaid.containers.FeatureIdentifier]) → plaid.types.Array

Returns a 2D array from a Dataset and a list of feature identifiers.

The function calls dataset.get_tabular_from_homogeneous_identifiers(…), then removes either the second or third dimension if it has size 1, so that the output is 2D.

Parameters:
  • dataset (Dataset) – A Dataset object exposing get_tabular_from_homogeneous_identifiers.

  • features_identifiers (list[FeatureIdentifier]) – A list of homogeneous feature identifiers to extract.

Returns:

A NumPy array of shape (n_samples, n_features).

Raises:
  • AssertionError – If the number of features in the output does not match the identifiers.

  • ValueError – If both the second and third dimensions have size greater than 1.
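The reshaping rule described above can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: `squeeze_to_2d` is a hypothetical helper, the input is assumed to be a 3D array, and the order in which the two dimensions are checked is an assumption.

```python
import numpy as np


def squeeze_to_2d(tabular: np.ndarray) -> np.ndarray:
    """Hypothetical helper: drop the size-1 second or third dimension of a 3D array."""
    if tabular.ndim != 3:
        raise ValueError(f"expected a 3D array, got {tabular.ndim} dimension(s)")
    _, dim_1, dim_2 = tabular.shape
    if dim_2 == 1:
        # Third dimension is trivial: keep (first, second).
        return tabular[:, :, 0]
    if dim_1 == 1:
        # Second dimension is trivial: keep (first, third).
        return tabular[:, 0, :]
    # Neither dimension can be dropped without losing data.
    raise ValueError("both the second and third dimensions have size greater than 1")


scalars_like = squeeze_to_2d(np.zeros((4, 3, 1)))  # three single-valued features
field_like = squeeze_to_2d(np.zeros((4, 1, 5)))    # one feature with five values
```

Raising an error when both dimensions are larger than 1 keeps the helper from silently flattening data that has no unambiguous 2D layout.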

class WrappedSklearnTransformer(sklearn_block: plaid.types.SklearnBlock, in_features_identifiers: list[plaid.containers.FeatureIdentifier], out_features_identifiers: list[plaid.containers.FeatureIdentifier] | None = None)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Adapter for using a scikit-learn transformer on PLAID Datasets.

Transforms tabular data extracted from homogeneous feature identifiers, and returns results as a Dataset. Supports forward and inverse transforms.

Parameters:
  • sklearn_block (SklearnBlock) – A scikit-learn Transformer implementing fit/transform APIs.

  • in_features_identifiers (list[FeatureIdentifier]) – List of feature identifiers to extract input data from.

  • out_features_identifiers (list[FeatureIdentifier], optional) – List of feature identifiers used for outputs. If None, defaults to in_features_identifiers.

sklearn_block
in_features_identifiers
out_features_identifiers = None
fit(dataset: plaid.Dataset, _y=None) → Self

Fits the underlying scikit-learn transformer on selected input features.

Parameters:
  • dataset – A Dataset object containing the features to transform.

  • _y – Ignored.

Returns:

The fitted transformer.

Return type:

self

transform(dataset: plaid.Dataset) → plaid.Dataset

Applies the fitted transformer to the selected input features.

Parameters:
  • dataset – A Dataset object to transform.

Returns:

Transformed features wrapped as a new Dataset.

Return type:

Dataset

inverse_transform(dataset: plaid.Dataset) → plaid.Dataset

Applies inverse transformation to the output features.

Parameters:
  • dataset – A Dataset object with transformed output features.

Returns:

Dataset with inverse-transformed features.

Return type:

Dataset
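The fit/transform/inverse_transform flow of this adapter can be sketched without PLAID or scikit-learn installed. Everything below is a stand-in chosen for illustration: a plain dict of 1D arrays plays the role of a Dataset, `ToyScaler` mimics only the method names of a scikit-learn transformer, and `ToyWrappedTransformer` is a hypothetical class. Writing the inverse-transformed columns back under the input identifiers is also an assumption.

```python
import numpy as np


class ToyScaler:
    """Stand-in exposing fit/transform/inverse_transform like a scikit-learn scaler."""

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        self.scale_[self.scale_ == 0] = 1.0  # avoid dividing by zero on constant columns
        return self

    def transform(self, X):
        return (X - self.mean_) / self.scale_

    def inverse_transform(self, X):
        return X * self.scale_ + self.mean_


class ToyWrappedTransformer:
    """Hypothetical adapter: a dict of 1D arrays stands in for a PLAID Dataset."""

    def __init__(self, block, in_ids, out_ids=None):
        self.block = block
        self.in_ids = in_ids
        # Mirror the documented default: outputs fall back to the input identifiers.
        self.out_ids = out_ids if out_ids is not None else in_ids

    @staticmethod
    def _stack(data, ids):
        # Build an (n_samples, n_features) array from the selected columns.
        return np.column_stack([data[name] for name in ids])

    def fit(self, data, _y=None):
        self.block.fit(self._stack(data, self.in_ids))
        return self

    def transform(self, data):
        X = self.block.transform(self._stack(data, self.in_ids))
        return {name: X[:, j] for j, name in enumerate(self.out_ids)}

    def inverse_transform(self, data):
        X = self.block.inverse_transform(self._stack(data, self.out_ids))
        return {name: X[:, j] for j, name in enumerate(self.in_ids)}


data = {
    "pressure": np.array([1.0, 2.0, 3.0]),
    "temperature": np.array([10.0, 20.0, 40.0]),
}
wrapper = ToyWrappedTransformer(
    ToyScaler(),
    in_ids=["pressure", "temperature"],
    out_ids=["pressure_scaled", "temperature_scaled"],
).fit(data)
scaled = wrapper.transform(data)
restored = wrapper.inverse_transform(scaled)
```

In this sketch the round trip through `transform` and `inverse_transform` recovers the original columns, which is the property the forward and inverse methods are meant to provide.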

class WrappedSklearnRegressor(sklearn_block: plaid.types.SklearnBlock, in_features_identifiers: list[plaid.containers.FeatureIdentifier], out_features_identifiers: list[plaid.containers.FeatureIdentifier])

Bases: sklearn.base.RegressorMixin, sklearn.base.BaseEstimator

Adapter for using a scikit-learn regressor on PLAID Datasets.

Fits and predicts on tabular arrays extracted from stacked features, while preserving the feature/block structure expected by PLAID.

Parameters:
  • sklearn_block – A scikit-learn regressor with fit/predict API.

  • in_features_identifiers – List of feature identifiers for inputs.

  • out_features_identifiers – List of feature identifiers for outputs.

sklearn_block
in_features_identifiers
out_features_identifiers
fit(dataset: plaid.Dataset, _y=None) → Self

Fits the wrapped scikit-learn regressor on the stacked input/output data.

Parameters:
  • dataset – A Dataset containing both input and output features.

  • _y – Ignored.

Returns:

The fitted regressor.

Return type:

self

predict(dataset: plaid.Dataset) → plaid.Dataset

Predicts target values using the fitted regressor.

Parameters:
  • dataset – A Dataset with input features.

Returns:

A new Dataset containing predicted target features.

Return type:

Dataset
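The regressor adapter follows the same pattern: stack the input and output columns into 2D arrays for fitting, then map predictions back onto the output identifiers. The sketch below uses stand-ins only. A dict of 1D arrays replaces the Dataset, `ToyLinearRegressor` is a small least-squares model that mimics the fit/predict method names, and `ToyWrappedRegressor` is a hypothetical class, not the library's implementation.

```python
import numpy as np


class ToyLinearRegressor:
    """Stand-in exposing fit/predict like a scikit-learn regressor (least squares)."""

    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([X, np.ones(len(X))])
        return A @ self.coef_


class ToyWrappedRegressor:
    """Hypothetical adapter: a dict of 1D arrays stands in for a PLAID Dataset."""

    def __init__(self, block, in_ids, out_ids):
        self.block = block
        self.in_ids = in_ids
        self.out_ids = out_ids

    @staticmethod
    def _stack(data, ids):
        # Build an (n_samples, n_features) array from the selected columns.
        return np.column_stack([data[name] for name in ids])

    def fit(self, data, _y=None):
        # Inputs and targets both come from the same dataset, so _y is ignored.
        self.block.fit(self._stack(data, self.in_ids), self._stack(data, self.out_ids))
        return self

    def predict(self, data):
        y_pred = self.block.predict(self._stack(data, self.in_ids))
        return {name: y_pred[:, j] for j, name in enumerate(self.out_ids)}


x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.0, 1.0, 0.0])
data = {"x1": x1, "x2": x2, "y": 2.0 * x1 - x2 + 0.5}  # exactly linear target

regressor = ToyWrappedRegressor(ToyLinearRegressor(), ["x1", "x2"], ["y"]).fit(data)
predicted = regressor.predict(data)
```

Because the targets are read from the dataset itself, `fit` takes a single argument and keeps `_y` only for signature compatibility, matching the documented behaviour of ignoring it.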