plaid.pipelines.sklearn_block_wrappers
Wrapped scikit-learn transformers and regressors for PLAID Dataset compatibility.
Provides adapters for using scikit-learn estimators within the PLAID feature/block system:
- WrappedSklearnTransformer: wraps a TransformerMixin
- WrappedSklearnRegressor: wraps a RegressorMixin
Classes
WrappedSklearnTransformer | Adapter for using a scikit-learn transformer on PLAID Datasets.
WrappedSklearnRegressor | Adapter for using a scikit-learn regressor with PLAID Datasets.
Functions
get_2Darray_from_homogeneous_identifiers | Returns a 2D array from a Dataset and a list of feature identifiers.
Module Contents
- get_2Darray_from_homogeneous_identifiers(dataset: plaid.Dataset, features_identifiers: list[plaid.containers.FeatureIdentifier]) -> plaid.types.Array
Returns a 2D array from a Dataset and a list of feature identifiers.
The function calls dataset.get_tabular_from_homogeneous_identifiers(…), then drops whichever of the second or third dimension has size 1, so that the output is 2D.
- Parameters:
dataset (Dataset) – A Dataset object exposing get_tabular_from_homogeneous_identifiers.
features_identifiers (list[FeatureIdentifier]) – A list of identifiers of the features to extract.
- Returns:
A NumPy array of shape (n_samples, n_features).
- Raises:
AssertionError – If the number of features in the output does not match the identifiers.
ValueError – If both the second and third dimensions have size greater than 1.
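The dimension-dropping step described above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: the helper name squeeze_tabular_to_2d is hypothetical, the input is assumed to be a 3D array, and the order in which the two axes are checked is an assumption.

```python
import numpy as np


def squeeze_tabular_to_2d(tabular: np.ndarray) -> np.ndarray:
    """Hypothetical helper: drop a singleton second or third axis of a 3D array."""
    if tabular.ndim != 3:
        raise ValueError(f"expected a 3D array, got {tabular.ndim}D")
    if tabular.shape[2] == 1:
        # (n_samples, n_features, 1) -> (n_samples, n_features)
        return tabular[:, :, 0]
    if tabular.shape[1] == 1:
        # (n_samples, 1, dim) -> (n_samples, dim)
        return tabular[:, 0, :]
    # Neither axis is a singleton, so there is no unambiguous 2D view.
    raise ValueError("both the second and third dimensions have size greater than 1")


# Two toy inputs: several scalar features, and one feature with four components.
scalars = np.arange(6.0).reshape(3, 2, 1)
field = np.arange(12.0).reshape(3, 1, 4)

print(squeeze_tabular_to_2d(scalars).shape)  # (3, 2)
print(squeeze_tabular_to_2d(field).shape)    # (3, 4)
```

Raising an error when both axes are larger than 1 avoids silently flattening the data into a layout a scikit-learn estimator would misinterpret.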
- class WrappedSklearnTransformer(sklearn_block: plaid.types.SklearnBlock, in_features_identifiers: list[plaid.containers.FeatureIdentifier], out_features_identifiers: list[plaid.containers.FeatureIdentifier] | None = None)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Adapter for using a scikit-learn transformer on PLAID Datasets.
Transforms tabular data extracted from homogeneous feature identifiers, and returns results as a Dataset. Supports forward and inverse transforms.
- Parameters:
sklearn_block (SklearnBlock) – A scikit-learn transformer implementing the fit/transform API.
in_features_identifiers (list[FeatureIdentifier]) – List of feature identifiers to extract input data from.
out_features_identifiers (list[FeatureIdentifier], optional) – List of feature identifiers used for outputs. If None, defaults to in_features_identifiers.
- fit(dataset: plaid.Dataset, _y=None) -> Self
Fits the underlying scikit-learn transformer on selected input features.
- Parameters:
dataset – A Dataset object containing the input features to fit on.
_y – Ignored.
- Returns:
The fitted transformer.
- Return type:
self
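The adapter pattern described for the transformer can be sketched without PLAID itself. In this minimal, hypothetical illustration a dict of 1D arrays stands in for a Dataset and plain strings stand in for feature identifiers; ToyWrappedTransformer, its attribute block_, and the use of clone are assumptions made for the example, not the library's code.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import StandardScaler


class ToyWrappedTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in: a dict of 1D arrays plays the role of a Dataset."""

    def __init__(self, sklearn_block, in_features, out_features=None):
        self.sklearn_block = sklearn_block
        self.in_features = in_features
        self.out_features = out_features

    def _stack(self, dataset, names):
        # Build the (n_samples, n_features) table the scikit-learn block expects.
        return np.column_stack([dataset[name] for name in names])

    def fit(self, dataset, _y=None):
        # Fit a fresh copy so the block passed to the constructor stays unfitted.
        self.block_ = clone(self.sklearn_block).fit(self._stack(dataset, self.in_features))
        return self

    def transform(self, dataset):
        # Outputs default to the input names, mirroring the documented default.
        names = self.out_features or self.in_features
        table = self.block_.transform(self._stack(dataset, self.in_features))
        return {**dataset, **{name: table[:, i] for i, name in enumerate(names)}}

    def inverse_transform(self, dataset):
        # Read from the output names and write back to the input names.
        names = self.out_features or self.in_features
        table = self.block_.inverse_transform(self._stack(dataset, names))
        return {**dataset, **{name: table[:, i] for i, name in enumerate(self.in_features)}}


data = {
    "pressure": np.array([1.0, 2.0, 3.0]),
    "temperature": np.array([10.0, 20.0, 30.0]),
}
wrapper = ToyWrappedTransformer(StandardScaler(), ["pressure", "temperature"]).fit(data)
scaled = wrapper.transform(data)
restored = wrapper.inverse_transform(scaled)
```

Returning a dataset-like object from transform, rather than a bare array, is what lets such adapters be chained while keeping track of which column belongs to which feature.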
- class WrappedSklearnRegressor(sklearn_block: plaid.types.SklearnBlock, in_features_identifiers: list[plaid.containers.FeatureIdentifier], out_features_identifiers: list[plaid.containers.FeatureIdentifier])
Bases: sklearn.base.RegressorMixin, sklearn.base.BaseEstimator
Adapter for using a scikit-learn regressor with PLAID Datasets.
Fits and predicts on tabular arrays extracted from stacked features, while preserving the feature/block structure expected by PLAID.
- Parameters:
sklearn_block – A scikit-learn regressor implementing the fit/predict API.
in_features_identifiers – List of feature identifiers for inputs.
out_features_identifiers – List of feature identifiers for outputs.
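The same idea applies to the regressor: stack the input features into X, stack the output features into y, and map predictions back to named outputs. The sketch below is a hypothetical illustration only; ToyWrappedRegressor, the dict-based dataset, and the string feature names are assumptions, and the predict method shown here is not taken from the documented signature.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import LinearRegression


class ToyWrappedRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in: a dict of 1D arrays plays the role of a Dataset."""

    def __init__(self, sklearn_block, in_features, out_features):
        self.sklearn_block = sklearn_block
        self.in_features = in_features
        self.out_features = out_features

    def _stack(self, dataset, names):
        # One column per feature, one row per sample.
        return np.column_stack([dataset[name] for name in names])

    def fit(self, dataset, _y=None):
        X = self._stack(dataset, self.in_features)
        y = self._stack(dataset, self.out_features)
        self.block_ = clone(self.sklearn_block).fit(X, y)
        return self

    def predict(self, dataset):
        X = self._stack(dataset, self.in_features)
        # Force a 2D result so single-output and multi-output blocks look alike.
        y = np.asarray(self.block_.predict(X)).reshape(len(X), -1)
        return {**dataset, **{name: y[:, i] for i, name in enumerate(self.out_features)}}


# Toy training data following load = 2 * x + 1 exactly.
train = {"x": np.array([0.0, 1.0, 2.0, 3.0]), "load": np.array([1.0, 3.0, 5.0, 7.0])}
model = ToyWrappedRegressor(LinearRegression(), ["x"], ["load"]).fit(train)
predicted = model.predict({"x": np.array([4.0, 5.0])})
```

Unlike the transformer, the regressor has no sensible default for its outputs, which is consistent with out_features_identifiers being a required argument in the signature above.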