plaid.pipelines.plaid_blocks
Custom meta-estimators for applying feature-wise and target-wise transformations.
Includes:
TransformedTargetRegressor: transforms the target before fitting and inverts the transformation at prediction.
ColumnTransformer: applies transformers to feature subsets, like scikit-learn’s ColumnTransformer.
Classes
ColumnTransformer | Custom column-wise transformer for PLAID-style datasets.
TransformedTargetRegressor | Meta-estimator that transforms the target before fit and inverses it at predict.
Module Contents
- class ColumnTransformer(plaid_transformers: list[tuple[str, sklearn.base.TransformerMixin | sklearn.pipeline.Pipeline]])
Bases: sklearn.compose.ColumnTransformer
Custom column-wise transformer for PLAID-style datasets.
Similar to scikit-learn’s ColumnTransformer, this class applies a list of transformer blocks to subsets of features, defined by their feature identifiers. Additionally, it preserves a set of remainder features that bypass transformation.
- Parameters:
plaid_transformers – A list of tuples (name, transformer), where each transformer is a TransformerMixin or a Pipeline.
Note
At fit time, a check ensures that no two entries of plaid_transformers share any in_features_identifiers or out_features_identifiers.
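The disjointness check described in the note can be illustrated with a small, library-independent sketch. The `check_disjoint` helper and the `(name, in_ids, out_ids)` tuple layout are hypothetical, and plain strings stand in for feature identifiers; the check performed inside PLAID may be organised differently.

```python
from itertools import combinations


def check_disjoint(blocks):
    """Raise ValueError if any two blocks share an input or output identifier.

    `blocks` is a list of (name, in_ids, out_ids) tuples. This layout is a
    hypothetical stand-in, not the PLAID data structure.
    """
    for (name_a, in_a, out_a), (name_b, in_b, out_b) in combinations(blocks, 2):
        shared_in = set(in_a) & set(in_b)
        shared_out = set(out_a) & set(out_b)
        if shared_in or shared_out:
            raise ValueError(
                f"{name_a!r} and {name_b!r} overlap: "
                f"in={sorted(shared_in)}, out={sorted(shared_out)}"
            )


# Two blocks that touch different features: the check passes silently.
blocks_ok = [
    ("scale_scalars", ["pressure", "angle"], ["pressure", "angle"]),
    ("reduce_fields", ["velocity"], ["velocity_pca"]),
]
check_disjoint(blocks_ok)

# A third block that reuses "pressure" as input triggers the error.
blocks_bad = blocks_ok + [("rescale", ["pressure"], ["pressure_scaled"])]
try:
    check_disjoint(blocks_bad)
    overlap_detected = False
except ValueError:
    overlap_detected = True
```

Checking every pair keeps the error message specific about which two blocks collide, which is usually more helpful than a single global "duplicates found" message.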
- fit(dataset: plaid.Dataset, _y=None) → Self
Fits all transformers on their corresponding feature subsets.
- Parameters:
dataset – A Dataset object or a list of samples.
_y – Ignored. Present for API compatibility.
- Returns:
The fitted transformer.
- Return type:
Self
- transform(dataset: plaid.Dataset) → plaid.Dataset
Applies fitted transformers to feature subsets and merges results.
- Parameters:
dataset – A Dataset object or a list of samples.
- Returns:
A new Dataset with transformed feature blocks, including untransformed remainder features.
- Return type:
plaid.Dataset
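As a rough illustration of the transform-and-merge behaviour, the sketch below applies each block to its own feature subset and copies unclaimed (remainder) features through unchanged. It uses plain dictionaries rather than a plaid.Dataset, and `transform_columns`, `scale_by_ten`, and `negate` are hypothetical names introduced only for this example.

```python
def transform_columns(sample, blocks):
    """Apply each block to its own features and pass the rest through.

    `sample` maps feature names to values; `blocks` is a list of
    (input_names, function) pairs. Both are hypothetical stand-ins.
    """
    handled = set()
    merged = {}
    for names, func in blocks:
        # Each block only sees the features it declares as inputs.
        subset = {name: sample[name] for name in names}
        merged.update(func(subset))
        handled.update(names)
    # Remainder: features that no block claimed are copied unchanged.
    remainder = {k: v for k, v in sample.items() if k not in handled}
    merged.update(remainder)
    return merged


def scale_by_ten(features):
    return {name: value * 10 for name, value in features.items()}


def negate(features):
    return {name: -value for name, value in features.items()}


sample = {"pressure": 1.5, "angle": 2.0, "mach": 0.8}
result = transform_columns(
    sample, [(["pressure"], scale_by_ten), (["angle"], negate)]
)
```

Here `mach` is not claimed by any block, so it appears in `result` with its original value, while the input `sample` itself is left untouched.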
- fit_transform(dataset: plaid.Dataset, y=None) → plaid.Dataset
Fits all transformers and returns the combined transformed dataset.
- Parameters:
dataset – A Dataset object or a list of samples.
y – Ignored. Present for API compatibility.
- Returns:
A new Dataset with transformed features.
- Return type:
plaid.Dataset
- inverse_transform(dataset: plaid.Dataset) → plaid.Dataset
Applies fitted inverse transformers to feature subsets and merges results.
- Parameters:
dataset – A Dataset object or a list of samples.
- Returns:
A new Dataset with inverse transformed feature blocks, including untransformed remainder features.
- Return type:
plaid.Dataset
- class TransformedTargetRegressor(regressor: sklearn.base.RegressorMixin | sklearn.pipeline.Pipeline, transformer: sklearn.base.TransformerMixin | sklearn.pipeline.Pipeline)
Bases: sklearn.base.RegressorMixin, sklearn.base.BaseEstimator
Meta-estimator that transforms the target before fit and inverses it at predict.
This regressor is compatible with custom Dataset objects and supports complex targets, including scalars and fields. It wraps a base regressor and a transformer that is responsible for preprocessing the target space.
- Parameters:
regressor – A regressor implementing fit and predict, following the scikit-learn API.
transformer – A transformer implementing fit, transform, and inverse_transform. Applied to the dataset before fitting the regressor.
- fit(dataset: plaid.Dataset, _y=None) → Self
Fits the transformer and the regressor on the transformed dataset.
- Parameters:
dataset – A Dataset object or a list of sample dictionaries. Input training data.
_y – Ignored. Present for API compatibility.
- Returns:
The fitted estimator.
- Return type:
Self
- predict(dataset: plaid.Dataset) → plaid.Dataset
Predicts target values using the fitted regressor, then applies the inverse transformation.
- Parameters:
dataset – A Dataset object or a list of sample dictionaries. Input data to predict on.
- Returns:
A Dataset containing the inverse-transformed predictions.
- Return type:
plaid.Dataset
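The fit/predict flow (transform the target, train on the transformed target, invert at prediction) can be sketched without any dependency on PLAID or scikit-learn. Everything below is a toy stand-in: `LogTransformer`, `LeastSquaresLine`, and `ToyTransformedTargetRegressor` are hypothetical names, and inputs and targets are passed as separate lists, whereas the class documented here receives a single Dataset.

```python
import math


class LogTransformer:
    """Toy target transformer: natural log forward, exp backward."""

    def fit(self, y):
        return self

    def transform(self, y):
        return [math.log(v) for v in y]

    def inverse_transform(self, y):
        return [math.exp(v) for v in y]


class LeastSquaresLine:
    """Toy 1-D regressor fitted by ordinary least squares."""

    def fit(self, x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxx = sum((a - mean_x) ** 2 for a in x)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        self.slope_ = sxy / sxx
        self.intercept_ = mean_y - self.slope_ * mean_x
        return self

    def predict(self, x):
        return [self.intercept_ + self.slope_ * a for a in x]


class ToyTransformedTargetRegressor:
    def __init__(self, regressor, transformer):
        self.regressor = regressor
        self.transformer = transformer

    def fit(self, x, y):
        # Fit the transformer, then train the regressor on transformed targets.
        y_t = self.transformer.fit(y).transform(y)
        self.regressor.fit(x, y_t)
        return self

    def predict(self, x):
        # Predict in transformed space, then map back to the original space.
        return self.transformer.inverse_transform(self.regressor.predict(x))


x = [0.0, 1.0, 2.0, 3.0]
y = [math.exp(0.5 * a) for a in x]  # exponential targets, linear after log
model = ToyTransformedTargetRegressor(LeastSquaresLine(), LogTransformer()).fit(x, y)
prediction = model.predict([4.0])[0]
```

Because the targets become exactly linear after the log transform, the straight-line regressor extrapolates them well even though it could not fit the raw exponential values.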
- score(dataset_X: plaid.Dataset, dataset_y: plaid.Dataset = None) → float
Computes a normalized root mean squared error (RMSE) score on the transformed targets.
The score is defined as 1 - avg(relative RMSE), averaged over all target features listed in the transformer’s in_features_identifiers. The error computation depends on the feature type:
- For “scalar” features: RMSE normalized by the squared reference value.
- For “field” features: RMSE normalized by the field size and the max-norm of the reference.
- Parameters:
dataset_X – A Dataset object or a list of samples. Input features used for prediction.
dataset_y – A Dataset object or list, optional. Ground-truth targets. If None, dataset_X is used for both input and reference.
- Returns:
A score between -inf and 1. A perfect prediction yields a score of 1.0.
- Return type:
float
- Raises:
ValueError – If an unknown feature type is encountered.
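The sketch below shows one possible reading of the score definition, using plain Python lists. The helper names are hypothetical, and the exact normalisation (in particular, squaring the max-norm and averaging over samples before taking the square root) is an assumption; the library's formula may differ in detail.

```python
import math


def scalar_relative_rmse(preds, refs):
    """Relative RMSE over samples, each squared error divided by ref**2."""
    terms = [((p - r) ** 2) / (r ** 2) for p, r in zip(preds, refs)]
    return math.sqrt(sum(terms) / len(terms))


def field_relative_rmse(preds, refs):
    """Relative RMSE over samples for fields given as lists of floats.

    Each sample's squared error is divided by the field size and by the
    squared max-norm of the reference field (assumed normalisation).
    """
    terms = []
    for pred, ref in zip(preds, refs):
        squared_error = sum((p - r) ** 2 for p, r in zip(pred, ref))
        max_norm = max(abs(r) for r in ref)
        terms.append(squared_error / (len(ref) * max_norm ** 2))
    return math.sqrt(sum(terms) / len(terms))


def score(errors):
    """One minus the average relative RMSE across target features."""
    return 1.0 - sum(errors) / len(errors)


# One scalar feature predicted exactly, one field feature with a small error.
scalar_err = scalar_relative_rmse(preds=[2.0, 4.0], refs=[2.0, 4.0])
field_err = field_relative_rmse(
    preds=[[1.0, 2.0], [3.0, 5.0]],
    refs=[[1.0, 2.0], [3.0, 4.0]],
)
perfect = score([scalar_err, 0.0])
imperfect = score([scalar_err, field_err])
```

Under this reading a perfect prediction gives exactly 1.0, and the score has no lower bound because the relative errors can grow arbitrarily large.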