Initializing a Dataset with Tabular Data
1. Initializing a Dataset with Tabular Data:
   - Generate random tabular data for multiple scalars.
   - Initialize a dataset with the tabular data.
2. Accessing and Manipulating Data in the Dataset:
   - Retrieve and print the dataset and specific samples.
   - Access and display the value of a particular scalar within a sample.
   - Retrieve tabular data from the dataset based on scalar names.
This example demonstrates how to initialize a dataset with tabular data, access specific samples, retrieve scalar values, and extract tabular data based on scalar names.
# Import required libraries
import numpy as np
# Import the PLAID helper that builds a dataset from tabular data
from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data
# Print dict util
def dprint(name: str, dictio: dict):
    print(name, "{")
    for key, value in dictio.items():
        print(" ", key, ":", value)
    print("}")
Section 1: Initializing a Dataset with Tabular Data
# Generate random tabular data for multiple scalars
nb_scalars = 7
nb_samples = 10
names = [f"scalar_{j}" for j in range(nb_scalars)]
tabular_data = {}
for name in names:
    tabular_data[name] = np.random.randn(nb_samples)
dprint("tabular_data", tabular_data)
tabular_data {
scalar_0 : [-1.10227525 -0.65066961 0.37756 1.5489911 -0.47547023 -0.56480219
0.55999345 1.28572067 -0.95152752 2.16073146]
scalar_1 : [ 0.70992886 2.18653856 -1.51817988 0.93210383 -0.50735673 -0.56203609
0.58955331 -0.13386108 0.02964018 -0.18856828]
scalar_2 : [-0.00268882 -1.1557336 -0.63222735 1.30355936 -2.32795344 -0.67725776
1.47079743 -0.37335718 -0.51192814 -0.94012624]
scalar_3 : [-0.08623664 -1.8070916 -0.38619911 0.46730332 0.71152495 -0.1066194
-2.06537501 -0.07143722 -1.01676683 0.26100002]
scalar_4 : [-0.54031942 1.31588041 -0.77219106 -0.00913094 0.18666597 0.27180463
0.99621116 -0.23592747 -1.05006049 -0.61395605]
scalar_5 : [-1.39250922 1.19145457 1.33696053 1.52522562 -0.73688856 -0.57967617
-0.22313158 -0.90367699 1.93874027 1.1432539 ]
scalar_6 : [ 0.44367466 0.82612495 0.19298794 0.03554421 0.25240948 -0.04781558
-0.25706572 1.96215118 1.15423612 0.04512218]
}
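Because `np.random.randn` draws from NumPy's global random state, the values printed above change on every run. A minimal sketch of a reproducible variant, assuming a seeded `np.random.default_rng` generator is acceptable here (the seed value 0 is an arbitrary choice, not something the original example uses):

```python
import numpy as np

# Assumption: a fixed seed is wanted for reproducibility; 0 is arbitrary.
rng = np.random.default_rng(seed=0)

nb_scalars = 7
nb_samples = 10
names = [f"scalar_{j}" for j in range(nb_scalars)]

# One 1D array of nb_samples standard-normal draws per scalar name
tabular_data = {name: rng.standard_normal(nb_samples) for name in names}

# Re-creating the generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(seed=0)
first_column_again = rng_again.standard_normal(nb_samples)
print(np.array_equal(tabular_data["scalar_0"], first_column_again))
```

The resulting dictionary has the same layout as `tabular_data` above (one 1D array per scalar name), so it can be used in its place.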
# Initialize a dataset with the tabular data
dataset = initialize_dataset_with_tabular_data(tabular_data)
print("Initialized Dataset: ", dataset)
Initialized Dataset: Dataset(10 samples, 7 scalars, 0 time_series, 0 fields)
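The summary above reports 10 samples and 7 scalars, which matches one sample per row and one scalar per dictionary key. A small sanity check on the input dictionary before initialization can be sketched as follows; `check_tabular_data` is a hypothetical helper written for this illustration, not part of PLAID:

```python
import numpy as np


def check_tabular_data(tabular_data: dict) -> tuple:
    """Hypothetical helper: return (nb_samples, nb_scalars) or raise ValueError."""
    if not tabular_data:
        raise ValueError("tabular_data is empty")
    shapes = {name: np.shape(values) for name, values in tabular_data.items()}
    # Every column is expected to be a 1D array
    not_1d = [name for name, shape in shapes.items() if len(shape) != 1]
    if not_1d:
        raise ValueError(f"non-1D columns: {not_1d}")
    # All columns must share the same number of rows (samples)
    lengths = {shape[0] for shape in shapes.values()}
    if len(lengths) != 1:
        raise ValueError(f"columns have different lengths: {shapes}")
    return lengths.pop(), len(tabular_data)


# Stand-in data with the same layout as the example: 7 columns of 10 values
example = {f"scalar_{j}": np.zeros(10) for j in range(7)}
print(check_tabular_data(example))  # (10, 7)
```

Running such a check first turns a ragged input into an explicit error message instead of relying on whatever the initializer does with it.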
Section 2: Accessing and Manipulating Data in the Dataset
# Retrieve a specific sample by index and print it
sample_1 = dataset[1]
print(f"{sample_1 = }")
sample_1 = Sample(path=None, meshes=<plaid.containers.features.meshes.SampleMeshes object at 0x7618b2df1350>, scalars=<plaid.containers.features.scalars.SampleScalars object at 0x7618b1ef3fd0>, time_series=None)
# Access and display the value of a particular scalar within a sample
scalar_value = sample_1.get_scalar("scalar_0")
print("Scalar 'scalar_0' in Sample 1:", scalar_value)
Scalar 'scalar_0' in Sample 1: -0.6506696058107521
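The printed value matches the second entry of `scalar_0` in the table above (-0.65066961), which suggests that sample `i` holds row `i` of each column. The following sketch mimics that correspondence with plain NumPy; `row_as_scalars` is a hypothetical helper for illustration and does not reproduce how PLAID stores samples:

```python
import numpy as np


def row_as_scalars(tabular_data: dict, index: int) -> dict:
    """Hypothetical helper: collect row `index` of every column as a float."""
    return {name: float(values[index]) for name, values in tabular_data.items()}


# First three scalar_0 values copied from the printed table above
table = {"scalar_0": np.array([-1.10227525, -0.65066961, 0.37756])}
print(row_as_scalars(table, 1))
```

Indexing starts at 0, so "Sample 1" is the second row of the table.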
# Retrieve tabular data from the dataset based on scalar names
scalar_names = ["scalar_1", "scalar_3", "scalar_5"]
tabular_data_subset = dataset.get_scalars_to_tabular(scalar_names)
print("Tabular Data Subset for Scalars 1, 3, and 5:")
dprint("tabular_data_subset", tabular_data_subset)
Tabular Data Subset for Scalars 1, 3, and 5:
tabular_data_subset {
scalar_1 : [ 0.70992886 2.18653856 -1.51817988 0.93210383 -0.50735673 -0.56203609
0.58955331 -0.13386108 0.02964018 -0.18856828]
scalar_3 : [-0.08623664 -1.8070916 -0.38619911 0.46730332 0.71152495 -0.1066194
-2.06537501 -0.07143722 -1.01676683 0.26100002]
scalar_5 : [-1.39250922 1.19145457 1.33696053 1.52522562 -0.73688856 -0.57967617
-0.22313158 -0.90367699 1.93874027 1.1432539 ]
}
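Downstream code often expects a 2D array instead of a dictionary of columns. A minimal sketch of that conversion, assuming the subset is a dictionary mapping each scalar name to a 1D array of equal length, as in the printed output; `to_matrix` is a hypothetical helper, not a PLAID function:

```python
import numpy as np


def to_matrix(tabular: dict, names: list) -> np.ndarray:
    """Hypothetical helper: stack the named columns into shape (nb_samples, len(names))."""
    return np.column_stack([np.asarray(tabular[name]) for name in names])


# First two rows of each column, copied from the printed subset above
subset = {
    "scalar_1": np.array([0.70992886, 2.18653856]),
    "scalar_3": np.array([-0.08623664, -1.8070916]),
    "scalar_5": np.array([-1.39250922, 1.19145457]),
}
matrix = to_matrix(subset, ["scalar_1", "scalar_3", "scalar_5"])
print(matrix.shape)  # rows are samples, columns follow the order of `names`
```

Passing the names explicitly fixes the column order, so it does not depend on the dictionary's insertion order.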