Initializing a Dataset with Tabular Data
1. Initializing a Dataset with Tabular Data:
   - Generate random tabular data for multiple scalars.
   - Initialize a dataset with the tabular data.
2. Accessing and Manipulating Data in the Dataset:
   - Retrieve and print the dataset and specific samples.
   - Access and display the value of a particular scalar within a sample.
   - Retrieve tabular data from the dataset based on scalar names.
This example demonstrates how to initialize a dataset with tabular data, access specific samples, retrieve scalar values, and extract tabular data based on scalar names.
# Import required libraries
import numpy as np
# Import the PLAID helper that builds a dataset from tabular data
from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data
# Print dict util
def dprint(name: str, dictio: dict):
    print(name, "{")
    for key, value in dictio.items():
        print(" ", key, ":", value)
    print("}")
Section 1: Initializing a Dataset with Tabular Data
# Generate random tabular data for multiple scalars
nb_scalars = 7
nb_samples = 10
names = [f"scalar_{j}" for j in range(nb_scalars)]
tabular_data = {}
for name in names:
    tabular_data[name] = np.random.randn(nb_samples)
dprint("tabular_data", tabular_data)
tabular_data {
scalar_0 : [-0.7748112 0.58197174 -0.93305148 0.00520075 -0.43612163 -1.38130933
0.6327033 0.64911131 0.74531583 -0.53086264]
scalar_1 : [-1.14554866 -1.82872028 0.74539072 0.72077254 -1.90526983 -1.01022042
-0.57791925 -0.15221116 -1.78209615 1.28911555]
scalar_2 : [-0.24837586 -0.35781295 -0.55286984 -0.1958495 1.14654056 0.19179581
-0.65507729 -0.43993229 -1.59230989 0.51258886]
scalar_3 : [ 0.0250144 -0.30140477 1.08292446 2.13233325 0.05831823 -1.1326039
-0.11535814 1.1376475 2.97777911 -2.82593091]
scalar_4 : [-0.5426671 0.74681293 1.41993807 0.39546077 -0.33254456 -0.4891201
0.17259896 0.38166388 0.36033517 -0.27012217]
scalar_5 : [-1.46104057 0.13417581 -0.52833346 -0.23916237 0.7163114 -1.0312232
0.92713797 2.05439065 -0.60179969 -0.38935545]
scalar_6 : [-0.5539804 1.71416683 0.24040183 -0.84028251 1.1723665 -0.26635766
-1.00405075 -0.9269535 0.65417661 -0.99195604]
}
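Because the data above comes from the unseeded global NumPy random state, the printed values differ on every run. The following is a minimal sketch, not part of the original example, of generating the same kind of dictionary reproducibly with a seeded generator. The helper name `make_tabular_data` and the seed value are assumptions made for illustration.

```python
import numpy as np


def make_tabular_data(nb_scalars: int, nb_samples: int, seed: int = 0) -> dict:
    """Map scalar names to 1D arrays of standard-normal draws (hypothetical helper)."""
    # A seeded generator yields the same sequence each time it is created.
    rng = np.random.default_rng(seed)
    names = [f"scalar_{j}" for j in range(nb_scalars)]
    return {name: rng.standard_normal(nb_samples) for name in names}


tabular_data_a = make_tabular_data(7, 10, seed=42)
tabular_data_b = make_tabular_data(7, 10, seed=42)

# Two calls with the same seed should produce identical columns.
same = all(
    np.array_equal(tabular_data_a[key], tabular_data_b[key])
    for key in tabular_data_a
)
print("Identical across calls:", same)
```

A dictionary built this way has the same layout as `tabular_data` above (one 1D array per scalar name), so it could presumably be passed to the same initialization call.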
# Initialize a dataset with the tabular data
dataset = initialize_dataset_with_tabular_data(tabular_data)
print("Initialized Dataset: ", dataset)
Initialized Dataset: Dataset(10 samples, 7 scalars, 0 fields)
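The summary above reports 10 samples and 7 scalars, which matches one sample per array entry and one scalar per dictionary key. Under the assumption that every column is expected to have the same length, a quick sanity check before initialization might look like the sketch below. `check_tabular_data` is a hypothetical helper written for this illustration, not a PLAID function.

```python
def check_tabular_data(tabular: dict) -> int:
    """Return the common column length, or raise ValueError if columns disagree."""
    lengths = {name: len(values) for name, values in tabular.items()}
    if not lengths:
        raise ValueError("tabular data is empty")
    distinct = set(lengths.values())
    if len(distinct) != 1:
        raise ValueError(f"columns have different lengths: {lengths}")
    return distinct.pop()


# Small stand-in data; NumPy arrays work the same way because they support len().
example = {"scalar_0": [0.1, 0.2, 0.3], "scalar_1": [1.0, 2.0, 3.0]}
nb_samples_found = check_tabular_data(example)
print("Number of samples:", nb_samples_found)
```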
Section 2: Accessing and Manipulating Data in the Dataset
# Retrieve and print a specific sample (index 1, i.e. the second sample)
sample_1 = dataset[1]
print(f"{sample_1 = }")
sample_1 = Sample(7 globals, 1 timestamp, 0 fields)
# Access and display the value of a particular scalar within a sample
scalar_value = sample_1.get_scalar("scalar_0")
print("Scalar 'scalar_0' in Sample 1:", scalar_value)
Scalar 'scalar_0' in Sample 1: 0.5819717437199943
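The value printed above is the second entry (index 1) of the `scalar_0` array in the input data, which suggests that sample `i` holds the `i`-th entry of every column. The sketch below illustrates that column-to-row mapping in plain Python, independently of PLAID. The function `row_of` and the rounded values are assumptions made for illustration.

```python
def row_of(tabular: dict, index: int) -> dict:
    """Collect the index-th entry of every column into one per-sample dict."""
    return {name: values[index] for name, values in tabular.items()}


# Rounded stand-in values, three samples and two scalars.
columns = {
    "scalar_0": [-0.77, 0.58, -0.93],
    "scalar_1": [-1.15, -1.83, 0.75],
}

row_1 = row_of(columns, 1)
print(row_1)
```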
# Retrieve tabular data from the dataset based on scalar names
scalar_names = ["scalar_1", "scalar_3", "scalar_5"]
tabular_data_subset = dataset.get_scalars_to_tabular(scalar_names)
print("Tabular Data Subset for Scalars 1, 3, and 5:")
dprint("tabular_data_subset", tabular_data_subset)
Tabular Data Subset for Scalars 1, 3, and 5:
tabular_data_subset {
scalar_1 : [-1.14554866 -1.82872028 0.74539072 0.72077254 -1.90526983 -1.01022042
-0.57791925 -0.15221116 -1.78209615 1.28911555]
scalar_3 : [ 0.0250144 -0.30140477 1.08292446 2.13233325 0.05831823 -1.1326039
-0.11535814 1.1376475 2.97777911 -2.82593091]
scalar_5 : [-1.46104057 0.13417581 -0.52833346 -0.23916237 0.7163114 -1.0312232
0.92713797 2.05439065 -0.60179969 -0.38935545]
}
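The subset above is a dictionary with one array per requested scalar. If a single 2D array is more convenient downstream, the columns can be stacked with NumPy. This is a minimal sketch using small stand-in data rather than the actual dataset output; the variable names are assumptions.

```python
import numpy as np

# Stand-in for a dictionary shaped like tabular_data_subset (three samples here).
subset = {
    "scalar_1": np.array([-1.1, -1.8, 0.7]),
    "scalar_3": np.array([0.0, -0.3, 1.1]),
    "scalar_5": np.array([-1.5, 0.1, -0.5]),
}

# Fix the column order explicitly instead of relying on dict iteration order.
ordered_names = ["scalar_1", "scalar_3", "scalar_5"]
matrix = np.column_stack([subset[name] for name in ordered_names])

# Rows correspond to samples, columns to the scalars in ordered_names.
print(matrix.shape)
```

Keeping `ordered_names` alongside the array makes it possible to tell later which column belongs to which scalar.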