Initializing a Dataset with Tabular Data

  1. Initializing a Dataset with Tabular Data:

  • Generate random tabular data for multiple scalars.

  • Initialize a dataset with the tabular data.

  2. Accessing and Manipulating Data in the Dataset:

  • Retrieve and print the dataset and specific samples.

  • Access and display the value of a particular scalar within a sample.

  • Retrieve tabular data from the dataset based on scalar names.

This example demonstrates how to initialize a dataset with tabular data, access specific samples, retrieve scalar values, and extract tabular data based on scalar names.

# Import required libraries
import numpy as np

# Import the PLAID helper that builds a dataset from tabular data
from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data

# Helper that pretty-prints a dictionary, one key per line
def dprint(name: str, dictio: dict):
    print(name, "{")
    for key, value in dictio.items():
        print("    ", key, ":", value)

    print("}")

Section 1: Initializing a Dataset with Tabular Data

# Generate random tabular data for multiple scalars
nb_scalars = 7
nb_samples = 10
names = [f"scalar_{j}" for j in range(nb_scalars)]

tabular_data = {}
for name in names:
    tabular_data[name] = np.random.randn(nb_samples)

dprint("tabular_data", tabular_data)
tabular_data {
     scalar_0 : [-1.10227525 -0.65066961  0.37756     1.5489911  -0.47547023 -0.56480219
  0.55999345  1.28572067 -0.95152752  2.16073146]
     scalar_1 : [ 0.70992886  2.18653856 -1.51817988  0.93210383 -0.50735673 -0.56203609
  0.58955331 -0.13386108  0.02964018 -0.18856828]
     scalar_2 : [-0.00268882 -1.1557336  -0.63222735  1.30355936 -2.32795344 -0.67725776
  1.47079743 -0.37335718 -0.51192814 -0.94012624]
     scalar_3 : [-0.08623664 -1.8070916  -0.38619911  0.46730332  0.71152495 -0.1066194
 -2.06537501 -0.07143722 -1.01676683  0.26100002]
     scalar_4 : [-0.54031942  1.31588041 -0.77219106 -0.00913094  0.18666597  0.27180463
  0.99621116 -0.23592747 -1.05006049 -0.61395605]
     scalar_5 : [-1.39250922  1.19145457  1.33696053  1.52522562 -0.73688856 -0.57967617
 -0.22313158 -0.90367699  1.93874027  1.1432539 ]
     scalar_6 : [ 0.44367466  0.82612495  0.19298794  0.03554421  0.25240948 -0.04781558
 -0.25706572  1.96215118  1.15423612  0.04512218]
}
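Because `np.random.randn` is called without a seed, the values printed above differ on every run. If reproducible data is wanted, one option is NumPy's seeded `Generator` API. The sketch below is only an illustration and does not use PLAID; the helper name `make_tabular_data` and the seed value are assumptions made for this example.

```python
import numpy as np


def make_tabular_data(nb_scalars, nb_samples, seed):
    """Hypothetical helper: build {name: array} with a seeded generator."""
    rng = np.random.default_rng(seed)
    # One standard-normal array of length nb_samples per scalar name
    return {f"scalar_{j}": rng.standard_normal(nb_samples) for j in range(nb_scalars)}


first = make_tabular_data(7, 10, seed=0)
second = make_tabular_data(7, 10, seed=0)

# The same seed yields identical arrays, so reruns are comparable
print(all(np.array_equal(first[k], second[k]) for k in first))  # True
```

The resulting dictionary has the same layout as `tabular_data`, so it could presumably be passed to `initialize_dataset_with_tabular_data` in the same way.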
# Initialize a dataset with the tabular data
dataset = initialize_dataset_with_tabular_data(tabular_data)
print("Initialized Dataset: ", dataset)
Initialized Dataset:  Dataset(10 samples, 7 scalars, 0 time_series, 0 fields)
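The table passed in is column-oriented: each scalar name maps to an array holding one value per sample, which is why seven arrays of length 10 produce 10 samples with 7 scalars each. The following plain-Python sketch illustrates that correspondence only; it is not how PLAID is implemented, and `columns_to_records` is a hypothetical name introduced here.

```python
def columns_to_records(table):
    """Hypothetical helper: turn {name: [v0, v1, ...]} into one dict per sample."""
    lengths = {len(column) for column in table.values()}
    if len(lengths) > 1:
        # Every column must describe the same number of samples
        raise ValueError(f"columns have different lengths: {sorted(lengths)}")
    nb_samples = lengths.pop() if lengths else 0
    return [
        {name: column[i] for name, column in table.items()}
        for i in range(nb_samples)
    ]


table = {"scalar_0": [-1.10, -0.65, 0.38], "scalar_1": [0.71, 2.19, -1.52]}
records = columns_to_records(table)

print(len(records))  # 3
print(records[1])    # {'scalar_0': -0.65, 'scalar_1': 2.19}
```

In this view, the record at index `i` collects the `i`-th entry of every column, which matches the sample-wise access shown in Section 2.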

Section 2: Accessing and Manipulating Data in the Dataset

# Retrieve and print a specific sample (index 1) from the dataset
sample_1 = dataset[1]
print(f"{sample_1 = }")
sample_1 = Sample(path=None, meshes=<plaid.containers.features.meshes.SampleMeshes object at 0x7618b2df1350>, scalars=<plaid.containers.features.scalars.SampleScalars object at 0x7618b1ef3fd0>, time_series=None)
# Access and display the value of a particular scalar within a sample
scalar_value = sample_1.get_scalar("scalar_0")
print("Scalar 'scalar_0' in Sample 1:", scalar_value)
Scalar 'scalar_0' in Sample 1: -0.6506696058107521
# Retrieve tabular data from the dataset based on scalar names
scalar_names = ["scalar_1", "scalar_3", "scalar_5"]
tabular_data_subset = dataset.get_scalars_to_tabular(scalar_names)
print("Tabular Data Subset for Scalars 1, 3, and 5:")
dprint("tabular_data_subset", tabular_data_subset)
Tabular Data Subset for Scalars 1, 3, and 5:
tabular_data_subset {
     scalar_1 : [ 0.70992886  2.18653856 -1.51817988  0.93210383 -0.50735673 -0.56203609
  0.58955331 -0.13386108  0.02964018 -0.18856828]
     scalar_3 : [-0.08623664 -1.8070916  -0.38619911  0.46730332  0.71152495 -0.1066194
 -2.06537501 -0.07143722 -1.01676683  0.26100002]
     scalar_5 : [-1.39250922  1.19145457  1.33696053  1.52522562 -0.73688856 -0.57967617
 -0.22313158 -0.90367699  1.93874027  1.1432539 ]
}
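The subset above contains exactly the requested names, and its arrays match the corresponding columns of the original `tabular_data`. As a rough illustration of that selection on a plain dictionary, the sketch below picks columns by name and reports unknown names explicitly. It does not call PLAID, `select_columns` is a hypothetical helper, and the error behavior shown is a choice made for this example, not a description of `get_scalars_to_tabular`.

```python
def select_columns(table, names):
    """Hypothetical helper: return only the requested columns, in request order."""
    missing = [name for name in names if name not in table]
    if missing:
        # Fail early with the full list of unknown names
        raise KeyError(f"unknown column names: {missing}")
    return {name: table[name] for name in names}


table = {
    "scalar_0": [-1.10, -0.65],
    "scalar_1": [0.71, 2.19],
    "scalar_3": [-0.09, -1.81],
}
subset = select_columns(table, ["scalar_1", "scalar_3"])

print(list(subset))        # ['scalar_1', 'scalar_3']
print(subset["scalar_3"])  # [-0.09, -1.81]
```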