plaid.viewer.services.plaid_dataset_service
Dataset discovery and sample introspection for the PLAID viewer.
This service owns all PLAID-facing logic used by the viewer:
- Discover datasets under a configured root directory.
- Load a split-wise (dataset_dict, converter_dict) pair through :func:plaid.storage.init_from_disk and cache it for subsequent calls.
- Materialize PLAID :class:plaid.Sample instances via converter.to_plaid(dataset, index), regardless of the underlying backend (hf_datasets, cgns, zarr, ...).
- Summarize sample contents (bases, zones, fields, times, scalars).
- Report basic validation status via :meth:Sample.check_completeness.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService
High-level access to PLAID datasets stored under a root directory.
A dataset is a subdirectory of config.datasets_root that contains a
data/ directory readable by :func:plaid.storage.init_from_disk.
The function returns a dataset_dict and a converter_dict keyed
by split name; the viewer iterates splits and addresses samples by
integer index in range(len(dataset_dict[split])).
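The split-and-index addressing described above can be sketched as follows. This is only an illustration: plain lists stand in for the per-split datasets that init_from_disk would return, and enumerate_sample_refs is a hypothetical helper, not part of the service.

```python
def enumerate_sample_refs(dataset_dict):
    """List every (split, index) pair addressable in a split-keyed mapping."""
    return [
        (split, index)
        for split, dataset in dataset_dict.items()
        for index in range(len(dataset))
    ]


# Stand-in for the real dataset_dict: any sized container per split works here.
dataset_dict = {"train": ["s0", "s1", "s2"], "test": ["s0"]}
refs = enumerate_sample_refs(dataset_dict)
print(refs)
```

Each pair would then be handed to the matching converter as converter.to_plaid(dataset, index).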
Source code in plaid/viewer/services/plaid_dataset_service.py
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.datasets_root
property
Return the currently active datasets root, or None.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.browse_roots
property
Return the sandbox directories for interactive path selection.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.hub_repos
property
Return the list of registered Hugging Face Hub repositories.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.set_datasets_root
Change the active datasets root at runtime.
The new path (when not None) must exist, be a directory, and be
located under one of browse_roots. All per-dataset caches are
invalidated so the next discovery call reflects the new root.
Parameters:
- path (Path | str | None) – The new datasets root. None clears the current root.
Returns:
- Path | None – The resolved new datasets root, or None if cleared.
Raises:
- ValueError – If the path does not exist, is not a directory, or escapes browse_roots.
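The validation rules above (must exist, must be a directory, must stay under one of the browse roots) could look roughly like the sketch below. resolve_in_sandbox is a hypothetical helper written for illustration; the service's actual implementation may differ.

```python
import tempfile
from pathlib import Path


def resolve_in_sandbox(path, browse_roots):
    """Resolve ``path``; reject it unless it is a directory under a browse root."""
    if path is None:
        return None  # clearing the root needs no validation
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_dir():
        raise ValueError(f"{resolved} does not exist or is not a directory")
    roots = [Path(root).resolve() for root in browse_roots]
    if not any(resolved == root or root in resolved.parents for root in roots):
        raise ValueError(f"{resolved} escapes the browse roots")
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    sandbox = Path(tmp) / "sandbox"
    inside = sandbox / "datasets"
    inside.mkdir(parents=True)

    accepted_is_inside = resolve_in_sandbox(inside, [sandbox]) == inside.resolve()
    try:
        # ".." resolves to the parent of the sandbox, so it must be rejected.
        resolve_in_sandbox(sandbox / "..", [sandbox])
        escaped = False
    except ValueError:
        escaped = True
```

Resolving before comparing is what makes ".." segments and symlinks harmless: the check runs on the real location, not on the string the caller supplied.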
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_subdirs
Return immediate subdirectories of path for the file browser.
Each entry is tagged with is_plaid_candidate (True when it
looks like a PLAID dataset, i.e. contains a data/ subdirectory)
so the UI can highlight it. The returned path is always an
absolute resolved path inside browse_roots.
Parameters:
- path (Path | str | None, default: None) – Directory to list. When None, the first browse root is used (typically $HOME).
Returns:
- dict[str, object] – A dict {"path": str, "parent": str | None, "entries": [{"name": str, "path": str, "is_plaid_candidate": bool}, ...]}.
Raises:
- ValueError – If path is not a directory or escapes the sandbox.
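To make the return shape concrete, here is a minimal sketch that builds the same kind of dictionary with pathlib. list_subdirs_sketch is a hypothetical stand-in, and treating parent as None at the browse root is an assumption, not documented behaviour.

```python
import tempfile
from pathlib import Path


def list_subdirs_sketch(path: Path, root: Path) -> dict:
    """Describe the immediate subdirectories of ``path`` for a file browser."""
    path, root = path.resolve(), root.resolve()
    entries = [
        {
            "name": child.name,
            "path": str(child),
            # A directory "looks like" a PLAID dataset when it holds data/.
            "is_plaid_candidate": (child / "data").is_dir(),
        }
        for child in sorted(path.iterdir())
        if child.is_dir()
    ]
    # Assumption: no parent is offered once the browse root is reached.
    parent = None if path == root else str(path.parent)
    return {"path": str(path), "parent": parent, "entries": entries}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "vki_ls59" / "data").mkdir(parents=True)
    (root / "notes").mkdir()
    listing = list_subdirs_sketch(root, root)

flags = {entry["name"]: entry["is_plaid_candidate"] for entry in listing["entries"]}
```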
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_datasets
Return a summary of every dataset available to the viewer.
Local datasets (subdirectories of datasets_root) and registered
Hugging Face Hub repositories (added via :meth:add_hub_dataset)
are both included, in that order.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.add_hub_dataset
Register a Hugging Face Hub dataset to stream from.
The dataset is exposed through :func:plaid.storage.init_streaming_from_hub
and appears in :meth:list_datasets with dataset_id == repo_id.
Parameters:
- repo_id (str) – Hugging Face repository identifier, e.g. "PLAID-lib/VKI-LS59". Must contain a / separator.
Returns:
- str – The normalised repo_id.
Raises:
- ValueError – If repo_id is empty or does not look like a namespace/name pair.
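A namespace/name check of this kind might be sketched as below. What "normalised" means is not spelled out here, so this example assumes it amounts to trimming whitespace and stray slashes; normalise_repo_id is a hypothetical helper.

```python
def normalise_repo_id(repo_id: str) -> str:
    """Trim the identifier and require exactly one namespace/name pair."""
    cleaned = repo_id.strip().strip("/")  # assumed normalisation step
    namespace, separator, name = cleaned.partition("/")
    if not separator or not namespace or not name or "/" in name:
        raise ValueError(f"expected 'namespace/name', got {repo_id!r}")
    return cleaned


def is_rejected(candidate: str) -> bool:
    """Report whether the sketch refuses ``candidate``."""
    try:
        normalise_repo_id(candidate)
    except ValueError:
        return True
    return False


print(normalise_repo_id("  PLAID-lib/VKI-LS59 "))
```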
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.remove_hub_dataset
Unregister a previously added Hugging Face Hub dataset.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_available_features
Return the feature paths offered to the user for filtering.
The viewer only exposes paths that are CGNS fields (i.e. what
:func:plaid.containers.utils.get_feature_details_from_path
classifies as type == "field"). Globals, coordinates,
element connectivities, boundary conditions, etc. are hidden
because they are not what the user means when they want to
"filter the displayed features" in a 3D viewer.
Paths ending in _times (time-series bookkeeping duplicates
of a field, e.g. Base_.../FlowSolution/Pressure_times) are
also filtered out: they are artefacts of the temporal storage
layout, not distinct physical quantities the user would want to
toggle.
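The two filtering rules above (keep only type == "field", drop paths ending in _times) can be illustrated with a small stand-alone sketch. The mapping below replaces the real classifier, and every type label other than "field" is a placeholder invented for this example.

```python
def visible_field_paths(feature_types: dict) -> list:
    """Keep paths classified as fields, minus the ``_times`` bookkeeping twins."""
    return sorted(
        path
        for path, kind in feature_types.items()
        if kind == "field" and not path.endswith("_times")
    )


# Stand-in for per-path classification results (labels are illustrative).
feature_types = {
    "Base_2_2/Zone/FlowSolution/Pressure": "field",
    "Base_2_2/Zone/FlowSolution/Pressure_times": "field",
    "Base_2_2/Zone/GridCoordinates/CoordinateX": "coordinate",
    "Global/angle": "global",
}
visible = visible_field_paths(feature_types)
print(visible)
```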
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.get_features
Return the active feature filter for dataset_id.
None means "no filter": every feature is loaded (default
behaviour). An explicit empty list means "no feature selected".
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_available_bases
Return the unique CGNS base prefixes (e.g. Base_2_2).
Derived from the constant/variable feature catalogues, so the
list is available before any sample has been loaded - which
lets the trame UI populate the "Base" toggle as soon as a
dataset is selected. The synthetic Globals base is
excluded; it is exposed separately by
:meth:list_globals_paths and surfaced as its own toggle in
the side drawer.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_base_paths
Return every PLAID feature path declared under base.
Used by the trame "Base" toggle to translate a base pick into
a concrete feature list passed to :meth:set_features. The
returned paths span both constant and variable schemas, so
:meth:Converter.to_plaid can rebuild the mesh of that base
(and any field declared at the dataset level) without pulling
in unrelated bases.
Base_X_Y and Base_X_Y_times paths are both returned
when present so the time-series bookkeeping companion of the
chosen base is loaded along with it.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_globals_paths
Return every PLAID feature path that lives under Global/.
Used by the trame UI to translate the "Globals" toggle into a
concrete feature list passed to :meth:set_features. PLAID
identifies sample-level scalars / tensors with a singular
Global base (and a companion Global_times base for time
series), so we accept exactly those two prefixes.
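A prefix test along these lines could be sketched as follows. The exact matching rule used by the service is not shown here; this example assumes a path qualifies when it starts with Global/ or Global_times/, and globals_paths is a hypothetical helper.

```python
GLOBAL_PREFIXES = ("Global/", "Global_times/")  # assumed prefix spelling


def globals_paths(all_paths):
    """Select the paths that sit under one of the two accepted base prefixes."""
    # str.startswith accepts a tuple, so both prefixes are tested at once.
    return [path for path in all_paths if path.startswith(GLOBAL_PREFIXES)]


catalogue = [
    "Global/angle",
    "Global_times/angle",
    "Globals/other",  # different base name, not matched
    "Base_2_2/Zone/FlowSolution/Pressure",
]
selected = globals_paths(catalogue)
print(selected)
```

Keeping the trailing slash in each prefix is what prevents a look-alike base such as Globals from matching.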
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.set_features
Set (or clear) the active feature filter for dataset_id.
Only the user-visible field paths (those returned by
:meth:list_available_features) are stored. Geometric supports
(coordinates, element connectivities, boundary conditions,
GridLocation metadata, _times bookkeeping paths, ...)
required to render the selected fields are handled transparently
by :meth:Converter.to_plaid, which runs
:func:~plaid.utils.cgns_helper.update_features_for_CGNS_compatibility
internally against its own per-split
constant_features / variable_features catalogues. We
therefore never pre-expand the selection here - doing so would
use the dataset-wide (union) catalogue and, on splits whose
data does not contain the selected fields, would hand PLAID a
list of coordinates without the fields that justify them and
trigger Missing features in dataset/converter in the CGNS
expander.
For disk-backed datasets the filter is applied on every call to
:meth:Converter.to_plaid during :meth:load_sample. For
streaming (Hugging Face Hub) datasets it is injected into
:func:plaid.storage.init_streaming_from_hub before any
sample is consumed; we therefore invalidate the cached
(dataset_dict, converter_dict) and any open streaming cursors
so the next :meth:_open call rebuilds them with the new
feature list.
Parameters:
- dataset_id (str) – Target dataset identifier.
- features (list[str] | None) – Field paths to keep (subset of :meth:list_available_features), or None to clear the filter and load every feature.
Returns:
- list[str] | None – The normalised, deduplicated feature list (None when no filter is active).
Raises:
- ValueError – If features contains paths not declared in the dataset metadata.
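The normalise, deduplicate, and validate contract can be sketched without any PLAID dependency. normalise_features is a hypothetical helper; preserving first-seen order is an assumption of this example, not something the reference states.

```python
def normalise_features(features, declared):
    """Deduplicate a feature selection and reject undeclared paths."""
    if features is None:
        return None  # None clears the filter: every feature is loaded
    unique = list(dict.fromkeys(features))  # drops repeats, keeps order
    unknown = [path for path in unique if path not in declared]
    if unknown:
        raise ValueError(f"paths not declared in dataset metadata: {unknown}")
    return unique


declared = {"Base/Zone/FlowSolution/Pressure", "Base/Zone/FlowSolution/Mach"}
picked = normalise_features(
    ["Base/Zone/FlowSolution/Mach", "Base/Zone/FlowSolution/Mach"], declared
)
try:
    normalise_features(["Base/Zone/FlowSolution/Unknown"], declared)
    rejected = False
except ValueError:
    rejected = True
```

Note that an empty list passes through unchanged, which keeps the distinction between "no filter" (None) and "no feature selected" ([]).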
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.is_streaming
Return True when dataset_id is a Hugging Face Hub stream.
Streaming datasets have no __len__ on their splits and must be
navigated forward-only through :meth:advance_stream_cursor /
:meth:reset_stream_cursor rather than indexed.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.get_dataset
Return detailed information about a single dataset.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_samples
Return every sample reference available in a dataset.
For disk-backed datasets, sample ids are the zero-based integer
indices used with converter.to_plaid(dataset, index). For
streaming datasets (Hugging Face Hub), each split contributes a
single reference whose sample_id is the
:data:STREAM_CURSOR_ID sentinel; the actual sample is obtained
by advancing the per-split cursor with
:meth:advance_stream_cursor.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.stream_cursor_position
Return the current forward position of a streaming cursor.
Returns -1 before the first call to :meth:advance_stream_cursor.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.advance_stream_cursor
Consume the next record from the stream and return its ref.
The returned :class:SampleRef always carries the
:data:STREAM_CURSOR_ID sentinel in its sample_id; the
underlying record is cached on the service so a subsequent
:meth:load_sample call returns the freshly fetched sample.
Raises:
- StopIteration – If the underlying stream is exhausted.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.reset_stream_cursor
Rebuild a fresh iterator for (dataset_id, split).
The cached record is discarded and the position reset to -1
so the next :meth:advance_stream_cursor call yields the first
sample again.
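The cursor semantics described for the streaming methods (position -1 before the first advance, forward-only consumption, StopIteration on exhaustion, reset back to the start) can be modelled with a small class. StreamCursor below is a hypothetical illustration, not the service's internal type, and a plain list stands in for a Hub stream.

```python
class StreamCursor:
    """Forward-only cursor over a re-creatable iterable."""

    def __init__(self, make_iterable):
        self._make_iterable = make_iterable  # factory used to rebuild the stream
        self.reset()

    def reset(self):
        """Rebuild the iterator, drop the cached record, rewind to -1."""
        self._iterator = iter(self._make_iterable())
        self.position = -1
        self.current = None

    def advance(self):
        """Consume one record; propagates StopIteration when exhausted."""
        record = next(self._iterator)
        self.position += 1
        self.current = record  # cached so a later load can return it
        return record


cursor = StreamCursor(lambda: ["sample_a", "sample_b"])
start = cursor.position
first = cursor.advance()
second = cursor.advance()
try:
    cursor.advance()
    exhausted = False
except StopIteration:
    exhausted = True
cursor.reset()
after_reset = (cursor.position, cursor.advance())
```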
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.load_sample
Return a PLAID :class:plaid.Sample for the given reference.
Uses converter.to_plaid(dataset, index) to rebuild the sample
from whatever backend store (hf_datasets, cgns, zarr) is in use.
For random-access (non-streaming) samples, results are memoised
in a small LRU keyed on
(dataset_id, split_key, sample_id, features_tuple) so that
repeated calls within the same UI interaction (summary, globals,
non-visual bases, paraview artifact build, playback frames, ...)
only incur a single Converter.to_plaid decode. The cache is
invalidated whenever the active feature filter, datasets root,
or hub registration changes. Streaming samples bypass the cache
because their sample_id is the constant
:data:STREAM_CURSOR_ID sentinel.
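The memoisation scheme can be pictured with a small OrderedDict-based LRU keyed on the same four-part tuple. SampleLRU and fake_decode are hypothetical names used only for this sketch; the real cache size and eviction details are not documented here.

```python
from collections import OrderedDict


class SampleLRU:
    """Tiny LRU keyed on (dataset_id, split, sample_id, features_tuple)."""

    def __init__(self, decode, maxsize=4):
        self._decode = decode
        self._maxsize = maxsize
        self._entries = OrderedDict()

    def get(self, dataset_id, split, sample_id, features):
        # Lists are unhashable, so the filter is frozen into a tuple.
        frozen = None if features is None else tuple(features)
        key = (dataset_id, split, sample_id, frozen)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = self._decode(dataset_id, split, sample_id, features)
        self._entries[key] = value
        if len(self._entries) > self._maxsize:
            self._entries.popitem(last=False)  # evict least recently used
        return value

    def clear(self):
        """Invalidate everything, e.g. after a filter or root change."""
        self._entries.clear()


decode_calls = []


def fake_decode(dataset_id, split, sample_id, features):
    """Stand-in for the expensive decode step; records each invocation."""
    decode_calls.append((dataset_id, split, sample_id, features))
    return {"sample_id": sample_id, "features": features}


cache = SampleLRU(fake_decode, maxsize=2)
first = cache.get("ds", "train", 0, None)
second = cache.get("ds", "train", 0, None)        # served from the cache
cache.get("ds", "train", 0, ["Pressure"])         # different filter: decoded
cache.clear()
cache.get("ds", "train", 0, None)                 # decoded again after clear
```

Including the feature tuple in the key is what lets two different filters on the same sample coexist without returning a stale decode.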
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.get_sample_summary
Return a minimal summary of the PLAID sample.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.list_time_values
Return the sorted list of time values available for a sample.
Thin wrapper around :meth:plaid.Sample.get_all_time_values
that always returns a list[float] (it may be empty for static
samples).
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.describe_globals
Return PLAID global scalars/tensors reported by the sample.
Uses :meth:plaid.Sample.get_global_names to enumerate globals
and :meth:plaid.Sample.get_global to fetch each value, so only
the "real" globals exposed by PLAID's API are reported. The CGNS
bookkeeping arrays IterationValues and TimeValues (which
describe time steps, not physical scalars) are filtered out.
Parameters:
- ref (SampleRef) – The sample to inspect.
- time (float | None, default: None) – Optional time value; when None, the sample's first available time (or the static value) is used.
Returns:
- list[dict[str, object]] – A list of {"name": str, "shape": list[int], "dtype": str, "preview": str | None} descriptors, one per global.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.describe_globals_all_times
Return globals descriptors for every time step of a sample.
Performs a single :meth:load_sample call (which goes through
the in-memory LRU) and then iterates over every time value
exposed by :meth:plaid.Sample.get_all_time_values,
building a {time: [globals_entries]} snapshot. The trame
playback loop can then refresh the globals panel by indexing
into this dict instead of triggering a fresh service call (and
risking a per-frame decode on backends that bypass the cache).
Parameters:
- ref (SampleRef) – The sample to inspect.
Returns:
- tuple[list[dict[str, object]], dict[float, list[dict[str, object]]]] – A pair (static, by_time) where static is the time-less globals listing (used as a fallback when the sample has no time axis or the requested time is missing) and by_time maps each available float time value to its globals descriptors.
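A playback loop consuming the (static, by_time) pair might resolve each frame as in the sketch below. globals_for_time is a hypothetical helper and the descriptor values are made up for illustration.

```python
def globals_for_time(static, by_time, time):
    """Pick the descriptors for ``time``, falling back to the static listing."""
    if time is None:
        return static
    # A missing time value falls back to the time-less listing.
    return by_time.get(time, static)


static = [{"name": "angle", "shape": [1], "dtype": "float64", "preview": "0.5"}]
by_time = {
    0.0: [{"name": "drag", "shape": [1], "dtype": "float64", "preview": "1.0"}],
    0.1: [{"name": "drag", "shape": [1], "dtype": "float64", "preview": "1.2"}],
}

frame = globals_for_time(static, by_time, 0.1)
missing = globals_for_time(static, by_time, 9.9)
print(frame[0]["preview"], missing[0]["name"])
```

Because the lookup is a plain dictionary access, refreshing the globals panel per frame costs no extra sample decode.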
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.describe_non_visual_bases
Return data arrays of CGNS bases that carry no zones.
Some datasets store auxiliary tensors (constants, global reference
values, look-up tables, ...) inside a CGNS base that has no
Zone_t children, so VTK cannot render them as geometry. This
method returns, for each zone-less base, a list of descriptors
{"name": str, "shape": list[int], "dtype": str,
"preview": str | None} suitable for display in the viewer.
Parameters:
- ref (SampleRef) – The sample to inspect.
Returns:
- dict[str, list[dict[str, object]]] – A mapping from base name to a list of data-array descriptors. Bases that do contain zones are omitted.
plaid.viewer.services.plaid_dataset_service.PlaidDatasetService.get_sample_validation
Check basic sample completeness using PLAID's built-in validator.