plaid.storage.cgns.writer
=========================

.. py:module:: plaid.storage.cgns.writer

.. autoapi-nested-parse::

   CGNS dataset writer module.

   This module provides functionality for writing datasets in CGNS format for the PLAID
   library. It includes utilities for generating datasets from sample generators, saving to
   disk, uploading to Hugging Face Hub, and configuring dataset cards.


Functions
---------

.. autoapisummary::

   plaid.storage.cgns.writer.generate_datasetdict_to_disk
   plaid.storage.cgns.writer.push_local_datasetdict_to_hub
   plaid.storage.cgns.writer.configure_dataset_card


Module Contents
---------------

.. py:function:: generate_datasetdict_to_disk(output_folder: Union[str, pathlib.Path], generators: dict[str, Callable[Ellipsis, Generator[plaid.Sample, None, None]]], variable_schema: Optional[dict[str, dict]] = None, gen_kwargs: Optional[dict[str, dict[str, list[plaid.types.IndexType]]]] = None, num_proc: int = 1, verbose: bool = False) -> None

   Generates and saves a dataset to disk in CGNS format.

   :param output_folder: Base directory to save the dataset.
   :param generators: Dict of split generators.
   :param variable_schema: Unused variable schema.
   :param gen_kwargs: Optional generator kwargs for parallel processing.
   :param num_proc: Number of processes.
   :param verbose: Whether to show progress.


.. py:function:: push_local_datasetdict_to_hub(repo_id: str, local_dir: Union[str, pathlib.Path], num_workers: int = 1) -> None

   Pushes a local dataset directory to Hugging Face Hub.

   :param repo_id: The repository ID.
   :param local_dir: Local directory path.
   :param num_workers: Number of upload workers.


.. py:function:: configure_dataset_card(repo_id: str, infos: dict[str, dict[str, str]], local_dir: Union[str, pathlib.Path], variable_schema: Optional[dict] = None, viewer: Optional[bool] = None, pretty_name: Optional[str] = None, dataset_long_description: Optional[str] = None, illustration_urls: Optional[list[str]] = None, arxiv_paper_urls: Optional[list[str]] = None) -> None

   Configures and pushes a dataset card to Hugging Face Hub for a CGNS backend dataset.

   This function generates a dataset card in YAML format with metadata, features, splits
   information, and usage examples. It automatically detects splits and sample counts from
   the local directory structure, then pushes the card to the specified Hugging Face
   repository.

   :param repo_id: The Hugging Face repository ID where the dataset card will be pushed.
   :type repo_id: str
   :param infos: Dictionary containing dataset metadata, including legal information like license.
   :type infos: dict[str, dict[str, str]]
   :param local_dir: Path to the local directory containing the dataset files, expected to have a 'data' subdirectory with split folders.
   :type local_dir: Union[str, Path]
   :param variable_schema: Schema describing the variables/features in the dataset, used to generate the features section in the card.
   :type variable_schema: Optional[dict]
   :param viewer: Unused parameter for viewer configuration.
   :type viewer: Optional[bool]
   :param pretty_name: A human-readable name for the dataset to display in the card.
   :type pretty_name: Optional[str]
   :param dataset_long_description: A detailed description of the dataset to include in the card.
   :type dataset_long_description: Optional[str]
   :param illustration_urls: List of URLs to images that illustrate the dataset, displayed in the card.
   :type illustration_urls: Optional[list[str]]
   :param arxiv_paper_urls: List of arXiv URLs for papers related to the dataset, included as sources.
   :type arxiv_paper_urls: Optional[list[str]]

   :returns: This function does not return a value; it pushes the dataset card directly to Hugging Face Hub.
   :rtype: None
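
The shapes of the ``generators`` and ``gen_kwargs`` arguments of ``generate_datasetdict_to_disk`` can be illustrated with a small, PLAID-free sketch. Everything below is an assumption made for illustration: ``Sample`` is a plain ``dict`` standing in for ``plaid.Sample``, the ``ids`` keyword name is hypothetical, and the ``shard`` and ``collect`` helpers only show one plausible way list-valued kwargs could be divided among ``num_proc`` workers. How PLAID actually consumes ``gen_kwargs`` is not stated in this reference.

```python
from typing import Any, Callable, Generator

# Stand-in for plaid.Sample so the sketch runs without PLAID installed (hypothetical).
Sample = dict[str, Any]


def make_split_generator(scale: float) -> Callable[..., Generator[Sample, None, None]]:
    """Return a generator function that yields one stand-in sample per id."""

    def generator(ids: list[int]) -> Generator[Sample, None, None]:
        for i in ids:
            yield {"id": i, "value": scale * i}

    return generator


# One generator function per split, keyed by split name.
generators = {
    "train": make_split_generator(1.0),
    "test": make_split_generator(2.0),
}

# Per-split keyword arguments; each value is a list of indices.
# The keyword name "ids" is an assumption chosen for this sketch.
gen_kwargs = {
    "train": {"ids": [0, 1, 2, 3]},
    "test": {"ids": [4, 5]},
}


def shard(indices: list[int], num_shards: int) -> list[list[int]]:
    """Split a list of indices into at most num_shards contiguous chunks."""
    if not indices:
        return []
    size = -(-len(indices) // num_shards)  # ceiling division
    return [indices[k : k + size] for k in range(0, len(indices), size)]


def collect(split: str, num_proc: int = 1) -> list[Sample]:
    """Run a split's generator over each shard in turn (sequentially here)."""
    samples: list[Sample] = []
    for chunk in shard(gen_kwargs[split]["ids"], num_proc):
        samples.extend(generators[split](ids=chunk))
    return samples
```

In a real call, the same two dictionaries would be passed as ``generators=`` and ``gen_kwargs=``, with generator functions that yield ``plaid.Sample`` objects instead of dictionaries.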
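
The split detection that ``configure_dataset_card`` performs on ``local_dir`` can be pictured with the sketch below. It is not the library's implementation: it assumes only the documented ``data`` subdirectory with one folder per split, and it further assumes, purely for illustration, that every direct child of a split folder is one sample. The helper names ``count_samples_per_split`` and ``build_demo_layout`` and the ``sample_<i>`` folder naming are hypothetical.

```python
import tempfile
from pathlib import Path


def count_samples_per_split(local_dir):
    """Map each split folder under <local_dir>/data to its number of entries."""
    data_dir = Path(local_dir) / "data"
    counts = {}
    for split_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        # Assumption: every direct child of a split folder is one sample.
        counts[split_dir.name] = sum(1 for _ in split_dir.iterdir())
    return counts


def build_demo_layout(root, sizes):
    """Create a throwaway data/<split>/sample_<i> tree for demonstration."""
    for split, n in sizes.items():
        for i in range(n):
            (Path(root) / "data" / split / f"sample_{i:09d}").mkdir(parents=True)


# Build a temporary layout with two splits and count what is on disk.
with tempfile.TemporaryDirectory() as tmp:
    build_demo_layout(tmp, {"train": 3, "test": 2})
    demo_counts = count_samples_per_split(tmp)
```

Running a check like this before pushing a card is a cheap way to confirm that the local directory contains the splits you expect.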