{ "cells": [ { "cell_type": "markdown", "id": "5c400357", "metadata": {}, "source": [ "# Metrics Examples\n", "\n", "## Introduction\n", "This notebook demonstrates the use case of the `prepare_datasets`, `compute_metrics`, and `pretty_metrics` functions from the PLAID library. The function is used to compute metrics for comparing reference and predicted datasets based on a given problem definition.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "1438cb32", "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "# Importing Required Libraries\n", "from pathlib import Path\n", "import os\n", "\n", "from plaid import Dataset\n", "from plaid.post.metrics import compute_metrics, prepare_datasets, pretty_metrics\n", "from plaid import ProblemDefinition" ] }, { "cell_type": "code", "execution_count": null, "id": "a4f512a7", "metadata": {}, "outputs": [], "source": [ "# Setting up Directories\n", "try:\n", " dataset_directory = Path(__file__).parent.parent.parent / \"tests\" / \"post\"\n", "except NameError:\n", " dataset_directory = Path(\"..\") / \"..\" / \"..\" / \"..\" / \"tests\" / \"post\"" ] }, { "cell_type": "markdown", "id": "4f162732", "metadata": {}, "source": [ "## Prepare Datasets for comparision\n", "\n", "Assuming you have reference and predicted datasets, and a problem definition, The `prepare_datasets` function is used to obtain output scalars for subsequent analysis." ] }, { "cell_type": "code", "execution_count": null, "id": "3e25b2f4", "metadata": {}, "outputs": [], "source": [ "# Load PLAID datasets and problem metadata objects\n", "ref_ds = Dataset(dataset_directory / \"dataset_ref\")\n", "pred_ds = Dataset(dataset_directory / \"dataset_near_pred\")\n", "problem = ProblemDefinition(dataset_directory / \"problem_definition\")\n", "\n", "# Get output scalars from reference and prediction dataset\n", "ref_out_scalars, pred_out_scalars, out_scalars_names = prepare_datasets(\n", " ref_ds, pred_ds, problem, verbose=True\n", ")\n", "\n", "print(f\"{out_scalars_names = }\\n\")" ] }, { "cell_type": "code", "execution_count": null, "id": "394513c1", "metadata": {}, "outputs": [], "source": [ "# Get output scalar\n", "key = out_scalars_names[0]\n", "\n", "print(f\"KEY '{key}':\\n\")\n", "print(f\"ID{' ' * 5}--REF_out_scalars--{' ' * 7}--PRED_out_scalars--\")\n", "\n", "# Print output scalar values for both datasets\n", "index = 0\n", "for item1, item2 in zip(ref_out_scalars[key], pred_out_scalars[key]):\n", " print(\n", " f\"{str(index).ljust(2)} | {str(item1).ljust(20)} | {str(item2).ljust(20)}\"\n", " )\n", " index += 1" ] }, { "cell_type": "markdown", "id": "ae42cefb", "metadata": {}, "source": [ "## Metrics with File Paths\n", "\n", "Here, we load the datasets and problem metadata from file paths and use the `compute_metrics` function to generate metrics for comparison. The resulting metrics are then printed in a structured dictionary format." ] }, { "cell_type": "code", "execution_count": null, "id": "ce691585", "metadata": {}, "outputs": [], "source": [ "print(\"=== Metrics with file paths ===\")\n", "\n", "# Load PLAID datasets and problem metadata file paths\n", "ref_ds = dataset_directory / \"dataset_ref\"\n", "pred_ds = dataset_directory / \"dataset_near_pred\"\n", "problem = dataset_directory / \"problem_definition\"\n", "\n", "# Using file paths to generate metrics\n", "metrics = compute_metrics(ref_ds, pred_ds, problem, \"first_metrics\")\n", "\n", "import json\n", "\n", "# Print the resulting metrics\n", "print(\"output dictionary =\", json.dumps(metrics, indent=4))" ] }, { "cell_type": "markdown", "id": "de69e0d1", "metadata": {}, "source": [ "## Metrics with PLAID Objects and Verbose\n", "\n", "In this section, we demonstrate how to use PLAID objects directly to generate metrics, and the verbose option is enabled to provide more detailed information during the computation." ] }, { "cell_type": "code", "execution_count": null, "id": "2a62906b", "metadata": {}, "outputs": [], "source": [ "print(\"=== Metrics with PLAID objects and verbose ===\")\n", "\n", "# Load PLAID datasets and problem metadata objects\n", "ref_ds = Dataset(dataset_directory / \"dataset_ref\")\n", "pred_ds = Dataset(dataset_directory / \"dataset_pred\")\n", "problem = ProblemDefinition(dataset_directory / \"problem_definition\")\n", "\n", "# Pretty print activated with verbose mode\n", "metrics = compute_metrics(ref_ds, pred_ds, problem, \"second_metrics\", verbose=True)" ] }, { "cell_type": "markdown", "id": "16b8e53f", "metadata": {}, "source": [ "## Print metrics in a beautiful way\n", "\n", "Finally, in this last section, we showcase a way to print metrics in a more aesthetically pleasing format using the `pretty_metrics` function. The provided dictionary is an example structure for representing metrics, and the function enhances the readability of the metrics presentation. (it is used by `compute_metrics` when verbose mode is activated)" ] }, { "cell_type": "code", "execution_count": null, "id": "e19505a6", "metadata": {}, "outputs": [], "source": [ "dictionary: dict = {\n", " \"RMSE:\": {\n", " \"train\": {\"scalar_1\": 0.12345, \"scalar_2\": 0.54321},\n", " \"test\": {\"scalar_1\": 0.56789, \"scalar_2\": 0.98765},\n", " }\n", "}\n", "\n", "pretty_metrics(dictionary)\n", "\n", "os.remove(\"first_metrics.yaml\")\n", "os.remove(\"second_metrics.yaml\")" ] } ], "metadata": { "jupytext": { "formats": "ipynb,py:percent" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 5 }