diff --git a/docs/user_guide.rst b/docs/user_guide.rst
index ba6cf4f5..695a4c58 100644
--- a/docs/user_guide.rst
+++ b/docs/user_guide.rst
@@ -49,4 +49,4 @@ the RegionProcessor and validated using DataStructureDefinition.
    user_guide/model-registration
    user_guide/config
    user_guide/local-usage
-   user_guide/data-validation
+   user_guide/validation
diff --git a/docs/user_guide/validation.rst b/docs/user_guide/validation.rst
new file mode 100644
index 00000000..83b23e91
--- /dev/null
+++ b/docs/user_guide/validation.rst
@@ -0,0 +1,17 @@
+.. _validation:
+
+.. currentmodule:: nomenclature
+
+Data validation
+===============
+
+The **nomenclature** package allows users to validate IAMC data in several ways.
+
+For this, validation requirements and criteria can be specified in YAML configuration
+files.
+
+.. toctree::
+   :maxdepth: 1
+
+   validation/data-validation
+   validation/required-data-validation
diff --git a/docs/user_guide/data-validation.rst b/docs/user_guide/validation/data-validation.rst
similarity index 62%
rename from docs/user_guide/data-validation.rst
rename to docs/user_guide/validation/data-validation.rst
index 90a34eb7..0a5895b8 100644
--- a/docs/user_guide/data-validation.rst
+++ b/docs/user_guide/validation/data-validation.rst
@@ -5,40 +5,6 @@
 Data validation
 ===============
 
-The **nomenclature** package allows users to validate IAMC data in several ways.
-
-For this, validation requirements and criteria can be specified in YAML configuration
-files.
-
-Required data validation
-------------------------
-
-**Required data validation** checks if certain models, variables, regions and/or
-periods of time are covered in the datapoints.
-
-For this, a configuration file specifies the model(s) and dimension(s) expected
-in the dataset. These are ``variable``, ``region`` and/or ``year``.
-Alternatively, instead of using ``variable``, it is possible to declare measurands,
-which jointly specify variables and units.
-
-.. code:: yaml
-
-  description: Required variables for running MAGICC
-  model: model_a
-  required_data:
-    - measurand:
-        Emissions|CO2:
-          unit: Mt CO2/yr
-      region: World
-      year: [2020, 2030, 2040, 2050]
-
-In the example above, for *model_a*, the dataset must include datapoints of the
-variable *Emissions|CO2* (measured in *Mt CO2/yr*), in the region *World*, for the
-years 2020, 2030, 2040 and 2050.
-
-Data validation
----------------
-
 **Data validation** checks if data values are within specified ranges.
 
 Consider the example below:
@@ -87,15 +53,12 @@ a separate criteria item) and slower to process.
 Standard usage
 --------------
 
-Run the following in a Python script to check that an IAMC dataset has valid
-(required) data.
+Run the following in a Python script to check that an IAMC dataset has valid data.
 
 .. code-block:: python
 
-  from nomenclature import RequiredDataValidator
   from nomenclature.processor import DataValidator
 
   # ...setting directory/file paths and loading dataset
 
-  RequiredDataValidator.from_file(req_data_yaml).apply(df)
   DataValidator.from_file(data_val_yaml).apply(df)
diff --git a/docs/user_guide/validation/required-data-validation.rst b/docs/user_guide/validation/required-data-validation.rst
new file mode 100644
index 00000000..3cd47f54
--- /dev/null
+++ b/docs/user_guide/validation/required-data-validation.rst
@@ -0,0 +1,43 @@
+.. _required-data-validation:
+
+.. currentmodule:: nomenclature
+
+Required data validation
+========================
+
+**Required data validation** checks if certain models, variables, regions and/or
+periods of time are covered in the datapoints.
+
+For this, a configuration file specifies the model(s) and dimension(s) expected
+in the dataset. These are ``variable``, ``region`` and/or ``year``.
+Alternatively, instead of using ``variable``, it is possible to declare measurands,
+which jointly specify variables and units.
+
+.. code:: yaml
+
+  description: Required variables for running MAGICC
+  model: model_a
+  required_data:
+    - measurand:
+        Emissions|CO2:
+          unit: Mt CO2/yr
+      region: World
+      year: [2020, 2030, 2040, 2050]
+
+In the example above, for *model_a*, the dataset must include datapoints of the
+variable *Emissions|CO2* (measured in *Mt CO2/yr*), in the region *World*, for the
+years 2020, 2030, 2040 and 2050.
+
+Standard usage
+--------------
+
+Run the following in a Python script to check that an IAMC dataset has valid
+required data.
+
+.. code-block:: python
+
+  from nomenclature import RequiredDataValidator
+
+  # ...setting directory/file paths and loading dataset
+
+  RequiredDataValidator.from_file(req_data_yaml).apply(df)
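
Reviewer note: the coverage check that required-data-validation.rst describes can be illustrated with a small, dependency-free sketch. It is not how **nomenclature** implements ``RequiredDataValidator``; the function name ``find_missing_required`` and the row layout (plain dictionaries with ``model``, ``variable``, ``unit``, ``region`` and ``year`` keys) are assumptions made only for this illustration. It mirrors the YAML example from the patch: one measurand, one region, four years.

```python
from itertools import product


# Hypothetical, simplified stand-in for the required-data check described in
# the docs; this is NOT the nomenclature implementation.
def find_missing_required(rows, model, measurands, regions, years):
    """Return required (variable, unit, region, year) combinations that are
    absent from `rows` for the given model."""
    # Collect every combination the model actually reports.
    present = {
        (r["variable"], r["unit"], r["region"], r["year"])
        for r in rows
        if r["model"] == model
    }
    # Expand the requirement into the full cross product of dimensions.
    required = [
        (variable, unit, region, year)
        for (variable, unit), region, year in product(
            measurands.items(), regions, years
        )
    ]
    return [combo for combo in required if combo not in present]


# Toy dataset: model_a reports Emissions|CO2 for 2020-2040 but not 2050.
rows = [
    {
        "model": "model_a",
        "variable": "Emissions|CO2",
        "unit": "Mt CO2/yr",
        "region": "World",
        "year": year,
    }
    for year in (2020, 2030, 2040)
]

missing = find_missing_required(
    rows,
    model="model_a",
    measurands={"Emissions|CO2": "Mt CO2/yr"},
    regions=["World"],
    years=[2020, 2030, 2040, 2050],
)
print(missing)
```

With this toy data the only unmet requirement is the 2050 datapoint, which is the kind of gap the documented validator is meant to surface.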