AIREO Training Dataset Specification

The full Specification & Metadata Spreadsheet can be downloaded here:


The AIREO TDS specification:

  • Captures data provenance and processing history
  • Explicitly considers paired (reference data and EO features) and unpaired TDS
  • Adopts and adapts STAC (cloud-native data) metadata and cataloging specification
  • Introduces AIREO TDS compliance levels (L1, L2, L3)
  • Defines critical metadata attributes for TDS
  • Provides data format independence by using a JSON complement to the source TDS
  • Is self-explanatory (AI-ready), building on data sheet descriptions
  • Aims to have feature engineering recipes embedded for cross-platform compatibility
  • Proposes structured Quality Assurance approaches for TDS, including several automatically generated quality indicators
  • Considers granularity, including collection, dataset and measurement level
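
As an illustration of the "JSON complement" idea, the sketch below builds a small STAC-style metadata record that sits alongside a TDS without touching the data files themselves. It is a minimal sketch under stated assumptions, not the AIREO schema: the top-level keys follow the general shape of a STAC Item, but the record is not validated against STAC, and every `aireo:`-prefixed field, the asset names and the function `build_sidecar` are hypothetical names chosen for this example.

```python
import json


def build_sidecar(dataset_id, data_path, reference_path=None):
    """Return a STAC-style metadata record for a TDS stored in any format.

    All "aireo:" field names below are hypothetical placeholders,
    not taken from the AIREO specification.
    """
    assets = {"eo_features": {"href": data_path, "roles": ["data"]}}
    if reference_path is not None:
        # A paired TDS also points at its reference data.
        assets["reference_data"] = {"href": reference_path, "roles": ["labels"]}

    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": dataset_id,
        "geometry": None,
        "properties": {
            "datetime": "2021-01-01T00:00:00Z",  # placeholder acquisition time
            "aireo:paired": reference_path is not None,  # hypothetical field
            "aireo:provenance": [],  # hypothetical: processing history entries
        },
        "links": [],
        "assets": assets,
    }


if __name__ == "__main__":
    sidecar = build_sidecar("demo-tds", "features.zarr", "labels.geojson")
    # The data keep their native format; only this JSON text is added.
    print(json.dumps(sidecar, indent=2))
```

Because the record only references the data through `href` values, the same JSON structure can describe a TDS stored in any file format, which is the point of the format-independence bullet above.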

FAIR (findable, accessible, interoperable and re-usable) data principles are at the heart of this specification.

To make TDS more re-usable, the specification aims to standardize the metadata included with an AIREO TDS and proposes sets of required, recommended and optional metadata elements. These build on existing OGC and STAC metadata relevant to EO data and ML applications, while some describe elements that are new to AIREO, also with the aim of ‘FAIR-ifying’ TDS.

A set of Quality Indicator metadata is included to help data providers publish structured data quality estimates for users. Additional metadata are designed to ensure the ‘FAIRness’ of the dataset. A number of metadata elements describe data provenance: source, processing history and feature engineering recipes. Licence information and details of accessibility are also documented. These elements can feed automatic checks that estimate how FAIR the data are.
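
The automatic checking mentioned above could look roughly like the following. This is a minimal sketch, assuming a flat metadata dictionary: the key names (`licence`, `access_url`, `provenance`, `source`, `processing_history`), the choice of checks and the equal weighting are all assumptions made for illustration, not the checks defined by the specification.

```python
# Hypothetical checks: each takes a metadata dict and returns True or False.
FAIR_CHECKS = {
    "licence": lambda m: bool(m.get("licence")),
    "access_url": lambda m: bool(m.get("access_url")),
    "provenance_source": lambda m: bool(m.get("provenance", {}).get("source")),
    "processing_history": lambda m: bool(
        m.get("provenance", {}).get("processing_history")
    ),
}


def fair_report(metadata):
    """Run every check and return (per-check results, fraction passed)."""
    results = {name: check(metadata) for name, check in FAIR_CHECKS.items()}
    score = sum(results.values()) / len(results)
    return results, score


if __name__ == "__main__":
    example = {
        "licence": "CC-BY-4.0",
        "access_url": "https://example.org/tds",  # placeholder URL
        "provenance": {"source": "Sentinel-2 L2A"},  # no processing history
    }
    results, score = fair_report(example)
    print(results, score)
```

A report of this kind tells a provider which elements are still missing rather than giving only a single number.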

A key innovation is the definition of AIREO Compliance Levels, which allow a user to see at a glance whether a dataset is fully described in metadata and FAIR-compliant, or whether it contains only the required or recommended set of metadata. A dataset at one of the lower levels can still be used, but it is not fully characterised in terms of Quality Assurance and documentation.
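
One way to picture the compliance levels is as a check of which metadata sets are complete. The sketch below assumes that L1 means only the required set is present, L2 adds the recommended set, and L3 adds the optional set; that mapping, the element names in each set and the function `compliance_level` are assumptions for illustration and should be checked against the specification itself.

```python
# Hypothetical element names; the real sets are defined in the specification.
REQUIRED = {"id", "description", "licence"}
RECOMMENDED = {"provenance", "access_url"}
OPTIONAL = {"quality_indicators", "feature_engineering"}


def compliance_level(metadata):
    """Return "L1", "L2", "L3", or None if a required element is missing."""
    # Treat empty values as absent so placeholder keys do not count.
    present = {k for k, v in metadata.items() if v not in (None, "", [], {})}
    if not REQUIRED <= present:
        return None
    if not RECOMMENDED <= present:
        return "L1"
    if not OPTIONAL <= present:
        return "L2"
    return "L3"


if __name__ == "__main__":
    minimal = {"id": "demo", "description": "Crop labels", "licence": "CC-BY-4.0"}
    print(compliance_level(minimal))
```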
