FAIR data

Publishing FAIR data

  • Data persistence: Data repositories, such as Figshare, Zenodo, DataDryad, Kaggle Datasets and many others, are a good way to ensure dataset persistence. Many of these repositories have agreements with libraries to preserve data in perpetuity.

  • Provenance: with datasets often published in multiple repositories, it would be useful for repositories to describe the provenance information more explicitly in the metadata.

    • The provenance information helps users understand who collected the data, where the primary source of the dataset is, or how it might have changed.
    • The W3C PROV Ontology (PROV-O) can be used to express this information.
  • Licensing: datasets should include licensing information, ideally in a machine-readable format.

  • Assigning persistent identifiers, such as DOIs (Digital Object Identifiers), is critical for long-term tracking and usability.

    • Easier citation of datasets and version tracking
    • Dereferenceable: if a dataset moves, the identifier can point to a different location.
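
The recommendations above can be sketched as a single machine-readable record. This is a minimal illustration, assuming the schema.org `Dataset` vocabulary and the PROV-O term `prov:wasDerivedFrom` written as JSON-LD. The function name `build_dataset_metadata`, the DOI `10.1234/example`, and every other value are hypothetical placeholders and are not taken from the source article.

```python
import json


def build_dataset_metadata(name, doi, license_url, creator, derived_from=None):
    """Return a JSON-LD dictionary describing a dataset (illustrative sketch)."""
    record = {
        "@context": {
            "@vocab": "https://schema.org/",
            "prov": "http://www.w3.org/ns/prov#",
        },
        "@type": "Dataset",
        "name": name,
        # Persistent identifier, written as a resolvable DOI URL.
        "identifier": f"https://doi.org/{doi}",
        # Machine-readable license: a URL rather than free text.
        "license": license_url,
        # Who collected the data.
        "creator": {"@type": "Person", "name": creator},
    }
    if derived_from is not None:
        # Provenance: the primary source this copy was derived from.
        record["prov:wasDerivedFrom"] = {"@id": derived_from}
    return record


example = build_dataset_metadata(
    name="Example survey results",  # hypothetical
    doi="10.1234/example",  # hypothetical DOI
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creator="Jane Doe",  # hypothetical
    derived_from="https://example.org/original-dataset",  # hypothetical
)
print(json.dumps(example, indent=2))
```

Because the license and the identifier are URLs rather than free text, a harvester can compare them without parsing prose, and the provenance link stays valid even when the same dataset is mirrored in several repositories.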
Source

Recommendations adapted from this article.

Assessing your data FAIRness

You can find a list of various websites to assess if a resource is FAIR at https://fairassist.org

From all those tools, here is our expert selection:

You can also find guidelines on what you need to provide to ensure your data is FAIR in this FAIR data checklist file developed at IDS. Feel free to propose improvements in the best-practices repository issues.

The FAIR principles

4 guiding principles

The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles are intended to facilitate the discovery and reuse of data, not only by people but also by machines. Read the full paper here.

Findable 🔎

The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
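
To illustrate what automatic discovery can look like, here is a minimal sketch of a program that reads metadata embedded in a dataset landing page. It assumes the page publishes a schema.org `Dataset` description as JSON-LD inside a `<script type="application/ld+json">` tag, which is one common convention and not the only one. The class name `JsonLdExtractor` and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD blocks embedded in an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script element opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the accumulated text once the script element closes.
        if tag == "script" and self._in_jsonld:
            self.records.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# Hypothetical landing page with embedded metadata.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Dataset", "name": "Example dataset"}
</script>
</head><body><h1>Example dataset</h1></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(page)
extractor.close()

# Keep only top-level objects that describe a dataset.
datasets = [
    r for r in extractor.records
    if isinstance(r, dict) and r.get("@type") == "Dataset"
]
print(datasets[0]["name"])
```

A human reader sees only the page heading, while the program recovers the dataset description without any page-specific scraping rules.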

Accessible 📂

Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
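
As a rough illustration of access over a standard protocol, the sketch below prepares an HTTP request for metadata behind a DOI, with an optional credential. It assumes the identifier resolves through `https://doi.org/` and that the registration agency supports content negotiation for the `application/vnd.citationstyles.csl+json` media type, as Crossref and DataCite do. The function name `build_access_request`, the DOI, and the bearer-token scheme are hypothetical; a real repository may use a different authentication mechanism. The request is built but not sent.

```python
import urllib.request


def build_access_request(url, token=None):
    """Prepare (but do not send) an HTTP request for machine-readable metadata."""
    # Ask for JSON metadata instead of the human-readable landing page.
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    if token is not None:
        # Hypothetical authentication scheme; check the repository's documentation.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# Hypothetical DOI, resolved through the doi.org resolver.
request = build_access_request("https://doi.org/10.1234/example")
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared object to `urllib.request.urlopen` would perform the actual retrieval; keeping construction separate makes the access conditions (URL, media type, credential) explicit and easy to document alongside the data.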

Interoperable ⚙️

The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.

Reusable ♻️

The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

3 FAIR entities

The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure.

For instance, principle F4 states that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).

Source

From the GO-FAIR website.

Last updated on by Vincent Emonet