FAIR Guiding Principles

The FAIR principles and guidelines are defined and described in detail by Mark Wilkinson et al. in “Scientific Data”, a journal published by Nature, and can be found online at Nature.

The FAIR principles are based on the urgent need to improve the infrastructure for scientific data. These principles are an attempt to make data automatically discoverable by machines and thus increase its usability by individuals.

This page provides definitions and help regarding the four principles Findability, Accessibility, Interoperability, and Reusability in the FAIR-Universe. These principles serve as a roadmap for data producers and publishers, aiding them in overcoming challenges and optimizing the added value derived from contemporary scholarly digital publishing. It is crucial to note that these principles extend beyond traditional ‘data’ and encompass algorithms, tools, and workflows that contribute to the generation of data. Whether it’s data or analytical pipelines, the application of these principles is essential for all scholarly digital research objects. Transparency, reproducibility, and reusability are ensured when every component of the research process is made available.

Findable

Definition

F1. Identifier

(Meta)data are assigned a globally unique and persistent identifier

Both data and metadata (information describing the data) are given an identifier that is globally unique and persistent. Globally unique means that no other data or metadata share the same identifier. Persistent means that the identifier does not change over time or under different circumstances, so it provides a stable reference point for locating and accessing the specific data or metadata. This practice is crucial for effective data management, sharing, and tracking, as it enables unambiguous and long-term identification of the associated information.

Example: Digital Object Identifier (DOI)
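
As a rough illustration of what such an identifier looks like in practice, the following Python sketch checks whether a string has the general shape of a DOI and turns it into a link for the public resolver at https://doi.org/. The regular expression is a simplified assumption, not the official DOI syntax, and the DOI used here is hypothetical.

```python
import re

# Simplified pattern (an assumption, not the official DOI grammar):
# the prefix "10.", a 4 to 9 digit registrant code, a slash, and a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def is_plausible_doi(identifier):
    """Return True if the string has the general shape of a DOI."""
    return DOI_PATTERN.match(identifier) is not None


def doi_to_url(doi):
    """Build the resolver link for a DOI; raise ValueError if it looks malformed."""
    if not is_plausible_doi(doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    return "https://doi.org/" + doi


# Hypothetical DOI, used only for illustration.
print(doi_to_url("10.1234/example.dataset.v1"))
```

A file name such as "dataset_final_v2.csv" fails this check: it is neither globally unique nor guaranteed to stay the same over time.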

F2. Rich Metadata

Data are described with rich metadata (defined by R1 below)

Metadata provides extensive information about the data, going beyond basic or minimal descriptors. Rich metadata typically includes details such as the origin of the data, the methodology used to collect it, the format, relevant dates, and other contextual information that enhances understanding and usability.

F3. Reference to identifier

Metadata clearly and explicitly include the identifier of the data they describe.

In other words, within the metadata, there is a direct and unmistakable indication of the identifier associated with the corresponding dataset. Having the identifier explicitly mentioned in the metadata helps establish a direct connection between the descriptive information (metadata) and the actual data it describes, facilitating proper identification, retrieval, and management of the data.
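
Principles F2 and F3 can be sketched together as a metadata record that carries descriptive fields and also names the identifier of the data it describes. The field names, the values, and the list of required fields below are assumptions made for illustration; real repositories follow a metadata schema of their own.

```python
# Hypothetical metadata record; every field name and value is illustrative.
dataset_metadata = {
    "identifier": "https://doi.org/10.1234/example.dataset.v1",  # F3: explicit link to the data
    "title": "Example soil moisture measurements",
    "creators": ["Doe, Jane"],
    "date_collected": "2023-05-01",
    "methodology": "Hourly readings from capacitive sensors",
    "format": "text/csv",
}

# Assumed minimum for a "rich" description in this sketch.
REQUIRED_FIELDS = (
    "identifier",
    "title",
    "creators",
    "date_collected",
    "methodology",
    "format",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


print(missing_fields(dataset_metadata))
print(missing_fields({"title": "Untitled upload"}))
```

The second call reports every field except the title, including the identifier, which is the gap that F3 is meant to prevent.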

F4. Searchable Results

(Meta)data are registered or indexed in a searchable resource

Data or metadata (information describing the data) are officially recorded or listed in a system or database that allows for easy retrieval through searching.
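
To make the idea of an indexed, searchable resource concrete, here is a toy in-memory index in Python. It is only a sketch: real repositories and search engines use far more capable indexing, and the records and identifiers below are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase title word and keyword to the identifiers that use it."""
    index = defaultdict(set)
    for record in records:
        words = record["title"].lower().split()
        words += [keyword.lower() for keyword in record["keywords"]]
        for word in words:
            index[word].add(record["identifier"])
    return index


def search(index, term):
    """Return the sorted identifiers of all records matching the term."""
    return sorted(index.get(term.lower(), ()))


# Hypothetical records, used only for illustration.
records = [
    {"identifier": "doi:10.1234/soil.v1", "title": "Soil moisture measurements", "keywords": ["hydrology"]},
    {"identifier": "doi:10.1234/rain.v1", "title": "Rainfall records", "keywords": ["hydrology", "climate"]},
]
index = build_index(records)
print(search(index, "hydrology"))
print(search(index, "Moisture"))
```

A search returns identifiers rather than the data themselves, which is why F4 depends on F1 to F3: the index can only point to data that have identifiers and metadata.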

Accessible

Definition

A1. Common protocol

(Meta)data are retrievable by their identifier using a standardized communications protocol

The data or metadata can be accessed or retrieved based on a specific identifier assigned to them. Each piece of data or metadata has a unique identifier that allows for precise retrieval.

The retrieval process follows a standardized set of rules or conventions for communication. This typically involves a well-defined protocol, which is a set of rules or guidelines that facilitate the exchange of information between systems. Standardization ensures consistency and interoperability, allowing different systems to communicate seamlessly.
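
A minimal sketch of retrieval by identifier over a standardized protocol: the Python code below builds, but does not send, an HTTPS request that asks the DOI resolver for machine-readable metadata. The DOI is hypothetical, and whether a given DOI answers with the requested media type depends on its registration agency, so treat the Accept header as an assumption.

```python
from urllib.request import Request


def build_metadata_request(doi):
    """Prepare a GET request for the metadata behind a DOI (nothing is sent)."""
    return Request(
        url="https://doi.org/" + doi,
        # Assumed media type for citation metadata in JSON form.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
        method="GET",
    )


# Hypothetical DOI, used only for illustration.
request = build_metadata_request("10.1234/example.dataset.v1")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual retrieval. The point of the sketch is that the identifier plus a standard protocol is all a client needs to know.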

A1.1. Open, free, universal

The protocol is open, free, and universally implementable

This means the protocol is transparent, free of charge, and accessible to a wide range of users, which fosters widespread adoption and collaboration in implementing the communication standards it defines.

A1.2. Authentication and authorization

The protocol allows for an authentication and authorization procedure, where necessary

Authentication: This is the process of verifying the identity of a user, device, or system. Authentication ensures that the entity trying to access a resource is who it claims to be. Common methods of authentication include passwords, digital certificates, or other credentials.

Authorization: Once an entity is authenticated, authorization determines what actions or resources that entity is allowed to access. It involves granting specific permissions or rights based on the authenticated identity. Authorization mechanisms control what a user or system can do within a given system or application.

For example, Dataverse provides public machine-accessible interfaces to search the data, access the metadata, and download the data files, using a token to grant access when data files are restricted.
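
The token-based access described above might look as follows in Python. This is a sketch under assumptions: the server address, file id, and token are hypothetical, and the endpoint path and the X-Dataverse-key header reflect one reading of the Dataverse API documentation that should be checked against the installation in use. The request is only built, not sent.

```python
from urllib.request import Request


def build_download_request(base_url, file_id, api_token=None):
    """Prepare a file download request; attach a token only for restricted files."""
    request = Request(f"{base_url}/api/access/datafile/{file_id}", method="GET")
    if api_token is not None:
        # Assumed header name for passing the API token.
        request.add_header("X-Dataverse-key", api_token)
    return request


# Hypothetical server, file id, and token.
public_request = build_download_request("https://dataverse.example.org", 42)
restricted_request = build_download_request(
    "https://dataverse.example.org", 42, api_token="hypothetical-token"
)
print(public_request.full_url)
print(len(restricted_request.header_items()))
```

The same protocol serves both cases: open files need no credentials, while restricted files require the server to authenticate the token and then check whether its owner is authorized to download the file.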

A2. Metadata remain

Metadata are accessible, even when the data are no longer available.

This practice ensures that users can still gain insights into the characteristics, context, and other details of the data, even if the data itself is no longer accessible due to reasons such as deletion, relocation, or unavailability.

In essence, separating metadata accessibility from data availability allows users to understand the context and content of the data, even if the data files are no longer present or accessible. This approach supports transparency, documentation, and the potential for reusing or understanding the context of research or datasets, even if the data is no longer directly accessible.
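
One way to picture this separation is a repository that stores metadata and data files independently, so that withdrawing a file leaves a descriptive record behind. The class, method names, and field names below are hypothetical and only illustrate the idea.

```python
class Repository:
    """Toy repository that keeps metadata separate from the data files."""

    def __init__(self):
        self._metadata = {}
        self._files = {}

    def deposit(self, identifier, metadata, content):
        self._metadata[identifier] = dict(metadata, data_available=True)
        self._files[identifier] = content

    def withdraw_data(self, identifier, reason):
        # Remove only the file; the metadata record stays and notes the withdrawal.
        del self._files[identifier]
        self._metadata[identifier].update(data_available=False, withdrawal_reason=reason)

    def get_metadata(self, identifier):
        return self._metadata[identifier]

    def get_data(self, identifier):
        # Returns None once the data have been withdrawn.
        return self._files.get(identifier)


repo = Repository()
repo.deposit("doi:10.1234/example.v1", {"title": "Example survey"}, b"a,b\n1,2\n")
repo.withdraw_data("doi:10.1234/example.v1", reason="retention period expired")
print(repo.get_data("doi:10.1234/example.v1"))
print(repo.get_metadata("doi:10.1234/example.v1"))
```

After the withdrawal the identifier still resolves to a record that says what the data were and why they are gone.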

Interoperable

I1. Knowledge language

(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

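Such a language has a formal specification that both humans and machines can interpret, for example RDF or JSON-LD. The choice of language enhances interoperability, facilitates collaboration, and ensures that information about the data is communicated effectively across different communities or systems.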

I2. Vocabulary

(Meta)data use vocabularies that follow FAIR principles

The vocabularies or terminologies used to describe (meta)data themselves adhere to the FAIR principles. Well-structured and standardized terminology enhances the overall quality and utility of data by promoting its findability, accessibility, interoperability, and reusability.

I3. Reference to other metadata

(Meta)data include qualified references to other (meta)data

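A qualified reference states not only that two (meta)data resources are related but also how, for example that one dataset is derived from another. Such references enrich the understanding of the data by connecting it to related or complementary information, fostering a more comprehensive view of the dataset and its context.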
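
Principles I1 to I3 can be illustrated with a small metadata record written as JSON-LD using terms from the schema.org vocabulary. This is one possible choice of language and vocabulary, not the only one, and the identifiers and values are hypothetical. The isBasedOn property makes the reference to the other dataset explicit about the kind of relationship.

```python
import json

# Hypothetical record: JSON-LD (language) with schema.org terms (vocabulary).
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.1234/example.derived.v1",
    "name": "Daily mean soil moisture",
    # Reference to another dataset, with the relationship named by the property.
    "isBasedOn": "https://doi.org/10.1234/example.raw.v1",
}

serialized = json.dumps(record, indent=2)
print(serialized)
```

Because the property name comes from a shared vocabulary, a program that has never seen this repository can still tell that the record describes a dataset derived from another one.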

Reusable

R1. Attributes

(Meta)data are richly described with a plurality of accurate and relevant attributes

The metadata is not only detailed and accurate but also covers a variety of relevant aspects, offering a comprehensive and nuanced description of the associated dataset. This richness in description enhances the usability, interpretability, and overall quality of the metadata and, consequently, the data it describes.

R1.1. License

(Meta)data are released with a clear and accessible data usage license

By providing a clear and accessible data usage license, data publishers communicate the permissions, restrictions, and obligations associated with using the data, promoting responsible and lawful use within the defined terms.

A data license represents an arrangement between the creator and the end user, or between the end user and the platform from which the deposited data is accessible. It clarifies what the user can and cannot do with the data.

Examples of licenses and rights statements: Creative Commons (CC) licenses, copyright notices, Open Data Commons Open Database License (ODbL)

R1.2. Provenance

(Meta)data are associated with detailed provenance

Detailed provenance ensures transparency and accountability regarding the history and lineage of the data, promoting trust, reproducibility, and informed use of the dataset.

Provenance in this context refers to the history of the data, documenting its creation, sources, transformations, and any changes it has undergone over time.
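
A provenance trail of this kind can be sketched as an ordered list of events attached to a dataset. The class and field names are hypothetical, as are the events; community standards for provenance define richer models than this.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceEvent:
    """One step in the history of a dataset."""
    date: str
    activity: str
    agent: str
    source: str = ""


@dataclass
class Dataset:
    identifier: str
    provenance: list = field(default_factory=list)

    def record(self, date, activity, agent, source=""):
        self.provenance.append(ProvenanceEvent(date, activity, agent, source))

    def history(self):
        """Return the recorded events as readable lines, in the order they were recorded."""
        return [f"{e.date}: {e.activity} by {e.agent}" for e in self.provenance]


# Hypothetical dataset and events.
dataset = Dataset("doi:10.1234/example.v2")
dataset.record("2023-05-01", "collected", "Jane Doe", source="field sensors")
dataset.record("2023-06-15", "outliers removed", "John Roe", source="doi:10.1234/example.v1")
print("\n".join(dataset.history()))
```

Each entry answers who did what, when, and from which source, which is the information a later user needs to judge whether the data suit a new purpose.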

R1.3. Standards

(Meta)data meet domain-relevant community standards

The metadata associated with the dataset follows the conventions and expectations of a particular community or field, ensuring that the information is structured and described in a way that is consistent with the established norms of that specific domain. Adhering to these community standards facilitates interoperability, collaboration, and effective communication within a particular scientific or research community.