What are data availability statements?
A data availability statement is a short paragraph that specifies where the data underlying a scientific publication can be found. The statement indicates the location of the dataset by providing a persistent web link, for example a DOI link (e.g. https://doi.org/10.3929/ethz-b-000315707). Typically, these statements are included directly in the publication itself. Scientific journals increasingly require authors to include data availability statements. Beyond such requirements, providing a data availability statement can be considered good practice in research data management and research integrity. Even when no research data in the narrow sense underlie a publication (e.g. for literature reviews or opinion pieces), a data availability statement makes clear that no data exist that could be linked to the article (see the last example formulation below).
What are the advantages of data availability statements and why should I include one in my publication?
- It allows other researchers to easily find and access your research data without additional communication effort.
- It makes data more FAIR (Findable, Accessible, Interoperable, Reusable) by ensuring findability and enabling reuse.
- It ensures compliance with funder and/or journal requirements, where applicable.
- It increases the citation impact of publications on average (Colavizza et al. 2020).
What should I take into consideration when preparing my data availability statement?
Ideally, the location of your data, which you should indicate in your statement, is a well-established and FAIR data repository in your discipline. Such a repository usually provides you with a persistent identifier that can be included in the data availability statement.
Make sure that your chosen solution for making data available is in line with the requirements of your funder and/or publisher. Not all of the example formulations below may be appropriate for you. For example, providing the research data underlying a publication only on request often does not comply with research funder policies.
The following example formulations are a selection from the Data Availability Statements of Springer Nature (accessed 10.03.2022). Use conditions: “In the absence of specific instructions from a journal editor authors can use or adapt the statement(s)”.
- "The datasets generated during and/or analysed during the current study are available in the [NAME] repository, [PERSISTENT WEB LINK TO DATASETS]."
- "All data generated or analysed during this study are included in this published article (and its supplementary information files)."
- "The data that support the findings of this study are available from [THIRD PARTY NAME] but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of [THIRD PARTY NAME]."
- "The datasets generated during and/or analysed during the current study are not publicly available due to [REASON(S) WHY DATA ARE NOT PUBLIC] but are available from the corresponding author on reasonable request."
- No underlying data are available for this article, since no datasets were generated or analysed during this study.
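If you prepare statements for several publications, the placeholders in the first example formulation above can be filled in programmatically. The following Python snippet is a minimal sketch of that idea, not an official tool: the function names `doi_to_link` and `repository_statement`, the repository name "Example Repository", and the simplified DOI check are assumptions made for illustration. Only the template wording and the example DOI come from this page.

```python
import re

# Wording of the first example formulation; {name} and {link} stand in for
# the [NAME] and [PERSISTENT WEB LINK TO DATASETS] placeholders.
REPOSITORY_TEMPLATE = (
    "The datasets generated during and/or analysed during the current study "
    "are available in the {name} repository, {link}."
)

# Simplified plausibility check (an assumption, not a full DOI validator):
# "10.", a numeric registrant code, a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_link(doi: str) -> str:
    """Turn a DOI such as '10.3929/ethz-b-000315707' into a persistent web link."""
    doi = doi.strip()
    # Accept input that already carries a resolver or 'doi:' prefix.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def repository_statement(repository_name: str, doi: str) -> str:
    """Fill the repository template with a repository name and a DOI link."""
    return REPOSITORY_TEMPLATE.format(
        name=repository_name, link=doi_to_link(doi)
    )


if __name__ == "__main__":
    # "Example Repository" is a placeholder name, not a real service.
    print(repository_statement("Example Repository", "10.3929/ethz-b-000315707"))
```

Normalising the DOI to a full `https://doi.org/` link keeps the statement clickable and persistent regardless of how the identifier was originally written down. Always check the generated text against the requirements of your journal and funder before submitting.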
Colavizza G, Hrynaszkiewicz I, Staden I, Whitaker K, McGillivray B (2020) The citation advantage of linking publications to research data. PLoS ONE 15(4): e0230416. https://doi.org/10.1371/journal.pone.0230416