ETH Data Archive
We offer advice on issues in managing your data from the creation of a data management plan and the work on your project up to publication and preservation. In this, we take into account available infrastructures at ETH Zurich and requirements of funding agencies. We offer regular and on-demand courses and workshops on various topics along the data lifecycle.
We will gladly help with the appraisal of other service providers or support you in planning your research group’s or organisational unit’s data management. If required, we will put you in contact with our partners, e.g. the IT Services or ETH Zurich University Archives.
You will find more questions and answers on research data management further below.
Metadata are “data about data”, i.e. information that describes the actual data.
On the one hand, this information helps to retrieve data. Especially in the case of research data, metadata should also provide sufficient information to be able to interpret and reuse data in a scientifically correct way.
There are also technical metadata, which are designed to enable digital objects to remain usable for longer periods – ideally permanently – and rights metadata, which provide information on who is allowed to use which objects when and under which conditions.
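As an illustration only, the kinds of metadata described above could be grouped in a single record, as in the following Python sketch. All field names and values are hypothetical and do not reflect the metadata schema of the ETH Research Collection or the ETH Data Archive.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class DatasetMetadata:
    # Descriptive metadata: helps others find and interpret the data.
    title: str
    creators: List[str]
    description: str
    # Technical metadata: supports keeping the files usable over time.
    file_format: str
    size_bytes: int
    # Rights metadata: who may use the data, when, and under which conditions.
    license: str
    embargo_until: Optional[str] = None  # ISO date, or None if no embargo


# Hypothetical example record.
record = DatasetMetadata(
    title="Example measurement series",
    creators=["Doe, J."],
    description="Hypothetical temperature readings, one row per minute.",
    file_format="text/csv",
    size_bytes=2048,
    license="CC BY 4.0",
)

# asdict() turns the record into a plain dictionary, e.g. for export.
record_as_dict = asdict(record)
```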
No. Nevertheless, you are required to arrange the storage of certain research data in accordance with ETH Zurich’s guidelines for good scientific practice.
Some funding agencies also have specifications for data storage. However, you may usually decide yourself which data archive or service best meets your requirements. You will find further information on repositories below.
If you choose to publish and preserve data at ETH Zurich, the route via an upload to ETH Research Collection automatically also leads to long-term preservation of most files in the ETH Data Archive.
ETH Library recommends a (1) non-commercial repository that (2) adheres to the FAIR principles1 and (3) stores data in Switzerland or the EU. Examples are the ETH Research Collection or the repository Zenodo. Other repositories (e.g. discipline-specific ones) are listed at re3data. A (non-exclusive) recommendation of accepted repositories from the Swiss National Science Foundation SNSF is available on their website. Please check whether your funder has issued specific requirements.
We are happy to support you e.g. with the evaluation of data services in your field.
Yes, please do so in good time, as soon as retirement becomes an issue for you. Together with you, we will find suitable solutions. We will always include the possibilities in your department, e.g. storing large data volumes on the free-of-charge Long Term Storage (LTS) of ETH IT Services with support from your IT Support Group.
Your first point of contact is the ETH Zurich University Archives, which we will gladly put you in touch with. The University Archives also use the ETH Data Archive to archive digital documents and historical websites of ETH Zurich.
As the workflows involved in processing these documents differ from those for other data, the responsibility is with the University Archives, which work closely with the Research Data Management and Digital Curation Group.
Long-term preservation in the ETH Data Archive
ETH Data Archive is ETH Zurich’s long-term digital archive. It is available for research data, documents and other data from libraries, collections and archives.
For the individual upload of data and for publishing data please use the ETH Research Collection.
The regular way into the ETH Data Archive is via the Research Collection. Data you upload to the Research Collection will in addition be transferred to the ETH Data Archive for long-term preservation (except for files deposited via Libdrive). If an upload to the Research Collection is not an option, please contact us with your concern.
Other solutions can be agreed on, especially for the one-off submission of data.
In principle, you can submit all file formats to the ETH Data Archive. However, please arrange beforehand which formats you would like to submit, as not all formats are equally suitable for long-term preservation and use. You will find more information and recommendations in our list of recommended file formats.
For many formats, especially manufacturer-specific formats that are not openly documented, we can only guarantee unaltered storage in the format provided (bitstream preservation).
Further measures, such as the conversion into a more up-to-date file format, are normally only possible for a few openly documented standard formats.
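Bitstream preservation means that the stored bytes remain unchanged. One common way to check this for your own copies is to compare checksums. The Python sketch below uses SHA-256 from the standard library. It is an illustration only and does not describe how the ETH Data Archive verifies files. The function name `sha256_of` is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If you record the digest before submitting a file and compute it again on a copy retrieved later, identical digests indicate that the bitstream has not changed.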
As a rule, researchers at ETH Zurich can use the ETH Data Archive free of charge. Institutional customers will be charged the costs of the storage quota they occupy and of additional services.
As the initial situation and amount of work can differ greatly depending on each use case, please contact us directly.
Please consult the Research Collection Manual for detailed information.
No, this is not possible. Without metadata, there is no chance to preserve the usability of data and to interpret and re-use them in the future.
If you are looking for a solution for storing large data volumes without metadata we can put you in touch with ETH Zurich’s IT Services or you may talk to your IT Support Group.
Yes, that is possible. Already when you submit research data to the ETH Data Archive via the Research Collection, you can stipulate for how long research data should be stored if they are not of lasting value. The shortest storage period is ten years.
Once this time limit has expired, the ETH Zurich University Archives can decide, in consultation with the data producers, whether to accept permanent responsibility for certain data for historical reasons.
Access and reuse of data
Please be aware that today research data are published via the Research Collection where you can easily find them.
For a full list of datasets from the Research Collection together with the remaining accessible datasets and registered software in the ETH Data Archive, please filter search results in the search platform ETH Library@swisscovery by resource type “Research Data” and Data Sources “ETH Research Collection” and “ETH Data Archive”.
This is usually not the case. Please use the Research Collection for publishing your data. The Research Collection registers a DOI (Digital Object Identifier) which makes your data permanently citable and facilitates easy access.
No. The ETH Data Archive is not intended as a publication platform. Please use the Research Collection for publishing your data.
Research data management
Research data are data that arise during planning, performance and documentation of scientific work and which form the basis for conclusions and new findings. Types of research data can vary greatly and depend on the research field. A list of popular data types and corresponding file formats that may arise during a research project is indicated just below.
- text (e.g. *.pdf, *.txt, *.docx, *.rtf, *.odt, etc.)
- code (e.g. *.mat, *.RData, *.py, *.r, etc.)
- spreadsheets and tables (e.g. *.csv, *.xlsx, *.ods, etc.)
- raw data and workspaces (e.g. *.mat, *.nc, *.cfd, *.h5, *.hdf5, etc.)
- raster graphics (e.g. *.tif, *.gif, *.bmp, *.jpg, etc.)
- vector graphics (e.g. *.svg, *.indd, *.eps, *.psd, etc.)
- CAD (e.g. *.dwg, *.dxf, *.x3d, etc.)
- sound, audio (e.g. *.wav, *.mp4, *.mp3, etc.)
- video (e.g. *.mpg, *.mp4, *.mov, *.avi, *.wmv, etc.)
- all types of physical samples, or your inventory
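The list above can be turned into a simple lookup, for example to take stock of a project folder. The following Python sketch covers only a subset of the extensions listed. The mapping and the function name `data_type_of` are illustrative assumptions, and extensions that appear under more than one type in the list (e.g. *.mat, *.mp4) are left out here.

```python
from pathlib import Path

# Subset of the data types and extensions from the list above.
EXTENSION_TYPES = {
    ".pdf": "text", ".txt": "text", ".docx": "text",
    ".py": "code", ".r": "code",
    ".csv": "spreadsheets and tables", ".xlsx": "spreadsheets and tables",
    ".nc": "raw data and workspaces", ".h5": "raw data and workspaces",
    ".hdf5": "raw data and workspaces",
    ".tif": "raster graphics", ".jpg": "raster graphics",
    ".svg": "vector graphics", ".eps": "vector graphics",
    ".dwg": "CAD", ".dxf": "CAD",
    ".wav": "sound, audio", ".mp3": "sound, audio",
    ".mpg": "video", ".mov": "video",
}


def data_type_of(filename: str) -> str:
    """Return the data type for a file name, or 'unknown' if not mapped."""
    # Compare in lower case so that 'Run1.CSV' and 'run1.csv' are equal.
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TYPES.get(suffix, "unknown")
```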
The term research data management encompasses all technical, methodical, conceptional and organisational actions along the life cycle of your research data, which are crucial for the evaluation, validation and prospective reuse of these data.
To ensure that data can be used in the longer term, they should be documented as thoroughly as possible from the outset. It is also worth agreeing on certain rules for the organisation and naming of files within a research group.
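What such naming rules look like is up to each group. As one hypothetical example, a group might agree on names of the form `YYYY-MM-DD_project_description_vNN.ext`. The Python sketch below checks file names against that assumed pattern. The pattern and the function name `follows_convention` are not an ETH recommendation.

```python
import re

# Hypothetical group convention: date, project, description, version.
NAME_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}"   # date in ISO layout, e.g. 2024-03-15 (digits only, not validated)
    r"_[a-z0-9]+"           # project short name
    r"_[a-z0-9-]+"          # description, words joined by hyphens
    r"_v\d{2}"              # two-digit version, e.g. v01
    r"\.[a-z0-9]+$"         # file extension
)


def follows_convention(filename: str) -> bool:
    """Return True if the file name matches the assumed group convention."""
    return NAME_PATTERN.match(filename) is not None
```

For example, `2024-03-15_soilmoist_field-samples_v01.csv` matches the assumed pattern, whereas `final data (2).xlsx` does not.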
Depending on the amount of data you expect, you should contact the IT Support Group (ISG) responsible for your group early on so that the necessary resources can be planned and suitable solutions can be found.
Funders often set mandatory requirements for the storage and publication of your data, and you frequently have to describe your intentions in a data management plan (DMP).
We will gladly help you with these considerations. Please send your request to email@example.com
Data publication and data archiving
In case of data publication, research data are made available to a wider audience for the very first time. ETH researchers can publish their data in the internal, FAIR repository ETH Research Collection.
Archiving encompasses data storage and associated measures that enable prospective data reuse and which are often time-limited. Research data intended for long-term preservation can be uploaded via the ETH Research Collection and will be automatically copied into the ETH Data Archive for long-term preservation.
We recommend that you cite a research dataset with the following minimum specifications: Creator (PublicationYear): Title. Publisher. Identifier
Swaminathan, R., Ramya, T., Karthik, C.S. (2013): Contortrostatin-Reprolysin Domain Structure. Swiss Institute of Bioinformatics. (Model). https://doi.org/10.5452/ma-c12zs
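The minimum citation format above can be assembled programmatically, as in the following Python sketch. The function name `format_data_citation` and its parameters are hypothetical. The optional `resource_type` argument mirrors the "(Model)" element of the example and is not part of the minimum specification.

```python
from typing import Optional, Sequence


def format_data_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    resource_type: Optional[str] = None,
) -> str:
    """Build 'Creator (PublicationYear): Title. Publisher. Identifier'."""
    parts = [f"{', '.join(creators)} ({year}): {title}.", f"{publisher}."]
    if resource_type:
        # Optional element, as in the example above: '(Model).'
        parts.append(f"({resource_type}).")
    parts.append(identifier)
    return " ".join(parts)


citation = format_data_citation(
    creators=["Swaminathan, R.", "Ramya, T.", "Karthik, C.S."],
    year=2013,
    title="Contortrostatin-Reprolysin Domain Structure",
    publisher="Swiss Institute of Bioinformatics",
    identifier="https://doi.org/10.5452/ma-c12zs",
    resource_type="Model",
)
```

With these arguments, `citation` reproduces the example citation given above.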
For further information on the correct citation of research data, please consult the DataCite website.
ETH Library recommends using a (1) non-commercial repository that (2) follows the FAIR data principles1 (making data findable, accessible, interoperable, and reusable) and (3) stores data in Switzerland or the EU. Examples are the ETH Research Collection or the repository Zenodo. Other repositories (e.g. discipline-specific ones) can be found on www.re3data.org. A list of repositories recommended by the SNSF can be found on its webpage. Please check the specific requirements of other funders.
For the sake of reusability of research data, it is indispensable to work with open data formats, which can be used without technical and legal constraints. Data formats should additionally be commonly used by others (generally or in a certain discipline) and well documented. In paragraph 1.3 on the Wiki subpage “File formats for archiving” we provide a non-exclusive list of recommended, generic data formats that are suitable for future reuse of your research data.
The FAIR1 criteria mainly concern issues of publishing (meta-)data in data repositories. In practice, it can be difficult to achieve perfect FAIRness of one’s research data. The following resources might help you in improving the FAIRness of your data: (1) The FAIR Awareness tool provides guidance to improve your awareness and compliance on FAIR issues. (2) The FAIR Implementation Profile allows you to assess the degree to which you are already complying with the FAIR criteria.
Dealing with sensitive data
Sensitive or confidential data are usually personal data “defined as all information relating to an identified or identifiable person. A person is identifiable if a third party having access to the data of the person is able to identify such person with reasonable effort.”1 Data can also be confidential, e.g., because they have to be protected from third-party access due to contractual agreements. If such data are used in a research project, the data management practice has to be adapted to deal with sensitive or confidential data in an appropriate way.
In general, dealing with personal and confidential data requires specific handling of data for processing, publication and archiving, since they have a specific legal status. Details about this can be found in the ETH document about Data Protection in Research Projects.
Moreover, when dealing with questions of sensitive and/or confidential data, we recommend that you contact firstname.lastname@example.org for consultation.
If data have been rendered anonymous, they have been treated “in such a way that the data subject can no longer be identified (and therefore is no longer personal data and thus outside of the scope of data protection law).”1
During the process of pseudonymization, personally identifiable information is replaced by pseudonyms and the exact procedure is documented with the aid of an identification key. Pseudonymized data and the identification key, which enables linkage to personal information, must be kept separate from each other. The identification key “must be kept separately and securely from processed data to ensure non-attribution.”1
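The principle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a vetted procedure: the function name `pseudonymize`, the "P-" prefix and the record layout are hypothetical, and the sketch replaces only one direct identifier field. Other fields that could identify a person are not handled.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace the identifying field of each record with a random pseudonym.

    Returns (pseudonymized_records, identification_key). The key maps each
    pseudonym back to the original identifier and must be stored separately
    and securely from the pseudonymized records.
    """
    pseudonym_for = {}       # original identifier -> pseudonym
    identification_key = {}  # pseudonym -> original identifier
    pseudonymized = []
    for record in records:
        original = record[id_field]
        if original not in pseudonym_for:
            pseudonym = "P-" + secrets.token_hex(8)
            # Regenerate in the unlikely case of a collision.
            while pseudonym in identification_key:
                pseudonym = "P-" + secrets.token_hex(8)
            pseudonym_for[original] = pseudonym
            identification_key[pseudonym] = original
        cleaned = dict(record)  # copy, so the input records stay unchanged
        cleaned[id_field] = pseudonym_for[original]
        pseudonymized.append(cleaned)
    return pseudonymized, identification_key
```

The same person always receives the same pseudonym within one call, so records can still be linked to each other without revealing the identifier.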
In general, all contractual rules that concern such data have to be complied with (e.g., defined in so-called ‘Data Transfer and Use Agreements’). Besides complying with contractual provisions, only those personal or confidential data that are anonymized can be stored in the password-secured ETH Data Archive and ETH’s long-term storage. In any case, non-anonymized data stored for the long term have to fulfill the same security requirements as during their active use. These requirements are usually defined in a contract (e.g., in a ‘Data Transfer and Use Agreement’). Encryption of non-anonymized, sensitive or confidential data is insufficient for archiving in ETH Data Archive. If you want to store or archive sensitive or confidential research data, please contact email@example.com for consultation.
Fundamental aspects of the Data Management Plan (DMP) and SNSF’s requirements
No, the content of the DMP will only be assessed by the administrative office of the SNSF for plausibility and compliance with its guidelines on open research data and it is therefore not part of the scientific evaluation process.
Regarding a project application for the SNSF, the DMP can be handed in as a first draft together with the research plan. Necessary changes in the first draft of the DMP can be introduced during the entire funding period. The SNSF Administrative Offices will assess a final version of the DMP together with the final scientific report at the end of the project.
The SNSF prefers the usage of non-commercial repositories and expects this as the standard. Nevertheless, if you provide plausible reasons for your selection of a commercial repository this might also be accepted. The SNSF does not cover costs that arise directly from the usage of a commercial data repository.
If the research project is funded by the SNSF, it is mandatory to make at least the data that form the basis of a publication publicly available, in compliance with ethical, legal and copyright guidelines. Many other funding agencies as well as an increasing number of scientific journals provide similar guidelines. At ETH Zurich the following applies: "Research Data and Programming Code are by default published at the same time as the associated results" (RDM Guidelines, Art 6 Par 1b, https://rechtssammlung.sp.ethz.ch/Dokumente/414.2en.pdf).
Specific requirements and infrastructure at ETH Zurich
“Research data and materials developed within the scope of research projects at ETH Zurich shall in principle remain at ETH Zurich […].”1
“The copyrights to works created within the scope of the professional activities (textbooks, scientific publication etc.) shall remain with the creator.”2
“Inventions by members of ETH Zurich’s staff created while pursuing their professional activities shall be owned by ETH Zurich, with the exception of inventions generated within the scope of research collaborations with third parties, wherein ownership will be determined by a separate agreement.”2 When in doubt please contact ETH transfer.
Software and Code
“In the case of software (computer programmes), ETH Zurich is entitled to the exclusive use and exploitation rights if the computer programme was created within the scope of professional activities and the performance of duties at ETH Zurich.”2
“This shall not affect contractual agreements of ETH Zurich with third parties within the scope of research collaborations.”2
“Computer programmes that were created within the scope of employment at ETH Zurich and are to be commercialised must be disclosed to ETH transfer in writing using the ‘Software disclosure’ form.”2
Software to be distributed under an open source license must be registered with ETH transfer via the ETH Data Archive. Please consult the handout, which details all the necessary steps. Please contact ETH transfer for further advice on open source software and software licensing.
1 ETH Zurich Guidelines on scientific integrity, RSETHZ 414 (as of 01.01.2022)
2 Guidelines for the Financial Exploitation of Research Results at ETH Zurich (Exploitation Guidelines), 16 December 2003 (status as of 1 January 2020)
“ETH Zurich requires its researchers to render all research papers, dissertations, habilitation theses and any other research results accessible via ETH Zurich’s repository Research Collection, provided there are no legal restrictions that prevent them from doing so.”1 To ensure reproducibility, reliability and accuracy of published research output, the relevant research data and key materials on which the results of a publication are based should be shared via a FAIR data repository at the time of publication.2
An international legal instrument designated as “export control” regulates transboundary transport of items exported from Switzerland, which can be used for both civil and military aims (so-called dual-use items). In this case, the term “item” comprises not only material goods and equipment but also research data, software and technologies. Further information about this topic is available on the ETH Webpage: Export Control.
Anyone wishing to enter into a contract with business partners, whether for research purposes, cooperation, or for ordering goods or software, must first ensure that the contractual partner is not subject to trade embargoes or comparable restrictions in order to comply with legal embargo regulations. A sanction check can be carried out at ETH's own sanction search database.
In general, you can use cloud services for data storage. The project management is responsible for data management and has to decide, based on a risk assessment, whether a specific cloud service can be used.1 From a legal perspective, the respective data protection legislation applies. Regulations for cloud services that are subject to Swiss or EU law are more stringent than those for equivalent US services. The use of cloud services for personal or confidential data with storage outside of ETH Zurich is not permitted. For further information, please consult the leaflet of the ETH IT Services (accessed 1.3.2022).
1 Translated in English from the German leaflet: MERKBLATT für die Angehörigen der Departemente zu rechtlichen Aspekten des Cloud Computings (accessed 1.3.2022)
ETH researchers can publish their research data, which for instance form the basis of a publication, together with their research paper in the ETH Research Collection. Automated data export into the ETH Data Archive subsequently enables long-term archiving for at least 10 years. This is the regular way to preserve research data within ETH Zurich. Both services are free of charge for ETH members of staff.
Please be aware of the specific guidelines that apply to confidential data, which are described in the FAQs about dealing with sensitive data.
ETH Zurich’s IT Services provide different types of storage depending on the purpose. One of them is the Long Term Storage (LTS) as internal storage for bigger data volumes, which are no longer in regular use. Data are securely stored, but no further measures will be taken to ensure their continued usability. Please contact the IT Support of your Department or Institute.
According to the ETH guidelines on integrity in research (RSETHZ 414 Article 11)1, a written agreement between the PI and the leaving employee needs to be reached for every research project. This agreement defines which data and materials the former employee may continue to access and for which purposes they may use the data collected during employment at ETH. ETH legal services offers a template for such data sharing agreements when a PhD student or researcher is leaving ETH.