• The Research Collection is the Institutional Repository of ETH Zurich and a free service for ETH members.
  • It is aligned with the FAIR data principles according to SNSF guidelines.
  • The repository is accepted for publishing supplemental material by renowned scientific journals.
  • It supports linking the data to other publications in the Research Collection (Journal Articles, Conference Papers, ...).
  • It supports web upload, DOI reservation and registration, ORCID iDs and export to OpenAIRE.
  • Entries are preserved long term in the ETH Data Archive (except for data deposited via libdrive).
  • All types of research data can be published in the Research Collection (taking into account any ethical and legal restrictions).
  • The Research Collection offers the following publication types for research data: Data Collection, Dataset, Image, Model, Software, Sound, Video and Other Research Data.
  • We do not recommend uploading files larger than 10 GB in the Research Collection. Each entry shouldn't exceed a total of 50 GB. Several entries can be linked to a parent Data Collection. Larger datasets can be published using our service libdrive.

How to prepare your research data for publication


  • We recommend choosing a Creative Commons licence. These licences allow authors and data producers to define which types of reuse are permitted for their works. Data published without a licence can only be reused with explicit permission from the copyright holder or based on an exemption in national copyright law. Works published with a CC licence can be reused as specified in the licence.
  • For help in choosing a Creative Commons licence, see the page Creative Commons Licenses.
  • Creative Commons licences are inappropriate for software, and open source licences are unsuited to research data. For licensing software, see "Licensing software and programming code/scripts".
  • Include a README file to document your data. A guide can be found here: https://documentation.library.ethz.ch/x/bQBIB
  • Use meaningful file and folder names
  • Include metadata
  • Link your item to an article or other publication (example)
    Example of linked item
  • Remove temporary and backup files
  • Remove duplicate files
  • Remove personal information
  • Rename files and folders where helpful (meaningful names)
    • Avoid overly long folder and file names. Total path lengths of more than 200 characters (files and folders combined) can lead to problems for Windows users.
  • Remove third-party files and software for which you don't have permission
  • Check for hardcoded file paths, symbolic links, references
  • Don't include your manuscript: publisher's PDFs, preprints and postprints (Author's Accepted Manuscripts) should be published as a separate entry. See Open Access articles (self-archiving)
  • Spell check your text files
  • File extensions should be consistent with file formats 
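
Some of the clean-up steps above (temporary and backup files, duplicate files, overly long paths) can be checked with a short script before upload. The following is a minimal sketch only, not an official ETH Library tool: the function name `scan_folder`, the list of temporary-file patterns, and the choice to measure path length relative to the dataset folder are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed patterns for temporary/backup files; adjust them to your own tools.
TEMP_SUFFIXES = (".tmp", ".bak", ".swp", "~")
TEMP_NAMES = {".DS_Store", "Thumbs.db"}
MAX_PATH_LENGTH = 200  # characters, files and folders combined (see above)


def scan_folder(root):
    """Report temp/backup files, long paths and duplicate files under root."""
    root = Path(root)
    temp_files, long_paths = [], []
    by_hash = defaultdict(list)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Path relative to the dataset folder, with forward slashes.
        rel = path.relative_to(root).as_posix()
        if path.name in TEMP_NAMES or path.name.endswith(TEMP_SUFFIXES):
            temp_files.append(rel)
        if len(rel) > MAX_PATH_LENGTH:
            long_paths.append(rel)
        # Hash the content in chunks so large files are not loaded at once.
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        by_hash[digest.hexdigest()].append(rel)
    # Files with identical content end up in the same group.
    duplicates = [group for group in by_hash.values() if len(group) > 1]
    return {"temp": temp_files, "long": long_paths, "duplicates": duplicates}
```

Calling `scan_folder("my_dataset")` returns a dictionary with three lists that you can review by hand. The script only reports candidates and deletes nothing; whether a reported file is really redundant remains your decision.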

  • Avoid special characters in names of files and folders. These characters hamper compatibility because they lead to undesired effects depending on the operating system
    • Avoid the following characters:
      • \ / ? : * " > < | # % { } ^ [ ] ` ~ as well as blanks
      • Non-ASCII characters such as ¢ ™ ®, umlauts (ä ö ü) and diacritics such as à é ô
    • The following ASCII characters are permitted:
      • ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    • We are currently not aware of problems with the following characters:
      • ! $ & ' ( ) + , - . ; = @ _ 
  • Choose open, well documented standards from your domain
  • For long-term preservation (> 15 years), only formats from our list of recommended file formats should be used. Other formats should be converted. Consult our list of File formats for archiving research data
  • When non-recommended formats are used, long term readability cannot be guaranteed
  • Upload single files directly
  • Pack file collections containing a large number of files or subfolders into a container
    • Use standard container formats: ZIP or TAR files (avoid .7z, .tar.gz, .rar, and so on)
    • Preferably use uncompressed containers (compression level “store”)
    • Do not nest archives inside your ZIP or TAR files.
    • Don't use encryption or password protection
  • Large datasets can lead to problems with uploads and downloads. Limit single files to around 10 GB and each entry to a total of 50 GB.
    • Please split larger folder structures manually into meaningful subunits and package them separately.
    • Don't use the automatic split features of your software. 
    • Datasets larger than 50 GB can be published using our service libdrive.
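
The naming and size recommendations above can be sketched as a small pre-upload check. This is an illustration under stated assumptions, not a validation performed by the Research Collection itself: the function names are hypothetical, gigabytes are assumed to be decimal (10^9 bytes), and the "around 10 GB" recommendation is treated as a hard threshold for simplicity.

```python
import re
from pathlib import Path, PurePosixPath

# Letters and digits are the permitted characters; the remaining ones are those
# for which no problems are currently known: ! $ & ' ( ) + , - . ; = @ _
SAFE_NAME = re.compile(r"[A-Za-z0-9!$&'()+,\-.;=@_]+")

GB = 1000 ** 3              # assumption: decimal gigabytes
MAX_FILE_BYTES = 10 * GB    # recommended upper size for a single file
MAX_ENTRY_BYTES = 50 * GB   # recommended upper size for a whole entry


def unsafe_names(relative_paths):
    """Return paths in which a file or folder name uses a discouraged character."""
    return [
        p for p in relative_paths
        if not all(SAFE_NAME.fullmatch(part) for part in PurePosixPath(p).parts)
    ]


def collect_sizes(root):
    """Map each file under root (relative path, forward slashes) to its size in bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*") if p.is_file()
    }


def size_problems(sizes_by_path):
    """Compare a {path: size_in_bytes} mapping with the file and entry limits."""
    total = sum(sizes_by_path.values())
    return {
        "too_large_files": [p for p, s in sizes_by_path.items() if s > MAX_FILE_BYTES],
        "total_bytes": total,
        "entry_too_large": total > MAX_ENTRY_BYTES,
    }
```

A typical use would be `sizes = collect_sizes("my_dataset")`, followed by `unsafe_names(sizes)` and `size_problems(sizes)`. If the entry turns out to be too large, the script does not split anything for you; splitting into meaningful subunits is still a manual step.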

Instructions for Windows

On Windows operating systems, you should create ZIP archives by using the software tool 7-Zip.

Start by selecting your files and folders in Windows. Then right-click to open the context menu, select “7-Zip” and then “Add to archive …”. A dialog box opens as shown below. In the white field at the top, enter the name of your archive file. Choose the archive format “zip” and the compression level “store”.
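
If you prefer a script to the 7-Zip dialog, an uncompressed ZIP container can also be written with Python's standard `zipfile` module. This is a minimal sketch, not part of the official instructions; the function name `make_stored_zip` is hypothetical. `zipfile.ZIP_STORED` corresponds to the compression level “store”.

```python
import zipfile
from pathlib import Path


def make_stored_zip(folder, archive_path):
    """Pack all files under folder into an uncompressed ZIP archive.

    archive_path should lie outside folder, otherwise the archive
    would try to include itself. Empty folders are not written.
    """
    folder = Path(folder).resolve()
    with zipfile.ZipFile(archive_path, mode="w",
                         compression=zipfile.ZIP_STORED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Keep the top-level folder name inside the archive.
                arcname = path.relative_to(folder.parent).as_posix()
                archive.write(path, arcname=arcname)
    return archive_path
```

For example, `make_stored_zip("my_dataset", "my_dataset.zip")` writes every file with the prefix `my_dataset/`. No encryption or password is applied, in line with the packaging recommendations.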

Instructions for macOS

On a Macintosh computer, you should create tar archives.

You may create tar container files either by using the command line (tar -cvf <archive_name.tar> <folder_to_tar>) or by using the software Keka. If you choose the second option, start Keka and select “Compression” in “Preferences”. Then select “TAR” as the default format, as in the dialog box shown below. Finally, drag your folder onto the Keka icon and enter the name of your archive file. You should select the option "Exclude Mac resource forks (e.g. .DS_Store)" in the Keka settings. This prevents the inclusion of macOS-specific hidden files.
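
As a scripted alternative to the command line or Keka, an uncompressed tar container that leaves out macOS-specific hidden files can be created with Python's standard `tarfile` module. This is a sketch under assumptions: the function names are hypothetical, and the set of excluded names (`.DS_Store` and files starting with `._`) is an assumed approximation of what the Keka option excludes.

```python
import tarfile
from pathlib import Path, PurePosixPath

# Assumed set of macOS-specific hidden files to leave out.
MAC_HIDDEN = {".DS_Store"}


def skip_mac_files(member):
    """Filter for TarFile.add: drop .DS_Store and AppleDouble ('._*') files."""
    name = PurePosixPath(member.name).name
    if name in MAC_HIDDEN or name.startswith("._"):
        return None  # returning None excludes the member
    return member


def make_plain_tar(folder, archive_path):
    """Pack folder into an uncompressed tar archive (mode 'w', no gzip)."""
    folder = Path(folder).resolve()
    with tarfile.open(archive_path, mode="w") as archive:
        archive.add(folder, arcname=folder.name, filter=skip_mac_files)
    return archive_path
```

Calling `make_plain_tar("my_dataset", "my_dataset.tar")` produces a plain `.tar` file rather than a `.tar.gz`, which matches the recommendation to use uncompressed containers.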

How to upload your data in the Research Collection

  • If you would like to reserve a DOI before uploading your data, follow the instructions on Reserving a DOI
  • The Research Collection offers different access rights for research data items: Open Access, Closed Access, Embargoed Access, ETHZ Users only and Selected Users only. For more information, check our page Access rights
  • After you finish your submission your item is still in a review state: Metadata is visible but files are not accessible
  • Within a few working days, the data is reviewed by our team.
    • We check and supplement the entry's metadata
    • We identify the file formats used and add them to the metadata.
    • We inform users of issues with long term preservation (depending on the selected retention period)
    • We perform a cursory check for obvious legal issues (copyrighted material, software licenses) → Please note, however, that the responsibility for compliance with the terms of use and legal requirements lies with the submitting person.
    • There is no content review
    • After the review you will be informed of the finalisation of your item
    • The DOI will be registered overnight

  • Research output from cooperation projects can be deposited in the Research Collection independent of where it was produced, as long as an ETH group or institute takes over the responsibility for obtaining the publication rights from the data producers.
  • Please consider the following when uploading the data:
    • Field Organisational Unit: Data that was not produced at ETH must use the organisational unit of the ETH group or institute that was part of the cooperation.
    • Field ETH Publication: Data that was not produced at ETH Zurich must be marked with a "No".
    • Field Project/Grant: Please choose the ID/name of the cooperation project / grant, in order to prevent unnecessary enquiries from our staff about the origin of the data.
  • The ETH Library forwards access requests to the rights holder of a dataset, even if they are no longer affiliated with ETH Zurich, provided that the rights holder has an up-to-date ORCID record and their ORCID iD has been assigned to the corresponding author record in the Research Collection.
  • According to the Guidelines for Research Data Management at ETH Zurich, Research Group Leaders are responsible for "taking decisions about access rights for unpublished data as well as for group members leaving ETH Zurich". If the rights holder of a dataset does not have an up-to-date ORCID record and/or does not respond to requests, ETH Library therefore forwards access requests to the rights holder's former Research Group Leader.
  • In addition, you might want to consider assigning the task of a “data steward” to someone in your group. This person can approve access requests for data (even if the data producer has already left the group) and edit data that has not yet been published. If you would like to determine a data steward for your group, please contact research-collection@library.ethz.ch.

If you use openBIS to manage your research data, you can export data directly from openBIS to the Research Collection for publication. For more information, please refer to the openBIS documentation.
