
The following recommendations apply when files are uploaded manually using the web interface of the ETH Data Archive (

If this web upload does not meet your needs, please do not hesitate to contact us to discuss further options.

The first section of this document explains how you may prepare your files and folders to ensure long-term readability of your data.

For archiving large collections of heterogeneous research data sets over a limited time period, we currently recommend packing the data into container formats. The second section of this document explains how to create ZIP or tar containers and recommends suitable tools.

1. Data preparation

Data selection

We recommend selecting the data carefully, so that the archived data is of scientific relevance and worth archiving over the long term. Please remove unneeded data and avoid storing identical files in several places, for example ZIP files together with their unzipped contents, multiple backups, or temporary files. Private information does not belong in the ETH Data Archive.
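As an aid for spotting identical files before upload, the following is a minimal Python sketch that groups files by their SHA-256 checksum. It is not an official ETH Data Archive tool; the function names `file_digest` and `find_duplicates` are hypothetical and chosen only for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of paths below `root` whose contents are identical.

    Hypothetical helper: files are compared by checksum only.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    # Keep only checksums that occur more than once.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates("my_dataset")` returns a list of path groups, which can then be reviewed by hand. Note that this sketch compares file contents only; it does not look inside ZIP files, so a container stored next to its unpacked contents still has to be identified manually.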

Choose open formats

To allow for long-term readability of your files, prefer non-proprietary file formats that follow open and properly documented standards. If you plan to archive your data for more than 10 years, we recommend converting unusual file formats into more widely used formats. Please consult the fact sheet File Formats for Archiving for further information on this topic.

Avoid special characters

Avoid special characters in names of files and folders. These characters hamper compatibility because they lead to undesired effects that depend on the operating system.

Avoid the following characters:

• \ / ? : * " > < |

These characters are not allowed in Windows file names. If a folder is unpacked by WinZip, these characters are usually replaced by underscores.

• Non-ASCII characters, such as ¢ ™ ® ä ö ü à é ô and other characters with diacritics

If files with such names are packed with WinZip, they are moved to locations outside of their original folder due to a flaw in Linux.


The following ASCII characters are permitted:

!#$%&'()+,-.0123456789;=@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_` abcdefghijklmnopqrstuvwxyz{}~
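To check a folder against this list before upload, a short script can help. The following is a minimal Python sketch, assuming the permitted set is exactly the characters listed above (including the space); the names `PERMITTED`, `disallowed_chars` and `find_problem_names` are hypothetical and not part of any ETH Data Archive tooling.

```python
import os

# Permitted characters, copied from the list above (the space is included).
PERMITTED = set(
    "!#$%&'()+,-.0123456789;=@"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_` "
    "abcdefghijklmnopqrstuvwxyz{}~"
)


def disallowed_chars(name):
    """Return the sorted list of characters in `name` that are not permitted."""
    return sorted(set(name) - PERMITTED)


def find_problem_names(root):
    """Yield (path, bad_chars) for each file or folder name below `root`
    that contains at least one character outside the permitted set."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            bad = disallowed_chars(name)
            if bad:
                yield os.path.join(dirpath, name), bad
```

For example, `disallowed_chars("a:b?.txt")` reports the colon and the question mark, and `list(find_problem_names("my_dataset"))` lists every offending path together with the characters that need to be replaced. The name of the `root` folder itself is not checked.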
