A customer would like to convert image files into a format suitable for long-term archiving and upload them to the ETH Data Archive. The original pictures are in JPG format. A collaborator is writing software routines to convert a large number of files. The customer produced compressed PNG and TIF files and sent them to the ETH Data Archive as examples.
Solution proposed by ETH Data Archive
(Status February 2017 – the information will not be updated)
For pictures, the ETH Data Archive recommends the uncompressed Baseline TIFF format. TIFF is a better choice for long-term archiving than PNG (see Empfehlungen der KOST).
We prefer uncompressed pictures, provided the storage costs are justifiable. In general, we recommend uncompressed data because compression is an obstacle to long-term archiving.
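Whether the storage costs are justifiable can be estimated in advance, because the size of uncompressed pixel data follows directly from the image dimensions. The following is a rough sketch in Python; the function name, the image dimensions and the number of files are illustrative assumptions, and the small overhead of the TIFF header and tags is ignored.

```python
def uncompressed_size_bytes(width_px, height_px, samples_per_pixel=3, bits_per_sample=8):
    """Approximate size of the raw pixel data of one uncompressed image."""
    return width_px * height_px * samples_per_pixel * bits_per_sample // 8


# Hypothetical batch: 10,000 photos of 4000 x 3000 pixels, 8-bit RGB.
per_image = uncompressed_size_bytes(4000, 3000)
total_gb = 10_000 * per_image / 1e9
print(f"{per_image / 1e6:.0f} MB per image, {total_gb:.0f} GB in total")
# prints "36 MB per image, 360 GB in total"
```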
We tested the example files with three different validators. According to the validators KOST-Val and JHOVE, the file is an error-free TIFF 4.0 file. KOST-Val is known to be the more sensitive validator: its results show that the KOST criteria for TIFF are fulfilled. Furthermore, when zooming in, the visual quality is identical to that of the original.
Our third validator, DPF Manager, has the advantage of giving detailed and comprehensible information on errors (which is often not the case with JHOVE). DPF Manager provides the following information:
The two red “error messages” indicate that the file violates the specifications of the Baseline TIFF format. (The following three information messages are less important.)
Based on this result from DPF Manager, the customer produced a new file in which the resolution tag was manually set to 300 dpi. DPF Manager then accepted the file as correct Baseline TIFF. These uncompressed Baseline TIFF files are well suited for long-term archiving.
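A fix of this kind can also be checked programmatically before a large batch is submitted. The following is a minimal sketch that uses only the Python standard library: it reads the first image file directory (IFD) of a TIFF file and reports whether the XResolution and YResolution tags are present and whether the image data is stored uncompressed. It assumes that the errors reported by DPF Manager concerned the resolution tags, as the fix described above suggests. The function names are illustrative, and the sketch is not a substitute for a full validator such as DPF Manager, KOST-Val or JHOVE.

```python
import struct

# Tag numbers as defined in the TIFF 6.0 specification.
COMPRESSION, X_RESOLUTION, Y_RESOLUTION = 259, 282, 283


def read_first_ifd(data):
    """Return (endian, {tag: raw 4-byte value field}) for the first IFD."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file: bad magic number")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(n_entries):
        # Each IFD entry is 12 bytes: tag, field type, count, value/offset.
        entry = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack_from(endian + "H", data, entry)
        tags[tag] = data[entry + 8 : entry + 12]
    return endian, tags


def find_problems(data):
    """List missing resolution tags and any use of compression."""
    endian, tags = read_first_ifd(data)
    problems = []
    for tag, name in ((X_RESOLUTION, "XResolution"), (Y_RESOLUTION, "YResolution")):
        if tag not in tags:
            problems.append(f"missing {name} tag")
    if COMPRESSION in tags:
        # Compression is a SHORT stored in the first two bytes of the value field;
        # the value 1 means "no compression".
        (compression,) = struct.unpack(endian + "H", tags[COMPRESSION][:2])
        if compression != 1:
            problems.append(f"image data is compressed (Compression={compression})")
    return problems
```

To check a file on disk, read it in binary mode and pass the bytes to find_problems; an empty list means that none of the problems checked here were found.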
Why did DPF Manager report an error when the other two validators did not? DPF Manager tested the TIFF file as if it were a Baseline TIFF 6.0 file, although it was a TIFF 4.0 file. Is it really justified to test a TIFF 4.0 file against the Baseline requirements of TIFF 6.0? I think the answer is "yes", as TIFF 4.0 is already somewhat out of date. The Baseline TIFF 6.0 standard was created later in response to problems caused by the diversity of TIFF files (see https://en.wikipedia.org/wiki/TIFF#Part_1:_Baseline_TIFF).