1. Convert to PDF or to PDF/A-1b?

Word, OneNote, or PowerPoint files are often converted to PDF for delivery to digital archives. For archiving, we recommend not ordinary PDF but the conformance level PDF/A-1b (defined in an ISO standard), because it is better suited to long-term preservation. You can recognize a PDF/A file by the blue bar at the top of the Adobe Acrobat document view (see figure below).

Figure: Adobe Acrobat displays a blue bar at the top of a PDF/A file.
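If Adobe Acrobat is not at hand, the PDF/A marker can also be looked up in the file itself. The following is a minimal sketch, not a validator: it assumes the file carries the standard PDF/A identification entries (`pdfaid:part` and `pdfaid:conformance`) in uncompressed XMP metadata, and the function names are hypothetical. It only reports what the file claims; whether the file actually complies with PDF/A-1b still has to be checked with a dedicated validation tool.

```python
import re
from pathlib import Path

# The PDF/A identification entries can be serialised either as XML attributes
# (pdfaid:part="1") or as XML elements (<pdfaid:part>1</pdfaid:part>).
_PART = re.compile(rb'pdfaid:part\s*(?:=\s*["\']|>)\s*(\d)')
_CONFORMANCE = re.compile(rb'pdfaid:conformance\s*(?:=\s*["\']|>)\s*([A-Za-z])')


def claimed_pdfa_level(data: bytes):
    """Return the claimed level, e.g. 'PDF/A-1b', or None if no claim is found."""
    part = _PART.search(data)
    conformance = _CONFORMANCE.search(data)
    if part is None or conformance is None:
        return None
    level = conformance.group(1).decode("ascii").lower()
    return f"PDF/A-{part.group(1).decode('ascii')}{level}"


def claimed_pdfa_level_of_file(path):
    """Convenience wrapper (hypothetical helper) that reads the file from disk."""
    return claimed_pdfa_level(Path(path).read_bytes())
```

A call such as `claimed_pdfa_level_of_file("thesis.pdf")` (the file name is only an example) returns `None` for an ordinary PDF without the marker.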

Converting to ordinary PDF instead of PDF/A-1b is less sustainable, but it sometimes avoids certain conversion errors (especially errors in formulas, special fonts, text spelling, comments, transparent objects, vector graphics, and multiple character layers).

2. OneNote via Word to PDF

When exporting directly from MS OneNote to PDF/A-1b or PDF, hyperlinks that underlie the text (links attached to words rather than written out as URLs) are lost. To prevent this, first export the OneNote sections to MS Word (limit the OneNote entries to the page width beforehand). You can then convert the MS Word file to PDF or PDF/A-1b as described below.

3. How to select the conversion method

There are four methods for converting MS Word or PowerPoint to PDF/A-1b or PDF. They differ in computing time and PDF quality. For large and complex documents, a compromise must be made between the quality of the generated PDF and computing time (Table 1).

Table 1: Properties of the four conversion methods to PDF/A-1b or PDF (see appendix for more details)

| | Underlaid hyperlinks are lost | Underlaid hyperlinks are retained |
|---|---|---|
| Rapid conversion | PDF Printer* | Save as PDF* |
| Slow conversion | | PDFMaker, Adobe Acrobat Pro |

* References to footnotes no longer react to mouse clicks, but remain readable.
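The choice described by Table 1 can be written down as a small lookup. This is only an illustrative sketch: the dictionary restates the table, and the function name and its two parameters are hypothetical. It does not account for which software is installed or for the quality differences reported in the appendix.

```python
from typing import Dict, List, Tuple

# Table 1 restated as (hyperlinks retained?, rapid conversion?) -> methods.
TABLE_1: Dict[Tuple[bool, bool], List[str]] = {
    (False, True): ["PDF Printer"],
    (True, True): ["Save as PDF"],
    (False, False): [],
    (True, False): ["PDFMaker", "Adobe Acrobat Pro"],
}


def candidate_methods(keep_hyperlinks: bool, need_rapid: bool) -> List[str]:
    """List the methods from Table 1 that satisfy both requirements."""
    result = []
    for (retains_links, rapid), methods in TABLE_1.items():
        if keep_hyperlinks and not retains_links:
            continue  # this cell loses underlaid hyperlinks
        if need_rapid and not rapid:
            continue  # this cell is too slow
        result.extend(methods)
    return result
```

For example, `candidate_methods(keep_hyperlinks=True, need_rapid=True)` leaves only "Save as PDF", which matches the upper right cell of the table.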

4. Description of the conversion methods

We describe here four methods for creating PDF/A-1b (or PDF) files from MS Word or MS PowerPoint files. Only the "Save as PDF" method requires no paid software (the free Adobe Reader is sufficient). The other three methods require licensed Adobe Acrobat products.[1]

You should visually check the quality of the generated PDF file. Pay particular attention to formulas, hyperlinks, special characters, special fonts, text spelling, selecting and searching text, comments, tables, colors, transparent objects, vector graphics, and multiple layers in figures.

PDF Printer

For large scientific publications, we recommend writing out the URLs of hyperlinks explicitly in the text and using the Adobe PDF printer (which uses Acrobat Distiller). In MS Word or MS PowerPoint, select "Print" under the "File" menu. Under "Printer", select "Adobe PDF". (If the "Adobe PDF" entry is missing, your computer probably has only Adobe Reader installed.) To create a PDF/A-1b file, select "Printer properties"; in the tab "Adobe PDF Settings", open "Default Settings" and choose the drop-down item "PDF/A-1b:2005 (RGB)". Select "Print" to create the PDF file.[2]
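Because the PDF printer drops underlaid hyperlinks, it helps to know which link targets a document contains before writing them out in the text. The following sketch lists them for a Word file in .docx format using only the Python standard library. It assumes the links are stored as external relationships of the main document part; links in footnotes or links created via field codes are not covered, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# A .docx file is a ZIP archive; hyperlink targets of the main document
# part are recorded in this relationships file.
RELS_PATH = "word/_rels/document.xml.rels"
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def external_hyperlinks(docx_path):
    """Return the external hyperlink targets of the main document part."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read(RELS_PATH))
    return [
        rel.get("Target")
        for rel in root.iter(RELS_NS + "Relationship")
        if rel.get("Type") == HYPERLINK_TYPE and rel.get("TargetMode") == "External"
    ]
```

Each returned URL is a candidate for being spelled out in the text, so that it remains readable in the printed PDF.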

Save as PDF

Word and PowerPoint can save a file directly in PDF format. To do this, select "Save as" from the "File" menu (not "Save as Adobe PDF"), then select "PDF (*.pdf)" as the file type. Under "Options...", select "ISO 19005-1 compliant (PDF/A)" to create a PDF/A-1b file (ISO 19005-1 is PDF/A-1; the created file therefore fulfills at least PDF/A-1b).[3]
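When many Word files have to be converted, the same export can in principle be scripted instead of clicked. The sketch below is an assumption-laden illustration, not part of the tested procedure: it presumes Windows, an installed MS Word, and the third-party pywin32 package, and it calls Word's `ExportAsFixedFormat` method with the `UseISO19005_1` option, which corresponds to the PDF/A checkbox under "Options...". The function names are hypothetical, and the result should be checked visually just like a manually created file.

```python
from pathlib import Path

WD_EXPORT_FORMAT_PDF = 17  # numeric value of Word's wdExportFormatPDF constant


def export_arguments(source, pdfa=True):
    """Build the keyword arguments passed to Word's ExportAsFixedFormat."""
    source = Path(source).resolve()  # Word expects absolute paths
    return {
        "OutputFileName": str(source.with_suffix(".pdf")),
        "ExportFormat": WD_EXPORT_FORMAT_PDF,
        "UseISO19005_1": pdfa,  # True requests PDF/A output
    }


def save_as_pdf(source, pdfa=True):
    """Open the document in Word and export it next to the original."""
    import win32com.client  # third-party pywin32 package, Windows only

    word = win32com.client.Dispatch("Word.Application")
    try:
        document = word.Documents.Open(str(Path(source).resolve()))
        try:
            document.ExportAsFixedFormat(**export_arguments(source, pdfa))
        finally:
            document.Close(False)  # close without saving changes
    finally:
        word.Quit()
```

Keeping the argument construction in a separate function makes it possible to inspect the settings without starting Word.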

PDFMaker

A Word or PowerPoint file can be saved as PDF/A-1b using Acrobat PDFMaker. Open the document in Word or PowerPoint and select the "Acrobat" tab. To convert to PDF/A-1b, select "Conversion settings" and then PDF/A-1b (this needs to be done only once per installation). To create the PDF file, select either "Create PDF" in the "Acrobat" tab or "Save as Adobe PDF" in the "File" tab.[4]

Adobe Acrobat Pro

Adobe Acrobat Pro or Adobe Acrobat Pro DC can also be used for conversion. Start Adobe Acrobat Pro and open your Word or PowerPoint file from within it. In the "File" menu, select "Save as …" and choose the desired file type, PDF or PDF/A. To convert to PDF/A-1b, select "More options" and then "PDF/A-1b" with "sRGB".[5]

Appendix: Our tests of the four conversion methods

To test the advantages and disadvantages of these four methods, five Word files and five PowerPoint files were converted to PDF/A-1b (MS Office Professional Plus 2010, Version 14.0). The test files were elaborate scientific publications with figures, equations, and special characters. The resulting PDF/A-1b files were evaluated against the criteria listed in Table 2.

The preservation of tags and metadata was not assessed, nor was the preservation of the notes below PowerPoint slides.

For large scientific papers without underlaid hyperlinks, we recommend the PDF printer method. With this method, visual inspection found no differences from the originals, and all generated files complied with the PDF/A-1b standard. In addition, this method produced the smallest files, and the computing time for these documents (up to 40 pages) did not exceed a few seconds (see Table 2).

Table 2: Results for the four conversion methods

| Criteria | PDF printer | Save as PDF | Acrobat PDFMaker | Adobe Acrobat Pro |
|---|---|---|---|---|
| Visual inspection | No errors | In 4 out of 10 files, some figures were not acceptable because transparent objects had become non-transparent. | One file could not be reliably converted. | No errors, but image quality was partially insufficient. Furthermore, some text could be neither searched nor copied. |
| Links and references | Broken hyperlinks. Endnote links are readable but do not react to mouse clicks. | Endnote links are readable but do not react to mouse clicks. | No errors | No errors |
| Compliance with PDF/A-1b standard | No errors | One of the 10 files was not compliant with the PDF/A-1b standard. | Four of the 10 files were not compliant with the PDF/A-1b standard. | No errors |
| Size of the PDF/A-1b files | Up to twice as large as original | Up to six times larger than original | Up to five times larger than original | Up to four times larger than original |
| Computing time (Intel Core i7, 3.4 GHz) | Up to several seconds | Up to several seconds | Up to several minutes | Up to several minutes |
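The file size criterion in the table above is easy to reproduce for your own documents. The sketch below simply compares the sizes of the original and the generated file on disk; the function names and the idea of a fixed threshold are illustrative assumptions, not part of the tests reported here.

```python
import os


def size_ratio(original_path, pdf_path):
    """Return the size of the PDF relative to the original (2.0 = twice as large)."""
    original_size = os.path.getsize(original_path)
    if original_size == 0:
        raise ValueError("original file is empty")
    return os.path.getsize(pdf_path) / original_size


def is_larger_than(original_path, pdf_path, max_ratio):
    """Check whether the PDF exceeds a chosen multiple of the original size."""
    return size_ratio(original_path, pdf_path) > max_ratio
```

A threshold such as `max_ratio=2.0` would flag files that grew more than the PDF printer results reported in the table.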



[1] Members of ETH Zurich can order Adobe Acrobat Pro DC from https://idesnx.ethz.ch (please contact your IT support). See also Adobe Acrobat XI: Product comparison, https://acrobat.adobe.com/ch/de/acrobat/pricing/compare-versions.html (access date May 28, 2020).

[2] The following website gives further instructions (access date May 28, 2020): http://blogs.adobe.com/acrolaw/2007/01/pdfa_in_action. Acrobat Distiller converts from PostScript (*.ps) to PDF. You can also convert to PDF or PDF/A-1b with Distiller using the free OpenOffice 3 for Windows (Oracle).

[3] The following website gives further instructions (access date May 28, 2020): https://support.office.com/en-us/article/Save-as-PDF-d85416c5-7d77-4fd6-a216-6f4bf7c7c110#bm11

[4] The following website gives further instructions (access date May 28, 2020): http://helpx.adobe.com/acrobat/using/creating-pdfs-pdfmaker-windows.html

[5] The following website gives further instructions (access date May 28, 2020): http://blogs.adobe.com/acrolaw/2011/05/using-save-as-to-to-conform-to-pdfa