Save the data

16 May 2018 By Dietrich von Seggern


In my opinion, the ability to attach files to a PDF ranks very high in the (long) list of PDF features that are powerful but not widely used. The use cases I have in mind go beyond attaching files the way you would with an email. Of course that is also possible, but the true power of this feature unfolds when standardized ways of attaching files come into play, because these allow complex chunks of information to be exchanged interoperably. Two examples have been developed in standards-loving Germany in order to overcome the dilemma that has been summed up as "PDF is where data dies".

Is PDF where data dies?

I am talking about "ZUGFeRD" for hybrid electronic invoices and "Drawing-free Product Documentation" from the German association of car manufacturers. In ZUGFeRD, the human-readable invoice PDF is associated with a machine-readable XML structure, which in turn is based on a UN/CEFACT standard for XML business transactions. In "Drawing-free Product Documentation", the PDF comes with a 3D model in JT (ISO 14306) plus some further information in an XML structure.

What do these standards have in common? Both build on PDF/A-3 because both are interested in long-term preservation and reliability. In the interoperability context of this blog they could just as well build on plain PDF 2.0, since that has incorporated the "associated files" feature that was defined in PDF/A-3. Both take advantage of the fact that PDF viewers are available on almost every device, so the complex data structures they define can always be accessed, at least in "some" way. PDF viewers let users export attachments, and the compound document provides structured data plus context as defined in the PDF container. Finally, both standards are in fact a combination of two base standards: PDF for the container and a second standard for the embedded data. That ensures that more powerful applications can be developed that specialize in processing the payload of the PDF. How powerful that can be is shown by the fact that in the case of "Drawing-free Product Documentation" this payload consists of more than just one file (JT and XML).
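To make the idea of "processing the payload" a little more concrete, here is a minimal Python sketch of what such an application might do once the attachments have been exported from the container PDF. The attachment names, the XML structure and the helper `find_xml_payload` are all hypothetical and only serve as illustration; real ZUGFeRD or JT payloads are considerably richer, and a PDF library would be needed to read the embedded files in the first place.

```python
import xml.etree.ElementTree as ET
from typing import Mapping, Optional


def find_xml_payload(attachments: Mapping[str, bytes]) -> Optional[ET.Element]:
    """Return the parsed root of the first attachment whose name ends in .xml.

    `attachments` maps file names to raw bytes, as they might look after
    exporting the embedded files of a compound PDF (assumed input format).
    """
    for name, data in attachments.items():
        if name.lower().endswith(".xml"):
            return ET.fromstring(data)
    return None


# Hypothetical payload: a binary 3D model plus a small XML structure.
example = {
    "model.jt": b"\x00binary 3D data",
    "invoice.xml": b"<Invoice><Total currency='EUR'>119.00</Total></Invoice>",
}

root = find_xml_payload(example)
total = root.find("Total")
print(total.text, total.get("currency"))
```

The point of the sketch is that the machine-readable part can be handled by ordinary XML tooling, while the PDF itself remains the human-readable view of the same information.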

Briefly imagine the difference between receiving a compound document in which the structured data is embedded and receiving the same content as two separate files, e.g. attached to an email. It becomes obvious that the compound document is much easier to process, because the context can always be reproduced by looking at the container PDF. So the compound document is much more useful than the "sum of its ingredients". And that is why we at callas software have always put whatever features we could think of into our products, mainly pdfaPilot, to make creating and processing PDF file attachments as easy and straightforward as possible.

