
First set of candidate veraPDF corpus files delivered

Duff Johnson // May 18, 2015


As the veraPDF project gets under way, it is generating its first test files for PDF/A-1, complementing the Isartor test suite.

Dual Lab, the veraPDF consortium’s lead developer, has uploaded the first set of 49 candidate test files to the public veraPDF GitHub repository.

The test files can be found in the veraPDF corpus for PDF/A-1b (under development), along with the wiki page describing the set.

All test files follow the pattern of the Isartor Test Suite:

  • their naming convention refers to the corresponding subsection in ISO 19005-1
  • they are all atomic
  • they are self-documented via PDF bookmarks

However, unlike Isartor, these files also contain “pass” tests.
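To illustrate the naming convention, here is a minimal Python sketch that splits a corpus file name into its parts. The pattern is inferred only from the one file name quoted in this post (6-1-12-t07-fail-a), so it is an assumption and not an official grammar. Reading "t07" as a test number and the trailing "a" as a variant letter is likewise a guess, and the names `CorpusName` and `parse_corpus_name` are hypothetical.

```python
import re
from typing import NamedTuple


class CorpusName(NamedTuple):
    section: str   # ISO 19005-1 subsection, e.g. "6.1.12"
    test: int      # assumed: test number within that subsection
    expected: str  # "pass" or "fail"
    variant: str   # assumed: trailing variant letter, e.g. "a"


# Assumed pattern: <section digits joined by "-">-t<number>-<pass|fail>-<letter>
_NAME = re.compile(r"^(\d+(?:-\d+)*)-t(\d+)-(pass|fail)-([a-z])$")


def parse_corpus_name(stem: str) -> CorpusName:
    """Split a file name (without extension) into its assumed components."""
    match = _NAME.match(stem)
    if match is None:
        raise ValueError(f"unrecognised corpus file name: {stem!r}")
    section, test, expected, variant = match.groups()
    # "6-1-12" in the file name corresponds to subsection 6.1.12
    return CorpusName(section.replace("-", "."), int(test), expected, variant)


print(parse_corpus_name("6-1-12-t07-fail-a"))
```

A parser like this could be used to group the files by subsection or to separate the "pass" tests from the "fail" tests, provided the real corpus follows the assumed pattern.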

One file is particularly noteworthy:

6-1-12-t07-fail-a: The maximum number of indirect objects (8,388,607) in a PDF file is exceeded (the file is about 40 MB zipped)

[Screenshot: File Being Repaired dialog]

The document cross-reference table contains more than the maximum allowed number of records, violating PDF/A-1 implementation limits.
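As a rough illustration of this limit, the sketch below compares the /Size value from a PDF trailer against 8,388,607, which equals 2^23 - 1. It is not how veraPDF or Acrobat performs the check. It assumes a classic trailer dictionary, naively takes the last "/Size" occurrence in the data, and treats the raw /Size value (the number of cross-reference entries) as the count to compare. The function names are hypothetical.

```python
import re

# Limit quoted in the post: 8,388,607 indirect objects, i.e. 2**23 - 1.
MAX_INDIRECT_OBJECTS = 2**23 - 1

# Naive assumption: the last "/Size <number>" in the data is the trailer entry.
_SIZE = re.compile(rb"/Size\s+(\d+)")


def last_trailer_size(pdf_bytes: bytes) -> int:
    """Return the value of the last /Size entry found in the data."""
    matches = _SIZE.findall(pdf_bytes)
    if not matches:
        raise ValueError("no /Size entry found")
    return int(matches[-1])


def exceeds_object_limit(size: int) -> bool:
    """True if the given count is above the quoted limit.

    Assumption: the raw /Size value is compared directly; a real
    validator may count indirect objects differently.
    """
    return size > MAX_INDIRECT_OBJECTS


# Synthetic trailer for demonstration only, not taken from the corpus file.
sample = b"trailer\n<< /Size 8388609 /Root 1 0 R >>\nstartxref\n0\n%%EOF"
print(exceeds_object_limit(last_trailer_size(sample)))
```

For a file as large as the one described, it would be more sensible to read only the tail of the file than to load all of it into memory.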

Warning: Be careful when trying to validate this file in Adobe Acrobat! It will probably open after 30 seconds of thrashing, but it will hang on preflight checks.


ABOUT THE AUTHORS

Duff Johnson

Duff is ISO Project co-Leader for ISO 32000 (the PDF specification) and Project Leader for ISO 14289 (PDF/UA). He also serves as Executive Director of the PDF Association.

© 2019 Association for Digital Document Standards e.V.