Quick Check configuration syntax
Quick Check is configured with one or more filter expressions. Each filter expression has a certain specificity and either includes or excludes the respective data substructure or data field. Filter expressions are interpreted against the structure of the possible Quick Check output (for details on the syntax of the Quick Check output, see the next articles). Filtering works for the following areas:
- "aggregated"
- "direct"
- "status" (this data structure will always be written and cannot be turned off)
The next 3 articles explain each of these 3 data structures.
Example: Configuration syntax
The filter expressions are based on the 'path' to a given data (sub)structure or data element in the Quick Check output structure. Assuming the following output excerpt:
"aggregated": {
    "pages": {
        "page" : [
            {
                "geometry" : {
                    "TrimBox" : {
                        "left" : 0,
                        "bottom" : 0,
                        "right" : 612,
                        "top" : 792,
                        "width" : 612,
                        "height" : 792,
                        "width_eff" : 612,
                        "height_eff" : 792
                    }
                }
            },
            {
                "geometry" : {
                    "TrimBox" : {
                        "left" : 0,
                        "bottom" : 0,
                        "right" : 612,
                        "top" : 792,
                        "width" : 612,
                        "height" : 792,
                        "width_eff" : 612,
                        "height_eff" : 792
                    }
                }
            }
        ]
    }
}
and assuming that only the effective width (width_eff) and effective height (height_eff) data elements of the TrimBox of each page are of interest, the following filter expressions would be used:
$.direct: false
$.aggregated: false
$.aggregated.pages.page.geometry.TrimBox.width_eff : true
$.aggregated.pages.page.geometry.TrimBox.height_eff : true
- The $ element serves as a kind of virtual root element, to which the path to the data structure or element of interest is appended.
- To avoid generating massive amounts of output, it is recommended to turn off the topmost level of the two main areas completely, using $.direct: false and $.aggregated: false.
- Next, filter expressions are added that target exactly the data substructures or elements to be included in the output; in this example just the two entries inside the TrimBox data structure of each page.
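The interplay of a broad exclude and narrower includes can be modelled in a few lines of Python. This is only an illustrative sketch, not how pdfToolbox is implemented: the function names are hypothetical, the "most specific (longest) matching path wins" rule and the include-by-default behaviour are assumptions derived from the description above, and escaped characters, page indexes and the always-written "status" area are not handled.

```python
from typing import Any

Rules = dict[tuple[str, ...], bool]


def parse_filters(text: str) -> Rules:
    """Parse lines such as '$.aggregated : false' into {path segments: flag}.

    Simplification: escaped dots and colons inside identifiers are not handled.
    """
    rules: Rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, flag = line.rpartition(":")
        segments = tuple(path.strip().split(".")[1:])  # drop the virtual root '$'
        rules[segments] = flag.strip().lower() == "true"
    return rules


def decide(path: tuple[str, ...], rules: Rules, default: bool = True) -> bool:
    """Return the flag of the longest rule path that is a prefix of `path`.

    Assumption: data not covered by any rule is included (default=True).
    """
    best_len, result = -1, default
    for rule_path, flag in rules.items():
        if path[: len(rule_path)] == rule_path and len(rule_path) > best_len:
            best_len, result = len(rule_path), flag
    return result


def apply_filters(node: Any, rules: Rules, path: tuple[str, ...] = ()) -> Any:
    """Return a filtered copy of `node`, or None if nothing below it is kept."""
    if isinstance(node, list):
        # List items (e.g. pages) share the path of the key that holds the list.
        kept = [apply_filters(item, rules, path) for item in node]
        return [item for item in kept if item is not None]
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            child = apply_filters(value, rules, path + (key,))
            if child is not None and child != [] and child != {}:
                out[key] = child
        return out or None
    # Leaf value: keep it only if the most specific rule says so.
    return node if decide(path, rules) else None


FILTERS = """
$.direct: false
$.aggregated: false
$.aggregated.pages.page.geometry.TrimBox.width_eff : true
$.aggregated.pages.page.geometry.TrimBox.height_eff : true
"""

sample = {
    "direct": {"Info": {"Title": "example"}},
    "aggregated": {"pages": {"page": [
        {"geometry": {"TrimBox": {"left": 0, "width": 612,
                                  "width_eff": 612, "height_eff": 792}}},
    ]}},
}

result = apply_filters(sample, parse_filters(FILTERS))
print(result)
```

With the four expressions from the example, only width_eff and height_eff survive for each page, and the "direct" area disappears entirely.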
Filtering for all pages, a specific page or consolidated data from all pages
The Quick Check filter expressions make it possible to request page-related data for a specific page (or range of pages), for all pages, or in a consolidated form that aggregates information from all pages. The page index is one-based, so a particular page can be addressed using the usual array notation, with the first page at index 1. The following expressions illustrate this:
$.aggregated.resources.color.spotcolors: true
Output: summary of all spotcolors used in the document.
$.aggregated.pages.page.resources.color.spotcolors: true
Output: all spotcolors per page for the whole document.
$.aggregated.pages.page[2].resources.color.spotcolors: true
Output: only the spotcolors used on page 2.
$.aggregated.pages.page[-1].resources.color.spotcolors: true
Output: only the spotcolors used on the last page.
$.aggregated.pages.page[2 4].resources.color.spotcolors: true
Output: all spotcolors per page for page 2, 3 and 4.
For all filter expressions that start with $.aggregated.pages.page a page index can be used. For all other filter expressions the data is summarised. In the examples above the page indexes are [2], [-1] and [2 4].
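To make the index forms concrete, the following Python sketch resolves the text between the square brackets to one-based page numbers. It is a hypothetical helper, not part of pdfToolbox: the range is read as two whitespace-separated numbers exactly as it appears in the examples above, and treating negative values other than -1 as "counted from the end" is an assumption.

```python
def resolve_page_index(index: str, page_count: int) -> list[int]:
    """Return the one-based page numbers selected by a page index.

    `index` is the text between '[' and ']', e.g. "2", "-1" or "2 4".
    """

    def absolute(value: int) -> int:
        # Negative values count from the end: -1 is the last page.
        page = page_count + 1 + value if value < 0 else value
        if not 1 <= page <= page_count:
            raise ValueError(f"page index {value} is outside 1..{page_count}")
        return page

    parts = [int(part) for part in index.split()]
    if len(parts) == 1:
        return [absolute(parts[0])]
    if len(parts) == 2:
        # Two numbers: first and last page of an inclusive range.
        first, last = absolute(parts[0]), absolute(parts[1])
        return list(range(first, last + 1))
    raise ValueError(f"unsupported page index: {index!r}")


print(resolve_page_index("2", 5))
print(resolve_page_index("-1", 5))
print(resolve_page_index("2 4", 5))
```

Because the index is one-based, 0 is rejected rather than mapped to the first page.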
For $.aggregated.pages.page the following substructures are available:
- geometry: various data elements based on page geometry boxes
- info: data elements for sequential page number (starting at 1 for first page) and page label
- resources: data structures and elements for color and font resources
- annotations: data elements regarding annotations and their properties
- contentstream: data elements based on the size of the content stream (bytes)
- pieceinfo: data elements regarding metadata for application-specific data
Set a timeout and reflect it in the Quick Check (JSON) output
A document with millions of pages is not unusual in production print, which is a typical environment for QuickCheck. A timeout can be set within the QuickCheck configuration with the following parameters:
$.settings.timeout_is_error: true
$.settings.timeout: *number of seconds*
- $.settings.timeout_is_error: A timeout will appear as an error in the 'status'; default = false
- $.settings.timeout: Initiates a timeout after ... seconds; default = -1 (off)
Example:
For the following QuickCheck call:
/pdfToolbox --quickcheck /Downloads/Quickcheck-timeout.cfg /Downloads/Large.PDF
and the following configuration 'Quickcheck-timeout.cfg':
$.direct: false
$.direct.Info: true
$.aggregated: true
$.settings.timeout_is_error: true
$.settings.timeout: 1
$.settings.status: true
The QuickCheck status looks like this:
..omitted
},
"status": {
    "time_needed_sec" : 1.011101,
    "result" : "incomplete",
    "number_of_pages" : 984,
    "abort_at_page" : 515,
    "level" : "errors",
    "returncode" : 113,
    "errors" : [ { "code" : 113, "msg" : "Conversion did not finish during user specified timeout." } ]
}
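A calling script may want to detect such a timed-out run from the JSON output. The following Python sketch shows one possible way to do that. The function name is hypothetical, the status object is copied from the example above, and only the keys shown there are used; which other values "result" can take is not covered here.

```python
import json

# Status object taken from the example output above.
STATUS_JSON = """
{
  "status": {
    "time_needed_sec": 1.011101,
    "result": "incomplete",
    "number_of_pages": 984,
    "abort_at_page": 515,
    "level": "errors",
    "returncode": 113,
    "errors": [
      {"code": 113, "msg": "Conversion did not finish during user specified timeout."}
    ]
  }
}
"""


def describe_status(report: dict) -> str:
    """Summarise the 'status' structure of a Quick Check report in one line."""
    status = report["status"]
    if status.get("result") == "incomplete":
        return (
            f'incomplete: stopped at page {status["abort_at_page"]} '
            f'of {status["number_of_pages"]} '
            f'after {status["time_needed_sec"]:.1f} s '
            f'(return code {status["returncode"]})'
        )
    return f'result: {status.get("result")}'


report = json.loads(STATUS_JSON)
print(describe_status(report))
for error in report["status"].get("errors", []):
    print(error["code"], error["msg"])
```

Comparing abort_at_page with number_of_pages shows how much of the document was analysed before the timeout took effect.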
Hints and tricks
If a dot (.) or colon (:) occurs in a filter path identifier, this character must be escaped with a preceding backslash (\), e.g.:
- $.direct.Root.Private.Test\:2\:Colon : true
- $.direct.Root.Private.Test\.2\.Points : true
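When filter expressions are generated by a script, the escaping can be applied automatically. The Python sketch below uses two hypothetical helpers; it escapes only dots and colons, as described above, and assumes no other characters need special treatment.

```python
import re


def escape_identifier(name: str) -> str:
    """Prefix every dot and colon in a single path identifier with a backslash."""
    return re.sub(r"([.:])", r"\\\1", name)


def build_filter(identifiers: list[str], include: bool) -> str:
    """Join identifiers below the virtual root '$' into one filter expression."""
    path = ".".join(["$"] + [escape_identifier(name) for name in identifiers])
    return f"{path} : {'true' if include else 'false'}"


print(build_filter(["direct", "Root", "Private", "Test:2:Colon"], True))
print(build_filter(["direct", "Root", "Private", "Test.2.Points"], True))
```

The two calls reproduce the two expressions listed above.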