pdfChip specific JavaScript

In its early days, JavaScript inside HTML content was mostly used to create effects. Over time it became a full-fledged programming language, even supporting object-oriented programming. Today's rich interactive websites are unthinkable without JavaScript. Driven by the interest in making websites more engaging and interactive, the developers behind the JavaScript engine in WebKit have invested a lot of effort in making it fast.

This can be taken advantage of in callas pdfChip. Whether information has to be retrieved from a web service, or a decision about the content to be encoded has to be made on the basis of some other data source, it can be done, and it can be done very efficiently. This chapter describes the JavaScript functionality that pdfChip adds and how you can take advantage of it.

"Normal" HTML JavaScript

Because pdfChip is based on the WebKit engine (currently supporting ECMAScript version 5), it fully supports JavaScript, including advanced features. Almost anything that works in a normal browser will also work during a conversion with pdfChip. The exception is functionality offered by the browser itself (such as the "Window" object), which won't work in pdfChip because no such object exists during a conversion.

The following popular JavaScript libraries have been tested with pdfChip. This doesn't mean that you are limited to those; the list simply shows some of the possibilities available to you.

  • jQuery: a small, lightweight and versatile JavaScript library that is mainly interesting in a pdfChip context for its HTML DOM traversal and manipulation API.
  • MathJax: a very complete and easy-to-use JavaScript library for rendering formulas in MathML.
  • Hyphenator: a hyphenation library that can compensate for the lack of (good) hyphenation in standard CSS.
  • Polyfill libraries: JavaScript libraries that implement specific CSS features that browsers support poorly or not at all. Many such polyfill libraries exist to plug holes in WebKit's support for specific advanced CSS features.

Modifying the print loop

The purpose of pdfChip is to convert HTML into good PDF. Many use cases need to modify the given HTML template and alter the appearance of a single page or of multiple pages throughout the generated PDF document. To support this, pdfChip implements a number of custom JavaScript functions and objects, which are introduced in this section. Full information about these functions and objects is available in the following sections.

Use in one-pass conversions

pdfChip looks for a cchipPrintLoop function in your template and provides the cchip.printPages function, which together let you take full control over how and when pages are output. This lets you modify (for example) a single-page HTML template and output as many pages as you want:

function cchipPrintLoop() {
    for (var theIndex = 0; theIndex < 10; theIndex++) {
        $('#test').text('penguins');
        cchip.printPages();
    }
}

As soon as your HTML template includes a JavaScript file that defines the above cchipPrintLoop function, pdfChip will automatically execute it for you. This simple example function iterates 10 times; each time it modifies a paragraph using a jQuery statement and then uses cchip.printPages to convert the HTML template, as it is at that point in time, to PDF pages.

When using cchipPrintLoop in this fashion, you still only end up with one output PDF file, even if you call printPages multiple times. pdfChip always appends the output from printPages to the same (single) output PDF file.
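The same pattern can be driven by data instead of a fixed counter. The following is a minimal sketch, not a definitive implementation: the recipients array, the printOncePerRecord helper and the greeting text are hypothetical and only serve as illustration, while the element with the id "test" is assumed to exist in the template as in the example above. The fill and print steps are passed in as functions so that the loop itself can be tried out outside pdfChip.

```javascript
// Hypothetical sample data; in practice this could come from a JSON file
// or a web service.
var recipients = [
    { name: "Ada" },
    { name: "Grace" },
    { name: "Linus" }
];

// Calls fill(record) and then print() once per record and returns the
// number of print calls. Both steps are parameters so the loop can be
// exercised without pdfChip.
function printOncePerRecord(records, fill, print) {
    var printed = 0;
    for (var i = 0; i < records.length; i++) {
        fill(records[i]);
        print();
        printed++;
    }
    return printed;
}

// pdfChip executes this function automatically when it is defined.
function cchipPrintLoop() {
    printOncePerRecord(
        recipients,
        function (record) {
            // Assumes the template contains an element with id "test".
            document.getElementById("test").textContent = "Hello " + record.name;
        },
        function () {
            cchip.printPages();
        }
    );
}
```

With three records this results in three calls to cchip.printPages, all of which end up in the same output PDF file.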

Use in multiple-pass conversions

When using overlays or underlays, the same technique is still usable. Of course, underlays and overlays have a different HTML template and thus also use different JavaScript files, which allows giving an overlay or underlay an adjusted print loop:

function cchipPrintLoop() {
    for (var theIndex = 0; theIndex < cchip.pages.length; theIndex++) {
        $('#test').text('penguins');
        cchip.printPages();
    }
}

The above example for an underlay or overlay is virtually identical to the one-pass example, with one important change: the number of iterations is now determined by cchip.pages.length. The cchip object is added by pdfChip to give you access to information from the main HTML template. In this example it is used to generate an underlay or overlay with the same number of pages as the conversion of the original HTML template produced.
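As an illustration of what an overlay could do with this information, the following sketch writes a "Page x of y" label into the overlay template for every page of the main document. It is a sketch under assumptions: the pageLabel helper is hypothetical, and the element with the id "test" is assumed to exist in the overlay template. The number property of each entry in cchip.pages is zero-based, so 1 is added for display.

```javascript
// Builds a human-readable label from a zero-based page number.
function pageLabel(zeroBasedNumber, pageCount) {
    return "Page " + (zeroBasedNumber + 1) + " of " + pageCount;
}

// Print loop for an overlay or underlay template: one iteration per page
// of the main document.
function cchipPrintLoop() {
    var pageCount = cchip.pages.length;
    for (var i = 0; i < pageCount; i++) {
        // Assumes the overlay template contains an element with id "test".
        document.getElementById("test").textContent =
            pageLabel(cchip.pages[i].number, pageCount);
        cchip.printPages();
    }
}
```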

Reference

This section contains reference information for all pdfChip specific JavaScript functions and objects.

cchipPrintLoop

function cchipPrintLoop()

If the HTML document contains a cchipPrintLoop function (either embedded in the HTML file or in a separately included JavaScript file), this modifies how pdfChip generates its output PDF file. No PDF creation is done automatically; instead, pdfChip relies on the cchip.printPages function being used to output PDF pages as necessary.

This means that the body of the cchipPrintLoop function should alter the HTML template as necessary and then output the modified HTML DOM by invoking the cchip.printPages function. Note that cchip.printPages can be invoked multiple times; if so, the results of these invocations are merged into one output PDF file.

Example:

function cchipPrintLoop() {
    for (var theIndex = 0; theIndex < 10; theIndex++) {
        $('#test').text('penguins');
        cchip.printPages();
    }
}

'cchipPrintLoop' is not guaranteed to wait for resources that are added dynamically by JavaScript code.

This can be handled:

  • automatically using 'cchip.onPrintReady' method,
  • manually using 'cchip.beginPrinting' and 'cchip.endPrinting' methods.

cchip

During conversion of the main HTML file the cchip object is extended by properties that hold information about the converted document. This information can be used from within the HTML template for an overlay or underlay.

cchip.printPages

function cchip.printPages()

Outputs the current HTML DOM to the PDF output file. Can be invoked multiple times, but can only be invoked from the body of the cchipPrintLoop function.

Example:

function cchipPrintLoop() {
    cchip.printPages();
}

If used with no parameters, the complete current HTML DOM will be output. The number of pages can be limited by passing the required number as a parameter to cchip.printPages. In this case only one page will be output:

function cchipPrintLoop() {
    cchip.printPages(1);
}

cchip.beginPrinting(), cchip.endPrinting()

If 'cchip.beginPrinting' is called, the conversion does not finish until a matching 'cchip.endPrinting' is called.

If 'cchip.beginPrinting' is called multiple times, 'cchip.endPrinting' should be called the same number of times.

If neither 'cchip.beginPrinting' nor 'cchip.onPrintReady' is called, the conversion finishes as soon as 'cchipPrintLoop()' has been executed.

Use these methods when printing has to happen after a specific JavaScript event. For example, if MathJax is used, printing should happen only after MathJax has finished all its work. This can be done in the following way:

function doPrinting() {
    cchip.printPages();
    cchip.endPrinting();
}
function cchipPrintLoop() {
    cchip.beginPrinting();
    MathJax.Hub.Queue(doPrinting);
}

Without the 'beginPrinting()' and 'endPrinting()' calls, pdfChip exits before the 'doPrinting' function is executed and no output PDF is created.
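When several asynchronous steps have to finish before printing, each call to cchip.beginPrinting needs its own matching cchip.endPrinting. The following sketch shows one possible way to keep the calls balanced. The holdPrintingFor helper and the loadExtraContent placeholder are hypothetical and not part of pdfChip; only cchip.beginPrinting, cchip.endPrinting, cchip.printPages and MathJax.Hub.Queue are taken from the text above. Note that the print callback is never invoked if the task list is empty.

```javascript
// Runs every task while printing is held open. "printing" is any object
// with beginPrinting()/endPrinting() (cchip inside pdfChip). Each task
// receives a done() callback that it must call exactly once.
function holdPrintingFor(printing, tasks, onAllDone) {
    var remaining = tasks.length;
    var i;
    for (i = 0; i < tasks.length; i++) {
        printing.beginPrinting();      // one begin per task ...
    }
    function done() {
        remaining--;
        if (remaining === 0) {
            onAllDone();               // print before the last hold is released
        }
        printing.endPrinting();        // ... matched by one end per task
    }
    for (i = 0; i < tasks.length; i++) {
        tasks[i](done);
    }
}

// Hypothetical placeholder for a second asynchronous step; it finishes
// immediately here.
function loadExtraContent(done) {
    done();
}

function cchipPrintLoop() {
    holdPrintingFor(
        cchip,
        [
            function (done) { MathJax.Hub.Queue(done); },
            loadExtraContent
        ],
        function () { cchip.printPages(); }
    );
}
```

Calling the print callback before the final endPrinting mirrors the order used in the MathJax example above, where cchip.printPages is invoked first and cchip.endPrinting afterwards.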

cchip.onPrintReady( callback )

cchip.onPrintReady( f ) installs a callback function f() that is called when the DOM is ready for printing, e.g. when all images are loaded. The normal way to use this function is to first manipulate the DOM, then call cchip.onPrintReady( f ), and then exit cchipPrintLoop(); f() is called once the DOM is ready. The function f() must call cchip.printPages() in order to actually create PDF pages from the DOM, and it can initiate further DOM manipulations and printing if required.

The following example illustrates how this function can be used.

<html>
<head>
<script>
function cchipPrintLoop() {
    var img = document.getElementById("myimg");
    img.src = "files/image.jpg";
    cchip.onPrintReady( cchip.printPages );
}
</script>
</head>
<body>
<img id="myimg" src="files/2.png">
</body>
</html>

The cchipPrintLoop() function is used to place an image (image.jpg) into the DOM. Instead of calling cchip.printPages directly, it calls the cchip.onPrintReady function, which installs cchip.printPages as a callback function. This makes sure that cchip.printPages is only invoked after all images have been loaded.
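The callback can also start the next round of DOM manipulation and printing. The following sketch chains several rounds, each waiting for the DOM to become ready before printing. The printEachImage helper is hypothetical; the image id "myimg" and the file names are taken from the example above. It also assumes that cchip.onPrintReady may be called again from inside a callback, which this section suggests but does not state explicitly.

```javascript
// Shows each source in turn and prints once the DOM is ready for it.
// "chip" is any object with onPrintReady(callback) and printPages()
// (cchip inside pdfChip); setSource applies a source to the template.
function printEachImage(sources, setSource, chip) {
    function step(index) {
        if (index >= sources.length) {
            return;                     // nothing left to print
        }
        setSource(sources[index]);      // manipulate the DOM first ...
        chip.onPrintReady(function () {
            chip.printPages();          // ... print once it is ready ...
            step(index + 1);            // ... then start the next round
        });
    }
    step(0);
}

function cchipPrintLoop() {
    var img = document.getElementById("myimg");
    printEachImage(
        ["files/image.jpg", "files/2.png"],
        function (src) { img.src = src; },
        cchip
    );
}
```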

cchip.dumpStaticHtml()

The cchip.dumpStaticHtml() function writes the current state of the HTML DOM to an HTML file.
HTML <script> tags are removed while the file is written. For example, for input "in.html" and output "out.pdf" the following code:

function cchipPrintLoop() 
{
    cchip.dumpStaticHtml();
    cchip.printPages();
}

will produce a dump HTML file at a path such as "dump-static-html-2020-05-15--19-24-15/in-000-out-000-js-000.html".

cchip.log

function cchip.log( inTextToLog )

This function logs any string passed to it to stderr during conversion of the HTML template.

Example:

function cchipPrintLoop() {
    cchip.log("Printing first page of DOM");
    cchip.printPages(1);
}
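Because cchip.log is documented as taking a string, objects are best converted to text before they are logged. The following sketch shows one way to do that with JSON.stringify, which is part of ECMAScript 5. The toLogString helper and the "page" label are hypothetical and only serve as illustration.

```javascript
// Turns a label and an arbitrary value into a single log line.
// Strings are used as they are; everything else is serialized as JSON.
function toLogString(label, value) {
    var text = (typeof value === "string") ? value : JSON.stringify(value);
    return "[" + label + "] " + text;
}

function cchipPrintLoop() {
    cchip.log(toLogString("page", { number: 0 }));
    cchip.printPages(1);
}
```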

cchip.urls

An array containing the URLs of all HTML files being converted. Overlays and underlays are not included here. If pdfChip is called with a single HTML file, this list will contain only one element; if pdfChip receives multiple HTML files on its command-line, all of the main HTML files will be available in this list.

cchip.overlays

An array containing the URLs for all overlay HTML files used during the conversion.

cchip.underlays

An array containing the URLs for all underlay HTML files used during the conversion.
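The three arrays cchip.urls, cchip.underlays and cchip.overlays can be combined into a short overview of everything that takes part in a conversion, for example for logging. The following is a minimal sketch; the describeInputs helper and the wording of its output are hypothetical, and missing arrays are treated as empty as a precaution.

```javascript
// Returns one line of text per input file, grouped by role.
// "chip" is any object with urls, underlays and overlays arrays
// (cchip inside pdfChip).
function describeInputs(chip) {
    var lines = [];
    function add(kind, urls) {
        for (var i = 0; i < urls.length; i++) {
            lines.push(kind + " " + (i + 1) + ": " + urls[i]);
        }
    }
    add("main", chip.urls || []);
    add("underlay", chip.underlays || []);
    add("overlay", chip.overlays || []);
    return lines;
}

function cchipPrintLoop() {
    var lines = describeInputs(cchip);
    for (var i = 0; i < lines.length; i++) {
        cchip.log(lines[i]);
    }
    cchip.printPages();
}
```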

cchip.versionString

Version of the pdfChip executable, e.g. "2.2.066". For the 64-bit application the string " (x64)" is appended to the version, e.g. "2.2.066 (x64)".
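A template that depends on a minimum pdfChip version could inspect this string. The following sketch assumes that the version always consists of dot-separated numeric parts, optionally followed by " (x64)", as in the examples above; other formats are not handled. The helper names parseVersionString and isAtLeast are hypothetical.

```javascript
// Splits a version string such as "2.2.066 (x64)" into numeric parts
// and a 64-bit flag.
function parseVersionString(versionString) {
    var suffix = / \(x64\)$/;
    var is64Bit = suffix.test(versionString);
    var parts = versionString.replace(suffix, "").split(".").map(function (part) {
        return parseInt(part, 10);
    });
    return { parts: parts, is64Bit: is64Bit };
}

// Returns true if versionString is equal to or newer than minimumParts,
// e.g. isAtLeast(cchip.versionString, [2, 2, 60]).
function isAtLeast(versionString, minimumParts) {
    var parts = parseVersionString(versionString).parts;
    for (var i = 0; i < minimumParts.length; i++) {
        var have = parts[i] || 0;
        if (have > minimumParts[i]) { return true; }
        if (have < minimumParts[i]) { return false; }
    }
    return true;
}
```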

cchip.pages

An array containing information about the individual pages resulting from the conversion of the main HTML template into a PDF document. The different properties of the page elements in this array contain information about the pages. Specifically the following properties can be used:

  • number
    The (zero-based) page number of the page.
  • mediabox
    Information on the mediabox for the page using a height, width, bottom and left property. All properties are expressed in points.
  • cropbox
    Information on the cropbox for the page using a bottom, left, top and right property. All properties are expressed in points.
  • trimbox
    Information on the trimbox for the page using a bottom, left, top and right property. All properties are expressed in points.
  • bleedbox
    Information on the bleedbox for the page using a bottom, left, top and right property. All properties are expressed in points.
  • margins
    Information on the margins for the page using a bottom, left, top and right property. All properties are expressed in points.
  • h
    An array with information for the content (text) of the currently active headers for this page. Because the array is zero based, cchip.pages[theIndex].h[0] returns the content of the current h1 header level.

Example:

for (var i = 0; i < cchip.pages.length; ++i) {
  var page = cchip.pages[i];
  if (page.cropbox) {
    // Log the four cropbox edges (in points) using the properties listed above
    cchip.log("cropbox: " + page.cropbox.left + ' ' + page.cropbox.bottom + ' ' + page.cropbox.right + ' ' + page.cropbox.top);
  }
}
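Since all box values are expressed in points, a small conversion is needed when sizes are wanted in millimetres. The following sketch derives the trimmed page size from the bottom, left, top and right properties described above, using 72 points per inch and 25.4 mm per inch. The helper names pointsToMm and boxSizeInMm are hypothetical, and the print loop is meant for an overlay or underlay template, where cchip.pages describes the pages of the main document.

```javascript
var POINTS_PER_INCH = 72;
var MM_PER_INCH = 25.4;

// Converts a length in points to millimetres.
function pointsToMm(points) {
    return (points / POINTS_PER_INCH) * MM_PER_INCH;
}

// Computes width and height in millimetres for a box that is described
// by its left, bottom, right and top edges in points.
function boxSizeInMm(box) {
    return {
        width: pointsToMm(box.right - box.left),
        height: pointsToMm(box.top - box.bottom)
    };
}

function cchipPrintLoop() {
    for (var i = 0; i < cchip.pages.length; i++) {
        var page = cchip.pages[i];
        if (page.trimbox) {
            var size = boxSizeInMm(page.trimbox);
            cchip.log("page " + page.number + " trim size: " +
                size.width.toFixed(1) + " x " + size.height.toFixed(1) + " mm");
        }
    }
    cchip.printPages();
}
```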