PDF and EPUB export of Knowledgeblog articles

The case for export options

Whilst the web version of a Knowledgeblog article is considered canonical, there is a strong argument for allowing articles to be exported in other formats, of which PDF and EPUB are particularly important. PDFs allow researchers to maintain their normal workflow of downloading and archiving articles of interest into an existing document manager such as Papers or Mendeley. The EPUB format is gaining wide acceptance due to the emerging market for e-book readers.

PDF export from WordPress

A number of plugins can export PDF renderings of WordPress posts. We looked closely at two for release on Knowledgeblog.

WP Post to PDF by Neerav Bobaria

The plugin has the following features:

PDF Creation Station by Kalin Ringkvist

The plugin has the following features:

Arguably, PDF Creation Station offers the cleanest output in the first instance, and it can be modified to include appropriate branding. WP Post to PDF is limited in a multi-site installation, as branding can only be applied at the plugin level, so all sites on a single WordPress 3 installation share the same graphical branding. This could easily be modified to produce branding on a per-blog basis by allowing a custom field to point to a previously uploaded image.

Although both plugins use the same underlying PHP PDF generator (TCPDF), there were a number of differences between the resulting PDF exports. Shortcodes from the Knowledgeblog MathJax-LaTeX plugin were not processed in the exports, although this was expected. Testing with other posts that used the KCite plugin and the Post Revision Display plugin showed that only WP Post to PDF processed the additional post content generated by these plugins and correctly parsed the KCite shortcodes. Consequently, WP Post to PDF is the recommended renderer for Knowledgeblog articles.
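A check of this kind can be partly automated by searching the text of an export for shortcodes that were left verbatim. The following is a minimal sketch, not the procedure used in the testing described here. It assumes the text has already been extracted from the PDF by a separate tool, and the tag names `cite` and `latex` are assumptions that should be replaced with the shortcodes the installed plugins actually register.

```python
import re

# Assumed shortcode tag names; adjust to the plugins actually installed.
SHORTCODE_TAGS = ("cite", "latex")


def find_unprocessed_shortcodes(text, tags=SHORTCODE_TAGS):
    """Return (tag, body) pairs for shortcodes left verbatim in exported text."""
    found = []
    for tag in tags:
        # Matches [tag ...]body[/tag]; attributes in the opening tag are optional.
        pattern = re.compile(
            r"\[" + re.escape(tag) + r"(?:\s[^\]]*)?\](.*?)\[/" + re.escape(tag) + r"\]",
            re.DOTALL,
        )
        for match in pattern.finditer(text):
            found.append((tag, match.group(1).strip()))
    return found


# Hypothetical sample of text extracted from an export.
sample = "Energy is [latex]E = mc^2[/latex], as cited [cite]10.1000/example[/cite]."
print(find_unprocessed_shortcodes(sample))
```

An empty result suggests that the renderer expanded every shortcode it was asked to look for. Any pairs returned point to content that reached the export unprocessed.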

EPUB export from WordPress

A number of groups are now coalescing around WordPress as a platform for scientific publishing, thanks to discussion at meetings such as Beyond the PDF and in the WordPress for Scientists Google group.

Martin Fenner (blog, Twitter) has recently released an ePub WordPress plugin (see blog announcement on PLoS BLOGS).

The plugin was installed on a test site to see how much of a Knowledgeblog article's formatting is captured. It compares very favourably to WP Post to PDF in processing the full range of a Knowledgeblog article, including KCite shortcodes, the bibliography and Post Revision Display information. As expected, MathJax-LaTeX shortcodes are not rendered as graphics in the output. It should also be noted that EPUB files are only created when a post is created or updated, so the plugin cannot be applied retrospectively to posts that existed before it was installed. This is not an issue for the PDF plugins listed.

Testing did reveal that the readability of EPUB output depends very much on the EPUB reader. The free Stanza reader for Mac OS X produced on-screen output of poor quality, whereas the same exported Knowledgeblog article read in Adobe Digital Editions was of extremely high on-screen quality.
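Because rendering varies between readers, it can help to inspect an exported file directly rather than judge it through a single application. An EPUB file is a ZIP archive containing a `mimetype` entry with the value `application/epub+zip`. The sketch below lists the content documents in such an archive using only the Python standard library. The demonstration archive and its file names are hypothetical stand-ins, not the output of the plugin discussed here, and it is not a complete, valid EPUB.

```python
import io
import zipfile


def list_epub_documents(epub_bytes):
    """Return the names of XHTML/HTML content documents inside an EPUB archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        # The EPUB container format requires a 'mimetype' entry with this value.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError("not an EPUB container: " + mimetype)
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )


def build_demo_epub():
    """Build a tiny stand-in archive in memory (hypothetical file names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/article.xhtml", "<html><body><p>Text</p></body></html>")
        archive.writestr("OEBPS/bibliography.xhtml", "<html><body><p>Refs</p></body></html>")
    return buffer.getvalue()


print(list_epub_documents(build_demo_epub()))
```

For a real export, the bytes would be read from the downloaded file instead, for example with `open(path, "rb").read()`. The listed documents can then be opened to confirm that the bibliography and revision information were captured.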

In Summary

PDF and EPUB export of Knowledgeblog articles allows content to be filed, archived and integrated into a researcher's existing workflow for handling publications (e.g. via Mendeley, Papers, CiteULike or Connotea). Additionally, content can be propagated to other devices for reading offline. However, it must be reiterated that, because a Knowledgeblog article is potentially subject to change after publication (if the Knowledgeblog is under Public Review, for instance), a downloaded copy may not accurately reflect the online article in the future.