JavaScript - Programmatically batch print HTML documents

Question

tl;dr: I'm looking for a good way to batch print database-stored HTML documents from JavaScript.

Our users generate rich text content via an open source, JavaScript-based WYSIWYG text editor (CKEditor). The HTML content is saved to our database and can be printed directly from the editor via its built-in print functionality (basically just window.print()). This is great and works perfectly.

Now, we have a need to batch print saved documents and I am looking for workable solutions. There are various options that I can see, but all have large tradeoffs:

1. User selects documents to print. JS code loops through the documents and calls print on each one. The issue here is that the user will see a bunch of print dialogs, which is painful. (Aside: we are using Chrome, but I do not have the option of setting it to kiosk mode.)
2. User selects documents to print. JS code combines all of them in a single (hidden) container and they are printed as one 'document'. These can be fairly big documents with tables, images, etc. I worry about performance here, as we could be adding a significant amount to the DOM.
3. Similar to #2 above, but at some point the documents are converted and saved to a single PDF. This would be OK, but there don't seem to be many good, cost-effective options for converting HTML to PDF.
4. Generate some kind of report that can handle HTML content. I took a look at SQL Server Reporting Services, but it supports a very limited set of HTML tags and CSS properties.

Is there a better way to batch print HTML content from JavaScript? Any help is very much appreciated!

Edit: As per @Paul, I need to clarify a few points.

The content is what a standard online text editor produces. In my case:

No iframes
No animations
No dynamic content

Now, were I to print straight from the editor, a print stylesheet would be applied, so this may complicate things a bit.

something like this? http://stackoverflow.com/a/13724670/2418529 — Nico, Feb 16 '17 at 18:51
@NicolòCozzani This will suffer from the large DOM issue in #2. My concern is not opening a new window or not, it's a new Chrome print dialog or not. Thanks! — JP., Feb 16 '17 at 19:16
I'm fine with people down voting, but would appreciate feedback to improve future questions. This seems like a reasonable on-topic question to me. — JP., Feb 16 '17 at 19:51
The question is not clear because HTML can include inserted content, like `<iframe>`, dynamic content, like CSS animation, or content loaded or generated via JavaScript from `<script>` tags. It is not specified whether the conversion must handle all of these elements of HTML or merely a subset. The question also doesn't specify what has been tried or what is wrong with the way it is currently done, beyond stating that manual approaches or multiple print dialogs are painful. For instance, have you tried an automated browser like Selenium or PhantomJS? – Paul, Feb 19 '17 at 03:20
Is requirement to print single `.pdf` or multiple `.pdf` documents? — guest271314, Feb 20 '17 at 07:21
@Paul I will add an edit, addressing your first point. It was a good one and I appreciate the feedback. WRT the second, I have not tried anything yet. I am trying to determine which path to go down and figured I'd harvest the community's wisdom rather than trying to reinvent the wheel. not a good answer, but an honest one... — JP., Feb 23 '17 at 03:48
Client side you could get and combine all of the `html` `document`s using `fetch()` or `XMLHttpRequest()` within `Worker`, then post the combined `html` to main thread as an `ArrayBuffer`, create a `Blob URL` of combined `html`, call `print()` once. — guest271314, Feb 24 '17 at 03:56
@NineBerry Right now, ASP.NET webapi as a passthrough to a SQL Server database. However, happy to use node instead. — JP., Feb 24 '17 at 21:39
Then I'd suggest using server side asp.net to create a combined HTML or PDF version of the documents, open this as a new tab in the browser — NineBerry, Feb 24 '17 at 21:44
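
A minimal sketch of the "combine everything, call `print()` once" idea from the comments above, assuming the saved HTML fragments have already been fetched as strings. The names `buildCombinedHtml` and `printCombined`, and the `doc` / `doc-break` CSS classes, are hypothetical and not part of CKEditor or any library. Relative image or stylesheet URLs inside the fragments would probably not resolve from a Blob URL, so treat this as a starting point only.

```javascript
// Hypothetical helper: wrap each stored HTML fragment in its own section and
// force a page break after every section except the last one.
function buildCombinedHtml(fragments, printCss) {
  var sections = fragments.map(function (html, i) {
    var isLast = i === fragments.length - 1;
    var cls = isLast ? 'doc' : 'doc doc-break';
    return '<section class="' + cls + '">' + html + '</section>';
  });
  return '<!DOCTYPE html><html><head><meta charset="utf-8"><style>' +
    '.doc-break { page-break-after: always; }' + (printCss || '') +
    '</style></head><body>' + sections.join('') + '</body></html>';
}

// Browser-only: load the combined document into an off-screen iframe through
// a Blob URL and call print() once, so the user sees a single print dialog.
function printCombined(fragments, printCss) {
  var blob = new Blob([buildCombinedHtml(fragments, printCss)], { type: 'text/html' });
  var url = URL.createObjectURL(blob);
  var frame = document.createElement('iframe');
  // Zero-size rather than display:none, since hidden frames may not print.
  frame.style.position = 'fixed';
  frame.style.width = '0';
  frame.style.height = '0';
  frame.style.border = '0';
  frame.onload = function () {
    URL.revokeObjectURL(url); // the frame's document is already loaded
    frame.contentWindow.focus();
    frame.contentWindow.print();
  };
  frame.src = url;
  document.body.appendChild(frame);
}
```

The combined markup lives in the iframe's own document rather than in the editor page, which keeps the main DOM small, although the browser still has to lay out every document before printing.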

score 7 · Answer 1 · answered Feb 19 '17 at 16:28

Since content could be potentially large and consume a lot of memory I would do this on server side. Select docs on client and request server to render those to PDFs e.g. utilising PhantomJS. This would then allow you to even use mobile clients to fetch PDFs.

Janne

This is the approach I'd take too – Soubhik Mondal Feb 20 '17 at 08:39
This makes a lot of sense, thank you. Can you suggest any specific libraries to achieve this with PhantomJS? – JP. Feb 23 '17 at 03:46
@JP. Since you have the HTML, you could use nodejs combined with [phantom](https://github.com/amir20/phantomjs-node) to just display the HTML. See the [screen capture](http://phantomjs.org/screen-capture.html) functionality for how you'd print it out as a PDF. – matt Feb 23 '17 at 04:05
@Janne I'm awarding you the bounty as yours seems to be the most widely accepted answer. I'm going to keep the question unanswered until I put a solution in place, in the hope of receiving more answers. Thanks to all for their suggestions! – JP. Feb 24 '17 at 21:32
I have seen good results using the RichEdit functionality from DevExpress to convert HTML documents to PDF in a web server application in asp.net – NineBerry Feb 24 '17 at 21:50
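
A rough sketch of the server-side route discussed in this answer and its comments, assuming Node.js with the `phantom` package (phantomjs-node) installed and the selected documents already combined into one HTML string. The helper names `writeTempHtml` and `renderPdf`, the temp-file layout, and the A4 paper settings are assumptions for illustration, not something the answer specifies.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

// Hypothetical helper: write the combined HTML to a temporary file so that
// PhantomJS can open it by URL and wait for its resources to load.
function writeTempHtml(html) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'batch-print-'));
  const file = path.join(dir, 'combined.html');
  fs.writeFileSync(file, html, 'utf8');
  return file;
}

// Render one HTML file to a PDF with PhantomJS. The 'phantom' module is
// required lazily so writeTempHtml can be used without it being installed.
async function renderPdf(htmlFile, pdfFile) {
  const phantom = require('phantom'); // assumption: npm install phantom
  const instance = await phantom.create();
  try {
    const page = await instance.createPage();
    // Assumed paper settings; adjust to match the editor's print stylesheet.
    await page.property('paperSize', {
      format: 'A4',
      orientation: 'portrait',
      margin: '1cm'
    });
    const status = await page.open(pathToFileURL(htmlFile).href);
    if (status !== 'success') {
      throw new Error('PhantomJS could not load ' + htmlFile);
    }
    await page.render(pdfFile); // the .pdf extension selects PDF output
  } finally {
    await instance.exit(); // always shut the PhantomJS process down
  }
  return pdfFile;
}
```

A web API endpoint could call `renderPdf(writeTempHtml(combinedHtml), outputPath)` and stream the resulting file back, which the browser can then open in a new tab or download.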

score 0 · Answer 2 · answered Feb 23 '17 at 03:57

I completely agree with the answer above; PhantomJS would probably be the best option. The only problem is reliability: PhantomJS has been pretty touch and go over the last few versions. If the documents become too large, they may be too much for Phantom to handle (remember it was originally designed for web testing and later morphed into web automation). When writing the script for this, I would suggest following the outline below, which breaks the process into more manageable steps:

    var steps = [
      function() {
        // step 1
      },
      function() {
        // step 2
      }
    ];
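
One possible way to drive a `steps` array like the one above is a small sequential runner. This is only a sketch: `runSteps` is a hypothetical name, and it assumes each step is written to accept a `next` callback and call it when finished (optionally with an error), which the outline above does not spell out. It uses plain callbacks rather than promises so it can also run inside a PhantomJS script.

```javascript
// Run step functions one after another. Each step receives `next` and must
// call it when done; calling next(err) or throwing stops the sequence early.
function runSteps(steps, onDone) {
  var index = 0;
  var finished = false;

  function finish(err) {
    if (finished) { return; } // report completion only once
    finished = true;
    onDone(err || null);
  }

  function next(err) {
    if (err || index >= steps.length) {
      finish(err);
      return;
    }
    var step = steps[index];
    index += 1;
    try {
      step(next);
    } catch (e) {
      finish(e);
    }
  }

  next();
}

// Example usage with two trivial steps.
runSteps([
  function (next) { /* step 1, e.g. load a document */ next(); },
  function (next) { /* step 2, e.g. render it */ next(); }
], function (err) {
  if (err) { console.error('Batch failed: ' + err.message); }
});
```

Keeping each document's load and render in its own step means a failure can be reported for that document without losing track of which ones already succeeded.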

Again, it's not a perfect option overall, but it is the best one we have to work with for now. If you have any questions feel free to reach out, I'm working on web automation myself so this will all be fresh in my mind.

Download for PhantomJS Here
