How can I convert a whole HTML page with JS-drawn canvases to an editable format on a server?

My web app generates pages with some basic HTML and CSS plus Flot charts (which draw on an HTML canvas element with JavaScript). I can convert them to PDF with wkhtmltopdf, which works great, but I cannot find a way to convert them to a format recognized by MS Word, OpenOffice Writer, and/or LibreOffice Writer.

P.S.: I have seen this question, but it applies to raw HTML only (without JS).

johndodo
  • The contents drawn on a canvas are typically some sort of graphics. Why don't you export it to a graphics format instead of some office file type? – feeela Sep 12 '13 at 07:58
  • Possible duplicate of [Capture HTML Canvas as gif/jpg/png/pdf?](http://stackoverflow.com/questions/923885/capture-html-canvas-as-gif-jpg-png-pdf) – feeela Sep 12 '13 at 07:59
  • @feeela: not really a duplicate - the original question is about exporting the whole page, not just canvas. Thanks for the suggestion though, I had the same idea, but it is difficult to implement in this case (I am doing conversion server-side) so I would rather use a conversion tool. I will update the question to reflect that. – johndodo Sep 12 '13 at 08:03
  • Well, on the other hand, PDF is an editable format – you just need proper software such as Acrobat – feeela Sep 12 '13 at 08:05
  • Yes, but it is not editable by MS Word and/or OO Writer (technically, PDF can be imported to OO Draw and edited there, but this is not enough in this case). Thanks for the suggestion though, converting PDF to DOC/RTF/ODT through command-line would be ok too, I just can't find a suitable utility. – johndodo Sep 12 '13 at 08:09
  • @feeela: I would appreciate if you removed "possible duplicate" flag (unless you feel it is still warranted, of course). This question is much more generic, HTML canvas is just part of the problem. Thanks! – johndodo Sep 12 '13 at 08:18
  • Fun fact: if you save an HTML file with a .doc extension, it opens in Word as expected. – dandavis Sep 16 '13 at 16:35
  • This is indeed one of the best ways to convert HTML to a Word document: set the appropriate headers on the server and the file will open in Word directly. In this case, however, there is JavaScript to be run, and Word doesn't do that. – johndodo Sep 18 '13 at 06:45

3 Answers

You can try the PHPWord library.

The major features include:

- Insert and format document sections
- Insert and format Text elements
- Insert Text breaks
- Insert Page breaks
- Insert and format Images and binary OLE-Objects
- Insert and format watermarks (new)
- Insert Header / Footer
- Insert and format Tables
- Insert native Titles and Table-of-contents
- Insert and format List elements
- Insert and format hyperlinks
- Very simple template system (new)

Since it supports inserting images and binary objects, you can also transfer your HTML chart to a Word document, provided the chart is first exported as an image.

Here is the detailed documentation on how to use the library.

Hope this can solve your issue. Good luck!

joydesigner

A quick Google search turned up phpdocx: http://www.phpdocx.com/ I have never used it, so I don't know how well it works. According to the docs, it can embed image tags in the final document: http://www.phpdocx.com/documentation/html-to-word-PHP

I don't know whether phpdocx can handle the HTML5 canvas element, but if not, you could always convert the canvas elements to plain image tags with the help of the canvas toDataURL function.
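As a rough illustration of the toDataURL idea, here is a minimal TypeScript sketch. The `CanvasLike` interface and the `canvasToImgTag` helper are hypothetical names introduced only for this example; the one real API relied on is the canvas `toDataURL` method. It assumes the charts have already finished drawing before it runs.

```typescript
// Minimal shape needed from a canvas; a real HTMLCanvasElement satisfies it.
// (CanvasLike is a hypothetical helper type, not part of any library.)
interface CanvasLike {
  width: number;
  height: number;
  toDataURL(type?: string): string;
}

// Build an <img> tag that carries the canvas pixels as a PNG data URL.
function canvasToImgTag(canvas: CanvasLike): string {
  const src = canvas.toDataURL("image/png");
  return `<img src="${src}" width="${canvas.width}" height="${canvas.height}">`;
}

// In a browser, once the charts are drawn, every canvas could be swapped out:
//   document.querySelectorAll("canvas").forEach((c) => {
//     c.outerHTML = canvasToImgTag(c);
//   });
```

After such a replacement the page contains only static markup and embedded images, so a converter that does not execute JavaScript has a chance of picking up the charts. Whether a given converter accepts data URLs in image tags would need to be checked.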

kasper Taeymans
  • Thanks, seems like a good library - the license is a bit limiting in our case though (not to mention expensive :). – johndodo Sep 18 '13 at 06:48
Thanks to all for the answers! In the end I made it work with PhantomJS (a custom script converts the JS-generated parts to images) and Pandoc (which converts the resulting HTML to DOCX).
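The Pandoc half of such a pipeline might look roughly like the sketch below. This is not the actual script used here, only an illustration under assumptions: the file names `rendered.html` and `report.docx` and the helper `pandocDocxArgs` are hypothetical, and the HTML is assumed to have already had its canvases replaced by images in the PhantomJS step. The `-f`, `-t` and `-o` options are standard Pandoc flags.

```typescript
// Build the argument list for: pandoc <input> -f html -t docx -o <output>
// (pandocDocxArgs is a hypothetical helper name used only in this sketch.)
function pandocDocxArgs(inputHtml: string, outputDocx: string): string[] {
  return [inputHtml, "-f", "html", "-t", "docx", "-o", outputDocx];
}

// Under Node.js the conversion could then be run with child_process:
//   import { execFileSync } from "child_process";
//   execFileSync("pandoc", pandocDocxArgs("rendered.html", "report.docx"));
```

Passing the arguments as an array rather than a single shell string avoids quoting problems when file names contain spaces.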

johndodo
  • Do you have an example of PhantomJS and Pandoc usage for your question? I know this is old. – gogaz Sep 29 '17 at 08:36