
I'm creating pdf versions of html slides generated from R Markdown files, and I've come across a puzzling behaviour. When I run pagedown::chrome_print on html whose output format is xaringan::moon_reader, the operation fails with a timeout message:

Error in force(expr) : Failed to generate output in 30 seconds (timeout)

Here is an example call on such a xaringan html file, which produces the timeout error on my machine:

pagedown::chrome_print("https://stat540-ubc.github.io/lectures/lectures_2020/lect09_MultipleTesting.html")

The Rmd source for this html is located here. If I increase the timeout argument of chrome_print to something very large (several thousand seconds), the operation appears to take a lot of resources (the computer fans turn on and the machine gets hot), but the pdf output is eventually produced. However, if I change the output format in the Rmd from xaringan::moon_reader to slidy_presentation, chrome_print runs successfully on the resulting html and produces a pdf in just a few seconds (with no change to the default timeout argument).
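For reference, here is roughly how I invoke the workaround; the timeout value of 3000 is just an arbitrarily large number that happened to be enough on my machine (the default is 30):

```r
library(pagedown)

# Workaround: raise chrome_print's timeout (default 30 seconds).
# 3000 is an arbitrary large value; smaller values may suffice.
chrome_print(
  "https://stat540-ubc.github.io/lectures/lectures_2020/lect09_MultipleTesting.html",
  timeout = 3000
)
```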

I have the same issue with other slide decks that I have created from a similar template to the one linked above, but this doesn't happen with every xaringan html file. For example, I am able to use chrome_print to successfully convert this xaringan html file to pdf (with no change to the default timeout argument):

pagedown::chrome_print("https://annakrystalli.me/talks/xaringan/xaringan.html")

Other things I tried:

  • I installed decktape and ran xaringan::decktape on the xaringan html file (roughly as in the sketch after this list), which also produced a timeout error. However, I'm not sure how to increase the allowed time with this method, so I don't know whether it would eventually succeed if given enough time.
  • I tried using the latest versions of Google Chrome and Chromium with the chrome_print function and got the same results as described above. I'm on macOS 10.15.5.
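This is approximately the decktape call I tried. The extra --load-pause value passed through args is a guess on my part at how to give the slides more time to load before export (decktape's option takes milliseconds); I haven't confirmed that it helps:

```r
library(xaringan)

# Roughly what I ran; "--no-sandbox" is xaringan's default for args,
# and "--load-pause 10000" is my (unverified) attempt to give the
# deck 10 seconds to finish loading before decktape starts exporting.
decktape(
  "lect09_MultipleTesting.html",
  output = "lect09_MultipleTesting.pdf",
  args = "--no-sandbox --load-pause 10000"
)
```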

I would like to stick with xaringan html slides as they have some features I prefer. However, the current method of conversion to pdf is not sustainable for me, since I will need to convert many similar html files and update them periodically. If anyone has come across this, or can suggest what might be causing the extreme slowdown when converting my xaringan html files to pdf, I'd appreciate your input.

kkorthauer
  • The slides at https://stat540-ubc.github.io/lectures/lectures_2020/lect09_MultipleTesting.html were generated with the option `self_contained: true`, and the resulting HTML file is huge because it contains base64-encoded data. The JS library (remark.js) is not good at rendering such slides. I do plan to improve it someday: https://github.com/yihui/xaringan/issues/3 For now, I can only suggest that you use the default `self_contained: false`. – Yihui Xie Sep 10 '20 at 01:40
  • Wow, thanks for your prompt reply! I just tried your suggestion to use `self_contained: false`, and then the pdf generation step is very fast. I think adding an extra step to my workflow to knit twice (once with `self_contained: false` to generate the pdf, and again with `self_contained: true` because I would also like standalone html slides; roughly as sketched after these comments) is worth it to save time. The issue doesn't seem to be entirely attributable to size, however, since when I generate slidy html it is approximately the same size and the `chrome_print` function is very fast. – kkorthauer Sep 10 '20 at 02:29
  • As both Jared and I mentioned in https://github.com/yihui/xaringan/issues/3#issuecomment-626205629, self-contained xaringan slides can be extremely slow to load. The reason is that remark.js has to convert the Markdown source that contains the huge base64 encoded data on the fly in your browser. This conversion is very slow when the Markdown source is too large. Again as I mentioned there, I have an idea about how to remedy it, but haven't had time to implement it yet. – Yihui Xie Sep 10 '20 at 03:07
  • Got it - that makes sense. Thanks! – kkorthauer Sep 10 '20 at 03:54
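A minimal sketch of the knit-twice workflow described in the comments above, assuming the Rmd's YAML header sets `self_contained: true` under `xaringan::moon_reader` (the file names here are placeholders):

```r
library(rmarkdown)
library(pagedown)

# 1. Knit with self_contained overridden to FALSE, so remark.js does
#    not have to parse huge base64-encoded data, then print the
#    lightweight html to pdf quickly.
render("slides.Rmd",
       output_file = "slides_light.html",
       output_options = list(self_contained = FALSE))
chrome_print("slides_light.html", output = "slides.pdf")

# 2. Knit again with the YAML defaults (self_contained: true) to keep
#    a standalone html version of the slides.
render("slides.Rmd", output_file = "slides.html")
```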

0 Answers