I was researching a number of websites today and, to avoid visiting each one manually, I set up PhantomJS to render them using the solution proposed here. Nothing special: it loops through an array of websites and renders each resulting page.

Strangely, some websites are not rendered properly. One of them is http://www.telegraaf.nl/

To simplify, I created another script that only runs this page:

var page = require('webpage').create();

page.viewportSize = { width: 1920, height: 960 };
page.clipRect = { top: 0, left: 0, width: 1920, height: 960 };

page.open('http://www.telegraaf.nl/', function(status) {
  page.render("screenshot.png");
  phantom.exit();
});

No screenshot is produced. The same script works perfectly with every other site I tested. Did I overlook something?

Sergi Juanola

1 Answer

It doesn't render a screenshot, because the page has no <body> initially and therefore nothing to render. Everything, including the body, is loaded through JavaScript after PhantomJS' onLoadFinished event fires.

You need to wait a little for the page to load fully. A simple 5-second wait was sufficient for me:

page.open('http://www.telegraaf.nl/', function(status) {
    // Wait 5 seconds so the page's JavaScript can build the <body> before rendering.
    setTimeout(function(){
        page.render("screenshot.png");
        phantom.exit();
    }, 5000);
});

You can of course wait in a more sophisticated way, to make this more robust and to avoid waiting longer than necessary: phantomjs not waiting for “full” page load
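
As a rough illustration of such a more robust wait, here is a minimal sketch that polls the page instead of sleeping for a fixed time. The `waitFor` helper, the "body exists and has children" readiness check, the 30-second cap and the 250 ms polling interval are all assumptions made for this example, not part of PhantomJS; a real readiness check depends on the site.

```javascript
// Hypothetical helper: call condition() every intervalMs until it returns
// a truthy value or timeoutMs has elapsed, then call onDone(true|false).
function waitFor(condition, onDone, timeoutMs, intervalMs) {
    var start = new Date().getTime();
    var timer = setInterval(function () {
        var ok = !!condition();
        var timedOut = new Date().getTime() - start >= timeoutMs;
        if (ok || timedOut) {
            clearInterval(timer);
            onDone(ok);
        }
    }, intervalMs);
}

// The PhantomJS part only runs under PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.viewportSize = { width: 1920, height: 960 };
    page.clipRect = { top: 0, left: 0, width: 1920, height: 960 };

    page.open('http://www.telegraaf.nl/', function (status) {
        if (status !== 'success') {
            console.log('Failed to load the page');
            phantom.exit(1);
            return;
        }
        waitFor(function () {
            // Assumed readiness check: a <body> exists and has content.
            return page.evaluate(function () {
                return !!document.body && document.body.children.length > 0;
            });
        }, function (ready) {
            // Render whatever is there, even if the wait timed out.
            page.render('screenshot.png');
            phantom.exit(ready ? 0 : 1);
        }, 30000, 250);
    });
}
```

Compared with a fixed `setTimeout`, this returns as soon as the check passes, and the timeout only acts as an upper bound.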


You may need to run PhantomJS with --ignore-ssl-errors=true (and maybe --ssl-protocol=any if PhantomJS <1.9.8).

Artjom B.
  • Hah! It worked (although I had to wait 30 seconds in 8 out of 11 _problematic_ websites, maybe my connection is really slow). As a reference, I didn't need to use the `--ignore-ssl-errors` flag in these cases for this to work, but it's a good thing to keep set just in case. Thanks! – Sergi Juanola Nov 13 '15 at 08:33
  • I also didn't need to ignore SSL errors, but I noticed that there were some request errors in the log (seen through [events](https://gist.github.com/artjomb/4cf43d16ce50d8674fdf#file-1_phantomerrors-js)), so I added that command-line option and they went away. – Artjom B. Nov 13 '15 at 10:17
  • 30 seconds is a bit much. Maybe you can shave off some seconds with custom ad blocking. The `page.onResourceRequested` event enables you to abort some requests based on your custom criteria like requested domain. – Artjom B. Nov 13 '15 at 10:24
  • After wasting a whole day, adding cliArgsCap.add("--ignore-ssl-errors=true") fixed this issue for me. Somehow this SSL issue prevents the page from loading properly, although FF/Chrome work fine. – Jeya Kumar Apr 03 '19 at 10:21
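
The request filtering suggested in the comments above could look roughly like the sketch below. It uses PhantomJS's `page.onResourceRequested` callback and `networkRequest.abort()`, and logs failed requests through `page.onResourceError`. The blocked domains and the `hostOf` and `isBlocked` helpers are hypothetical placeholders for illustration; you would substitute the ad or tracking hosts you actually see in the request log.

```javascript
// Hypothetical block list; replace with the domains you want to skip.
var BLOCKED_DOMAINS = ['ads.example.com', 'tracker.example.net'];

// Extract the lower-cased host from an absolute URL ('' if there is none).
function hostOf(url) {
    var m = /^[a-z][a-z0-9+.\-]*:\/\/([^\/:?#]+)/i.exec(url);
    return m ? m[1].toLowerCase() : '';
}

// True if the URL's host is a blocked domain or one of its subdomains.
function isBlocked(url, domains) {
    var host = hostOf(url);
    for (var i = 0; i < domains.length; i++) {
        var suffix = '.' + domains[i];
        if (host === domains[i] || host.slice(-suffix.length) === suffix) {
            return true;
        }
    }
    return false;
}

// The PhantomJS part only runs under PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();

    // Abort requests to blocked domains before they are sent.
    page.onResourceRequested = function (requestData, networkRequest) {
        if (isBlocked(requestData.url, BLOCKED_DOMAINS)) {
            networkRequest.abort();
        }
    };

    // Log failed requests (aborted ones show up here too).
    page.onResourceError = function (resourceError) {
        console.log('Request failed: ' + resourceError.url +
                    ' (' + resourceError.errorString + ')');
    };

    page.open('http://www.telegraaf.nl/', function (status) {
        setTimeout(function () {
            page.render('screenshot.png');
            phantom.exit();
        }, 5000);
    });
}
```

Matching on the host rather than on a substring of the whole URL avoids accidentally blocking a first-party resource whose path happens to contain a blocked name.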