0

I'm trying to scrape this webpage with PhantomJS: https://www.koshkamashkaeshop.com/fr/28-robes-Koshka-Mashka but it fails to load every time. I thought it was because of HTTPS. Here is my .sh code:

phantomjs --ignore-ssl-errors=yes test.js

Here is my test.js code :

page.open(url, function (status) {

var content = page.evaluate(function()
 {
  if (status !== 'success') {
     console.log('FAIL to load the address');
  }else{

  }
 }
)})
Zoomzoom

4 Answers

5

I know this is old, but I got the same error. This is the command I used:

phantomjs --debug=yes --ignore-ssl-errors=true --ssl-protocol=any --web-security=true rasterize.js url output.pdf

credit: https://github.com/ariya/phantomjs/issues/10178

1

It is the same problem as described in "Phantomjs connection to Facebook fails SSL handshake": PhantomJS defaults to SSL 3.0, and lots of sites have SSL 3.0 disabled. You need to use

phantomjs --ssl-protocol=any test.js
Community
Steffen Ullrich
1

page.evaluate() runs its callback in the sandboxed page context in PhantomJS. That callback has no access to variables defined outside it, which is why status is not visible there. Also, if you want to see console messages from the page context, you need to register a page.onConsoleMessage handler. You don't need the page context in this case.

The other problem is that PhantomJS versions before 1.9.8 use SSLv3 by default, but because of the POODLE vulnerability most web servers have disabled SSLv3 support, so you need to explicitly add the --ssl-protocol=tlsv1 command-line option.

Working code with PhantomJS 1.9.0:

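var page = require('webpage').create();
var url = 'https://www.koshkamashkaeshop.com/fr/28-robes-Koshka-Mashka';

page.open(url, function (status) {
  console.log("status: " + status);
  phantom.exit();
});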

Of course, if you really want to pass the status into the page context for whatever reason, you need to pass it explicitly:

page.onConsoleMessage = function(msg){
    console.log("page: " + msg);
};
page.open(url, function (status) {
    page.evaluate(function(status){
        console.log("status: " + status);
    }, status);
    phantom.exit();
});
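
To make the sandboxing point concrete, here is a rough plain-JavaScript simulation of what happens to a function handed to page.evaluate(). It is only a sketch: evaluateLikePhantom and demo are hypothetical names, not PhantomJS APIs, and the real implementation differs. It assumes only the documented behaviour that the function runs without its closure and that arguments must be JSON-serializable.

```javascript
// Rough stand-in for page.evaluate() (hypothetical helper, not part of PhantomJS):
// the function is rebuilt from its source text, so it loses the closure
// it was defined in, and its arguments are copied through JSON.
function evaluateLikePhantom(fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    // Only plain, JSON-serializable data survives the trip.
    var copiedArgs = JSON.parse(JSON.stringify(args));
    // Rebuilding from source drops every outer variable.
    var rebuilt = new Function('return (' + fn.toString() + ');')();
    return rebuilt.apply(null, copiedArgs);
}

function demo() {
    // Local to demo(), like `status` inside the page.open callback.
    var loadStatus = 'success';

    // No argument passed: the rebuilt function cannot see loadStatus.
    var withoutArg = evaluateLikePhantom(function () {
        return typeof loadStatus;
    });

    // Passed explicitly: the value arrives as a parameter.
    var withArg = evaluateLikePhantom(function (loadStatus) {
        return typeof loadStatus;
    }, loadStatus);

    return { withoutArg: withoutArg, withArg: withArg };
}
```

Calling demo() shows the difference between the two cases: the closure variable is missing in the first call and present in the second, which mirrors why status has to be passed as an extra argument to page.evaluate().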
Artjom B.
0

If it's HTTPS, try running

phantomjs --ssl-protocol=TLSv1.1 <filename.js> 

In addition, set a user agent in the code, e.g.:

var page = require('webpage').create();
page.settings.userAgent = 'SpecialAgent';

page.open(url, function (status) {});

This worked for me. :)
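
For anyone who wants more detail than a bare "fail" status, the following is a minimal sketch that combines the user agent setting with a status check and logs resource errors. The helper formatResourceError is a hypothetical name introduced here, and taking the URL from the first command-line argument is an assumption. The typeof phantom guard is only there so the helper can also be loaded outside PhantomJS.

```javascript
// Hypothetical helper: turns a PhantomJS resourceError object into one log line.
function formatResourceError(err) {
    return 'resource error ' + err.errorCode + ' (' + err.errorString + ') for ' + err.url;
}

// Only run the PhantomJS part when the `phantom` global exists,
// so the helper above can also be loaded in plain JavaScript.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var url = require('system').args[1]; // assumption: URL is passed on the command line

    page.settings.userAgent = 'SpecialAgent';

    // Log every resource that fails to load, including the main document.
    page.onResourceError = function (err) {
        console.log(formatResourceError(err));
    };

    page.open(url, function (status) {
        console.log('status: ' + status);
        // Exit non-zero on failure so a calling shell script can detect it.
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

It would be run the same way as above, with the URL appended, for example: phantomjs --ssl-protocol=any test.js followed by the page address.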