2

I am trying to download a webpage using PhantomJS with the code shown below, where "address" is the URL and "dir" is the file path where I save the page's HTML.

var system = require('system');
var page = require('webpage').create();
var fs = require('fs');

// Set the url address
var address = system.args[1];

// Set the file path
var dir = system.args[2];

page.open(address, function () {
    fs.write(dir, page.content, 'w');
    phantom.exit();
});

This works correctly for many webpages, but for this one ("http://www.lefties.com/es/es/woman/zapatos-c1029521.html") I can't see the href of the products. Whether I download the page with PhantomJS or without it, what I get is a fullscreen popup with the cookie subscription, so there is no way to find the product hrefs in the downloaded HTML.

In addition, PhantomJS shows this error when I download it:

TypeError: 'null' is not an object (evaluating '$('PopupFullscreen').getElementById('Close').setStyles')

Any idea how to avoid the subscription/cookie popup?


Artjom B.
amarincolas

2 Answers

1

Well, setting a cookie (one stored in my browser) in the script solves the problem. For further information, see: http://phantomjs.org/api/webpage/method/add-cookie.html
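
For anyone trying the same thing, here is a minimal sketch of how that could look. The cookie name and value are placeholders (the real ones have to be copied from the browser after dismissing the popup there), the domain ".lefties.com" is my guess based on the URL in the question, and buildCookie is just a hypothetical helper for readability. phantom.addCookie returns false if the cookie is rejected, so it is worth checking.

```javascript
// Hypothetical helper: builds a cookie object in the shape that
// phantom.addCookie / page.addCookie accept (name, value and domain are required).
function buildCookie(name, value, domain) {
    return { name: name, value: value, domain: domain, path: '/' };
}

// Only run the PhantomJS part when the script is actually executed by PhantomJS.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var fs = require('fs');
    var page = require('webpage').create();

    var address = system.args[1];
    var dir = system.args[2];

    // Placeholder name/value: replace them with the cookie your browser stores
    // once the popup has been dismissed. The domain is an assumption.
    var added = phantom.addCookie(
        buildCookie('COOKIE_NAME_FROM_BROWSER', 'COOKIE_VALUE_FROM_BROWSER', '.lefties.com')
    );
    if (!added) {
        console.log('The cookie was rejected; check its name, value and domain.');
    }

    page.open(address, function (status) {
        if (status === 'success') {
            fs.write(dir, page.content, 'w');
        } else {
            console.log('Failed to load ' + address);
        }
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

The cookie has to be added before page.open is called, otherwise the first request is sent without it.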

amarincolas
1

Cookie modal dialogs like this are common now, and you can almost always dismiss them by clicking their close button.

Just because there is a modal dialog doesn't mean that you can't access the DOM behind it. The markup is still there (aside from markup that may be missing because of the TypeError).

That error message appears because the page's JavaScript uses some feature that is not implemented in PhantomJS 1.x. If you use PhantomJS 2, it will go away.
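
As a rough sketch of reading the DOM behind the dialog, something like the following could work. The id 'PopupFullscreen' is only inferred from the error message in the question, and the 'a[href]' selector collects every link rather than just the products, so both are assumptions to adjust after inspecting the page. extractLinks is a hypothetical helper; its optional second parameter exists only so it can be tried against a stand-in document outside PhantomJS.

```javascript
// Runs in the page context when passed to page.evaluate.
// Removes the popup element if it exists, then returns every link href.
// popupId is an assumption taken from the error message in the question.
function extractLinks(popupId, doc) {
    doc = doc || document;
    var popup = doc.getElementById(popupId);
    if (popup && popup.parentNode) {
        popup.parentNode.removeChild(popup);
    }
    var anchors = doc.querySelectorAll('a[href]');
    var hrefs = [];
    for (var i = 0; i < anchors.length; i++) {
        hrefs.push(anchors[i].href);
    }
    return hrefs;
}

// Only run the PhantomJS part when the script is actually executed by PhantomJS.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var page = require('webpage').create();
    var address = system.args[1];

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + address);
            phantom.exit(1);
            return;
        }
        // Extra arguments to page.evaluate must be JSON-serializable.
        var hrefs = page.evaluate(extractLinks, 'PopupFullscreen');
        console.log(hrefs.join('\n'));
        phantom.exit();
    });
}
```

Removing the popup is optional if you only need the links, since they can be read whether or not the dialog is visible. If the product list is filled in by JavaScript after the page loads, you may also need to wait before calling page.evaluate.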

Community
Artjom B.