Collecting all links on a webpage with phantomjs-node

Asked Feb 17 '15 at 00:43

Active Feb 17 '15 at 01:22

Viewed 201 times

Here is what I have so far (adapted from the first answer to "how to scrape links with phantomjs"):

phantom.create(function (ph) {
  ph.createPage(function (page) {
    page.open('http://www.' + currentSite + '.com', function (status) {
      if (status == 'success') {
        page.evaluate(function() { return [].map.call(document.querySelectorAll('a')); }, function (result) {
          console.log(result);
        });
....

Unfortunately, the result always comes back as null. Returning the whole document (or directly returning the result of a querySelector call on the document) causes my computer to hang.

Thank you

edited May 23 '17 at 12:31

Community

user4573794

possible duplicate of [Retrieved anchors list gets corrupted?](http://stackoverflow.com/questions/24700432/retrieved-anchors-list-gets-corrupted) – Artjom B. Feb 17 '15 at 08:35
Have you tried using PhantomJS's page.content property? You might be able to get all the hrefs in the page this way. Otherwise, try returning document.body.innerHTML instead of [].map.call(document.querySelectorAll('a')), then locate the hrefs in that string. – GracefulCode Jan 15 '16 at 15:51
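
A likely explanation for the null result: the `page.evaluate` call in the question invokes `[].map.call(...)` without a mapping function, which throws inside the page, and PhantomJS can only pass JSON-serializable values back from `evaluate` anyway, so DOM nodes would not survive the trip. The sketch below maps each anchor to its `href` string instead. The name `collectHrefs` is hypothetical, and passing a named function to phantomjs-node's `page.evaluate` is an assumption based on the callback-style API used in the question.

```javascript
// Runs inside the page context, so it must not reference variables
// from the surrounding Node script. It returns plain strings, because
// only JSON-serializable values can cross back out of page.evaluate.
function collectHrefs() {
  // querySelectorAll returns an array-like NodeList; borrow Array's map
  // and give it a callback that extracts each anchor's resolved URL.
  return [].map.call(document.querySelectorAll('a'), function (a) {
    return a.href;
  });
}

// Assumed usage, in place of the page.evaluate call in the question:
//
//   page.evaluate(collectHrefs, function (result) {
//     console.log(result); // an array of URL strings, if all goes well
//   });
```

Returning only strings also keeps the payload small, which may avoid the hang seen when returning the whole document.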
