
I'm trying to write a web scraper in Node.js and I've hit a roadblock. I need to click a button, but, if I'm not mistaken, Node doesn't actually render the web page like a browser would, so I can't use a selector or XPath.

How, then, can I click a specific button with the value "yes" if I can't use a selector or XPath? There's no id unique to the yes button.

I'm asking this because I want to parse a specific web-page but I get redirected to a page that asks me to press two buttons. Pressing 'Yes' will bring me to the page I want. Pressing 'No' will obviously stop me from going forward.

Is there any way to do what I want within the confines of Node without having to resort to something like JSDOM?

Here's part of the HTML I'm working with:

<div class="buttons">
<button class="c-btn c-btn-primary" type="submit" name="bigbutton" value="no">no thank you</button>
<button class="c-btn c-btn-primary" type="submit" name="bigbutton" value="yes">continue</button>
</div>

I tried using something like this:

document.getElementById("selector").click()

but got 'ReferenceError: document is not defined'.

A B
  • Possible duplicate of [Why doesn't node.js have a native DOM?](http://stackoverflow.com/questions/6657216/why-doesnt-node-js-have-a-native-dom) – Sterling Archer Apr 12 '16 at 01:53
  • You can't really "click" anything when dealing with scraped content on a webserver, maybe you wanted a headless browser or a bot instead – adeneo Apr 12 '16 at 01:54
  • Is it possible to simply find out the URL of the redirected page beforehand and have Node scrape it directly? – Soubhik Mondal Apr 12 '16 at 02:43
  • If I make a request to go to URL X, it instantly redirects me to page Y, which is what we have here: two buttons. Clicking continue brings me to the page X I wanted to go to. I have a script written for the data parsing once I actually land on page X, but I can't seem to get past Y without clicking continue. – A B Apr 12 '16 at 02:48
  • So I figured out I can't even get past the page itself if I disable cookies. Is there a way to add a working cookie to my Node script so that it gets through? – A B Apr 12 '16 at 05:08
  • Sounds like you should be using a headless browser like phantom to do the scraping (or at least get you to a point where you can do the scraping) – Nick Tomlin Apr 12 '16 at 12:29
  • Zombie is a great option. – jupi May 24 '17 at 17:47

1 Answer


Have you tried Zombie? I've used it and it worked well. This link is very helpful, since it quickly clarifies how to perform actions.
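
For comparison, if a headless browser turns out to be more than is needed, pressing a submit button can sometimes be imitated by sending the form request directly. The following is only a rough sketch using Node's built-in fetch (assuming Node 20 or later), not a tested solution for this site. The function names, the two URL parameters, and the POST method are assumptions for illustration; the real form's action and method have to be read from the page's <form> element. Only the button name "bigbutton" and value "yes" come from the HTML in the question.

```javascript
// Build the urlencoded body a browser would send when the button is pressed:
// a submit button contributes its own name=value pair to the form data.
function buildFormBody(buttonName, buttonValue) {
  return new URLSearchParams({ [buttonName]: buttonValue }).toString();
}

// Reduce Set-Cookie header values to a single Cookie request header by
// keeping only the leading name=value part of each one.
function toCookieHeader(setCookieValues) {
  return setCookieValues.map((c) => c.split(';')[0].trim()).join('; ');
}

// Hypothetical flow: load the page with the two buttons, keep its cookies,
// then submit the form as if "continue" had been pressed.
// pageUrl and formActionUrl are placeholders, and POST is an assumption.
async function pressContinue(pageUrl, formActionUrl) {
  const page = await fetch(pageUrl);
  const cookie = toCookieHeader(page.headers.getSetCookie());
  return fetch(formActionUrl, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      Cookie: cookie,
    },
    body: buildFormBody('bigbutton', 'yes'),
  });
}
```

If the site depends on JavaScript running in the page, or sets its cookies during the redirect itself, this approach will probably not be enough and a headless browser such as Zombie remains the simpler route.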

jupi