I'm fetching a page with Node.js, but the response comes back too early. The site loads the divs (posts) I want to read dynamically, a second or so after the initial page load, so the body I get back doesn't contain them yet.

Not sure if I'm making sense, but here is what I have:

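var request = require('request');

var options = {
    url: array[i], // array[i] is the URL currently being fetched
    headers: {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13'
    }
};

request(options, function (error, response, body) {
    if (!error && response.statusCode === 200) {
        // Only the initial HTML arrives here; the posts are not in it yet
        console.log(body);
    }
});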

The body contains the web page, but without the posts, which take a second or so to load.

prk

1 Answer

It looks like you'll need a headless browser such as PhantomJS. The request module only downloads the raw HTML and never executes the page's JavaScript, so content that is loaded dynamically never appears in the body you receive.

This thread might help:

How can I scrape pages with dynamic content using node.js?
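
To make the headless browser suggestion a bit more concrete, here is a minimal sketch of a PhantomJS script that opens the page, lets its JavaScript run, and polls until the posts show up before printing the rendered HTML. The marker `class="post"`, the 10 second timeout, the 250 ms poll interval, and the helper name `countPosts` are all assumptions made for illustration; swap in whatever actually identifies a post on the site you are scraping. It has to be run with the `phantomjs` binary rather than `node`, so the browser part is guarded and does nothing anywhere else.

```javascript
// Usage: phantomjs render.js <url>   (run with phantomjs, not node)

// ASSUMPTION: each post is marked with class="post"; use the site's real marker.
var POST_MARKER = 'class="post"';
var TIMEOUT_MS = 10000; // assumed upper bound on how long to wait for posts
var POLL_MS = 250;      // assumed delay between checks

// Count how many times the marker occurs in an HTML string.
function countPosts(html, marker) {
    return html.split(marker).length - 1;
}

// Everything below needs PhantomJS, so skip it when the global is missing.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var page = require('webpage').create();
    var url = system.args[1];

    if (!url) {
        console.log('Usage: phantomjs render.js <url>');
        phantom.exit(1);
    } else {
        page.open(url, function (status) {
            if (status !== 'success') {
                console.log('Failed to load ' + url);
                phantom.exit(1);
                return;
            }

            var started = Date.now();
            // Re-check the rendered DOM until posts appear or time runs out.
            var timer = setInterval(function () {
                var found = countPosts(page.content, POST_MARKER) > 0;
                var timedOut = Date.now() - started > TIMEOUT_MS;
                if (found || timedOut) {
                    clearInterval(timer);
                    console.log(page.content); // HTML after the page's scripts ran
                    phantom.exit(found ? 0 : 1);
                }
            }, POLL_MS);
        });
    }
}
```

Polling for a marker is usually more reliable than a fixed sleep, because the script continues as soon as the posts exist and fails with a non-zero exit code if they never appear.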

Olena Vikariy