
I'm building a web scraper in Node.js that relies on many asynchronous functions I have written. I want to run a chain of functions for several page IDs, but a for loop doesn't seem to work properly. I also tried using a counter variable, but that doesn't produce the required results either. Please find my code below:

var pageInformation = [
['page1','id111'],
['page2','id222'],
['page3','id333']];

var reqCounter = 0;

for(page in pageInformation){
    var pageName = pageInformation[reqCounter][0];
    var pageId = pageInformation[reqCounter][1]
    getPosts(pageId,function(err,idArray){
        if(!err){
            getMoreData(idArray, function(data,err){
                if(!err){
                    populateDatabase(data, function(err,success){
                        if(!err){
                        reqCounter++;
                            console.log('Loop for ' + pageName + 'has finished');//prints out page1 three times 
                        }
                    })
                }
            })
        }
    })    
}

What happens is that console.log() prints page1 three times and the database gets populated with the first page's data only. Any ideas on how I could run this chain of code for each of the pages in the pageInformation array?

Tomas
  • 1. Don't use `for...in` with arrays. 2. Check out the `for await (const x of y)` syntax. – Jared Smith Jan 22 '18 at 15:09
  • `reqCounter++` isn't run when you think it runs. Look into `Promise.all()`. Also: https://stackoverflow.com/questions/21184340/async-for-loop-in-node-js –  Jan 22 '18 at 15:09

1 Answer


Your for loop runs synchronously, while reqCounter is only incremented once each chain of asynchronous calls has finished. By then the loop is long over, so reqCounter is still 0 in every iteration and every iteration reads the first page.
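The timing can be reproduced in a minimal sketch. Here setTimeout is only a stand-in for the asynchronous calls in the question, and seenDuringLoop is a name made up for this illustration:

```javascript
let reqCounter = 0;
const seenDuringLoop = [];

for (let i = 0; i < 3; i++) {
    // The loop body runs to completion before any callback gets a chance to run
    seenDuringLoop.push(reqCounter);

    // Stand-in for an asynchronous call: the callback runs only after the loop is done
    setTimeout(() => {
        reqCounter++;
    }, 0);
}

console.log(seenDuringLoop); // [ 0, 0, 0 ]
```

All three iterations read the counter before any of the callbacks has incremented it.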

Furthermore, variables declared with var are not block scoped but scoped to the current function. This means that pageName and pageId will be reassigned in each iteration rather than each iteration having its own version of the variables.

The latter issue can be solved by declaring those variables with let or const as this makes them block scoped, i.e. each iteration gets its own version. Since you never reassign them, const is appropriate.
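The difference between var and const can be seen in a small sketch. The later() helper and the array names are hypothetical; later() just stores callbacks so that they run after the loops, the way callbacks of an asynchronous API would:

```javascript
// Hypothetical helper: queues callbacks so they run after the loops have finished
const pending = [];
function later(callback) {
    pending.push(callback);
}

const pages = [['page1', 'id111'], ['page2', 'id222'], ['page3', 'id333']];

const seenWithVar = [];
for (const page of pages) {
    var nameVar = page[0]; // one shared variable, overwritten in each iteration
    later(() => seenWithVar.push(nameVar));
}

const seenWithConst = [];
for (const page of pages) {
    const nameConst = page[0]; // a fresh binding per iteration
    later(() => seenWithConst.push(nameConst));
}

// Run the queued callbacks only now, after both loops are done
pending.forEach((callback) => callback());

console.log(seenWithVar);   // [ 'page3', 'page3', 'page3' ]
console.log(seenWithConst); // [ 'page1', 'page2', 'page3' ]
```

With var, every callback sees whatever value was assigned last. With const, each callback keeps the value from its own iteration.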

For the first issue, I can't see why you need reqCounter in the first place. Just use the loop variable page, which in a for...of loop holds the current array element.

Finally, it's a bad idea to use a for...in loop on arrays: it iterates over the array's enumerable property keys as strings, not over its values, which can easily cause bugs and unexpected behavior. You should use for...of or forEach() instead.
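A short sketch of the difference, using a shortened copy of the array from the question. The extra source property is made up here, purely to show what for...in picks up:

```javascript
const pageInformation = [['page1', 'id111'], ['page2', 'id222']];
pageInformation.source = 'demo'; // made-up extra property on the array object

const keys = [];
for (const key in pageInformation) {
    keys.push(key); // property keys as strings, including non-index properties
}

const names = [];
for (const page of pageInformation) {
    names.push(page[0]); // the array elements themselves
}

console.log(keys);  // [ '0', '1', 'source' ]
console.log(names); // [ 'page1', 'page2' ]
```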

So, my solution for your problem would be to change these three lines:

for(page in pageInformation){
    var pageName = pageInformation[reqCounter][0];
    var pageId = pageInformation[reqCounter][1]

into this:

for (const page of pageInformation) {
    const pageName = page[0];
    const pageId = page[1];
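
Put together, the whole loop could look roughly like the sketch below. The bodies of getPosts, getMoreData and populateDatabase are hypothetical stand-ins that call back via setImmediate, since the real implementations aren't shown in the question. The sketch also assumes error-first callbacks throughout, and finishedPages is a name added only for illustration:

```javascript
// Hypothetical stand-ins for the question's functions; the real ones do network and database work
function getPosts(pageId, callback) {
    setImmediate(() => callback(null, [pageId + '-post1', pageId + '-post2']));
}
function getMoreData(idArray, callback) {
    setImmediate(() => callback(null, idArray.map((id) => ({ id }))));
}
function populateDatabase(data, callback) {
    setImmediate(() => callback(null, true));
}

const pageInformation = [
    ['page1', 'id111'],
    ['page2', 'id222'],
    ['page3', 'id333']
];

const finishedPages = []; // illustration only: records which chains have completed

for (const page of pageInformation) {
    // Block-scoped, so each chain of callbacks keeps its own values
    const pageName = page[0];
    const pageId = page[1];

    getPosts(pageId, (err, idArray) => {
        if (err) return console.error(err);
        getMoreData(idArray, (err, data) => {
            if (err) return console.error(err);
            populateDatabase(data, (err) => {
                if (err) return console.error(err);
                finishedPages.push(pageName);
                console.log('Loop for ' + pageName + ' has finished');
            });
        });
    });
}
```

Note that this starts all three chains right away rather than one after another, so with real network and database calls the order in which they finish is not guaranteed.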
Lennholm