48

I'm sure my problem comes from a lack of understanding of asynchronous programming in Node.js, but here goes.

For example: I have a list of links I want to crawl. When each asynchronous request returns, I want to know which URL it is for. But, presumably because of race conditions, each callback sees the URL set to the last value in the list.

var links = ['http://google.com', 'http://yahoo.com'];
for (link in links) {
    var url = links[link];
    require('request')(url, function() {
        console.log(url);
    });
}

Expected output:

http://google.com
http://yahoo.com

Actual output:

http://yahoo.com
http://yahoo.com

So my question is either:

  1. How do I pass url (by value) to the callback function? OR
  2. What is the proper way of chaining the HTTP requests so they run sequentially? OR
  3. Something else I'm missing?

PS: For 1. I don't want a solution which examines the callback's parameters but a general way of a callback knowing about variables 'from above'.

Michał Perłakowski
Marc

3 Answers

50

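Your url variable is not scoped to the for loop: a variable declared with var is function-scoped (or global), not block-scoped. Every callback therefore closes over the same url variable, which already holds the last link by the time the callbacks run. So you need to create a function scope for your request call to capture the url value in each iteration of the loop by using an immediately invoked function expression: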

var links = ['http://google.com', 'http://yahoo.com'];
for (var link in links) {
    (function(url) {
        require('request')(url, function() {
            console.log(url);
        });
    })(links[link]);
}

BTW, embedding a require call in the middle of a loop isn't good practice. It should probably be rewritten as:

var request = require('request');
var links = ['http://google.com', 'http://yahoo.com'];
for (var link in links) {
    (function(url) {
        request(url, function() {
            console.log(url);
        });
    })(links[link]);
}
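
To see the difference without touching the network, here is a minimal sketch. `fakeRequest` and the `pending` queue are hypothetical stand-ins for the real `request` module (they are not part of any library): the callbacks are queued and only run after the loops have finished, which is roughly what happens with a real asynchronous call. The third loop uses `const` with `for...of`, assuming an ES2015 or newer runtime, where each iteration gets its own binding.

```javascript
var links = ['http://google.com', 'http://yahoo.com'];

// Hypothetical stand-in for an asynchronous API: it queues the callback
// instead of calling it immediately.
var pending = [];
function fakeRequest(url, callback) {
    pending.push(callback);
}

var seenWithVar = [];
var seenWithIife = [];
var seenWithConst = [];

// 1. Shared `var`: every callback reads the same `url` variable.
for (var i = 0; i < links.length; i++) {
    var url = links[i];
    fakeRequest(url, function () {
        seenWithVar.push(url);
    });
}

// 2. Immediately invoked function: each call gets its own `captured` parameter.
for (var j = 0; j < links.length; j++) {
    (function (captured) {
        fakeRequest(captured, function () {
            seenWithIife.push(captured);
        });
    })(links[j]);
}

// 3. `const` in `for...of` (ES2015+): a fresh binding per iteration.
for (const link of links) {
    fakeRequest(link, function () {
        seenWithConst.push(link);
    });
}

// Run the queued callbacks only now, after all loops have completed.
pending.forEach(function (callback) {
    callback();
});

console.log(seenWithVar);   // [ 'http://yahoo.com', 'http://yahoo.com' ]
console.log(seenWithIife);  // [ 'http://google.com', 'http://yahoo.com' ]
console.log(seenWithConst); // [ 'http://google.com', 'http://yahoo.com' ]
```

On a runtime that supports `let` and `const`, the third form is usually the least intrusive fix, since it needs no wrapper function.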
JohnnyHK
  • 3
    It's not about scope. It's about closures. Lots of other languages don't have block scope yet don't face this issue due to lack of closures. – slebetman Nov 04 '12 at 20:02
  • 3
    The point is that if JavaScript did support block scope, then the closure access to `url` in the callback function of the OP's code would work because each iteration of the loop would get its own `url` variable (like in C#). – JohnnyHK Nov 04 '12 at 20:17
  • 1
    The require is just in there for brevity, in my code all requires are at the beginning. – Marc Nov 05 '12 at 09:14
  • If I don't use forloop within the callback it will return url undefined. – Aero Wang Mar 12 '18 at 02:27
13

Check this blog out. A variable can be passed to the callback by using the .bind() method, which fixes the callback's this value to an object that carries the url. In your case it would look like this:

var links = ['http://google.com', 'http://yahoo.com'];
for (var link in links) {
    var url = links[link];
    require('request')(url, function() {
        // `this` is the object passed to .bind() below
        console.log(this.urlAsy);
    }.bind({urlAsy: url}));
}
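
As a variation, `.bind()` can also pre-fill leading arguments, so the value does not have to travel through `this`. The sketch below is only an illustration: `fakeRequest`, `pending`, `logFromThis` and `logFromArgument` are hypothetical names, and the queue stands in for the real asynchronous `request` call.

```javascript
var links = ['http://google.com', 'http://yahoo.com'];

// Hypothetical stand-in for an asynchronous API: queue now, call later.
var pending = [];
function fakeRequest(url, callback) {
    pending.push(callback);
}

var viaThis = [];
var viaArgument = [];

// Reads the value from the object that .bind() installed as `this`.
function logFromThis() {
    viaThis.push(this.urlAsy);
}

// Receives the bound value as its first argument; whatever the caller
// passes (here an error placeholder) comes after it.
function logFromArgument(boundUrl, error) {
    viaArgument.push(boundUrl);
}

for (var i = 0; i < links.length; i++) {
    var url = links[i];
    // .bind() evaluates `url` right now, so each bound function keeps its own value.
    fakeRequest(url, logFromThis.bind({ urlAsy: url }));
    fakeRequest(url, logFromArgument.bind(null, url));
}

// Run the callbacks after the loop, passing null as the "error" argument.
pending.forEach(function (callback) {
    callback(null);
});

console.log(viaThis);     // [ 'http://google.com', 'http://yahoo.com' ]
console.log(viaArgument); // [ 'http://google.com', 'http://yahoo.com' ]
```

Note that the `this` variant only works with ordinary functions; an arrow function ignores the `this` value given to `.bind()`.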
Tjs
7

See https://stackoverflow.com/a/11747331/243639 for a general discussion of this issue.

I'd suggest something like:

var links = ['http://google.com', 'http://yahoo.com'];

// Each call creates a new scope, so the returned callback keeps its own _url.
function createCallback(_url) {
    return function() {
        console.log(_url);
    };
}

for (var link in links) {
    var url = links[link];
    require('request')(url, createCallback(url));
}
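
The same idea can be written with `Array.prototype.forEach`, where the function passed to `forEach` plays the role of the factory: it is called once per element, so each callback closes over its own `url` parameter. This is a sketch rather than a drop-in replacement; `fakeRequest` and `pending` are hypothetical stand-ins for the real `request` call.

```javascript
var links = ['http://google.com', 'http://yahoo.com'];

// Hypothetical stand-in for an asynchronous API: queue now, call later.
var pending = [];
function fakeRequest(url, callback) {
    pending.push(callback);
}

var seen = [];

// forEach invokes the function once per element, so `url` is a new
// parameter (and a new closure scope) on every iteration.
links.forEach(function (url) {
    fakeRequest(url, function () {
        seen.push(url);
    });
});

// Run the queued callbacks after the iteration has finished.
pending.forEach(function (callback) {
    callback();
});

console.log(seen); // [ 'http://google.com', 'http://yahoo.com' ]
```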
BobS
  • There are lots of better links to give on StackOverflow to explain the issue. This one for example: http://stackoverflow.com/questions/3572480/please-explain-the-use-of-javascript-closures-in-loops/3572616#3572616 – slebetman Nov 04 '12 at 19:59