1

In my Chrome extension, I have an array of URLs, and I want to find the first unvisited one. Because the chrome.history API is asynchronous, my first instinct would be to do some kind of grotesque recursive iteration, like this...

urls = [...];
function recur(idx) {
    chrome.history.getVisits(urls[idx], function(visitItems) {
        if(visitItems && visitItems.length > 0) {
            // Success!
        } else {
            recur(idx + 1);
        }
    }
}
recur(0);

But, this sucks (it's really ugly, it's probably very slow, and it will break for long lists).

Is there any way to better wrangle all of these calls to chrome.history? Or, is there a totally distinct option?

Tim S. Van Haren
  • 8,861
  • 2
  • 30
  • 34
Chuck
  • 636
  • 3
  • 8
  • 20

1 Answers1

1

If order is important and your list is long and the probability of finding an unvisited link sooner than later is high, then the best way is basically to do what you are doing. Here is the forEachSeries implementation from a popular async library.

async.forEachSeries = function (arr, iterator, callback) {
    callback = callback || function () {};
    if (!arr.length) {
        return callback();
    }
    var completed = 0;
    var iterate = function () {
        iterator(arr[completed], function (err) {
            if (err) {
                callback(err);
                callback = function () {};
            }
            else {
                completed += 1;
                if (completed === arr.length) {
                    callback(null);
                }
                else {
                    iterate();
                }
            }
        });
    };
    iterate();
};

You'll see the same recursive pattern that you've started to implement. The other option is to kick them all off in parallel and track them as they come back. You can exit out as soon as the first item in the list comes back as unvisited. Note: the following code is untested...

var urls = [a,b,c,d],
    unvisitedUrls = [],
    count = urls.length,
    done = false;

var checkUrl = function(d) {
  var url = d;

  return function(visitItems) {
    if (done) return;
    count--;

    if (visitItems && visitItems.length > 0) {
        unvisitedUrls.push(url);
    }
    else {
        urls.splice(urls.indexOf(url));  // remove the visited url
    }

    if(unvisitedUrls.indexOf(urls[0]) > -1 || count === 0) {
        done = true;
        // done checking urls, urls[0] is the winner
    }

  }
}


urls.forEach(function(d) { chrome.history.getVisits(d, checkUrl(d)); });

If your list is millions of items long, then you can iterate through them in batches instead of all at once. Here's an example using the async library found at https://github.com/caolan/async.

var checkUrl = function(url, cb) {

  chrome.history.getVisits(url, function(itemVisits) {   

    if (done) return cb();
    count--;

    if (visitItems && visitItems.length > 0) {
        unvisitedUrls.push(url);
    }
    else {
        urls.splice(urls.indexOf(url));  // remove the visited url
    }

    if(unvisitedUrls.indexOf(urls[0]) > -1 || count === 0) {
        done = true;
        // done checking urls, urls[0] is the winner
    }

    cb();
  }
};

async.forEachLimit(urls, 50, checkUrl, function(err) { doSomethingWithWinner(); });
Bill
  • 25,119
  • 8
  • 94
  • 125
  • So, this will check every URL in the (potentially long) list. When do you find which one is _first_? And how do you want until everything before that is checked, so that you know that it is truly the first? – Chuck Oct 23 '12 at 03:43
  • How long of a list? What do you consider first? First based on your original URL list? – Bill Oct 23 '12 at 03:45
  • Yes--the original question says that I want to find the first unvisited URL in the list. Likewise, it expresses concern that a recursive approach would result in a stack overflow (thus, a large list) – Chuck Oct 23 '12 at 03:47
  • I've provided a couple of other options. I guess I incorrectly assumed that a long list was hundreds of items not hundreds of thousands or millions. – Bill Oct 23 '12 at 04:33
  • You've convinced me that there just isn't a good answer to this question, which is as good an answer as any. Thanks for the detailed response. – Chuck Oct 25 '12 at 01:18