I've got a long text file that I loop through line by line to extract some event data and store it in a database. The file periodically gets updated with new data at the top. When that happens, I run through the file again extracting the new events, but I want to stop when I get to an event that's already in the database (the file is always ordered newest to oldest).
Using the reduce() approach described in this answer to the question Correct way to write loops for promise, I've come up with this function to parse the file:
function parse(file)
{
    var lines = file.split("\n"),
        latestDate;

    return lines.reduce(function(promise, line) {
        return promise.then(function() {
            if (/* line matches date pattern */) {
                latestDate = line;
            } else if (/* line matches event pattern */) {
                return Event.createAsync(line, latestDate);
            }
            return promise;
        });
    }, Promise.resolve())
        .catch({ errorType: "uniqueViolated" },
            function() { /* ignore only the createAsync error */ });
}
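The reduce pattern above builds one long promise chain, with each line's work queued behind the previous line's. A minimal standalone sketch of the same idiom (the `processAsync` helper and the sample items here are placeholders, not part of the original code):

```javascript
// Minimal sketch of the reduce-as-serial-loop idiom: each item is
// processed only after the promise for the previous item has resolved.
var order = [];

// Placeholder async step; stands in for per-line work like createAsync().
function processAsync(item) {
    return Promise.resolve().then(function() {
        order.push(item);
    });
}

["a", "b", "c"].reduce(function(promise, item) {
    return promise.then(function() {
        return processAsync(item);
    });
}, Promise.resolve()).then(function() {
    console.log(order.join(""));   // "abc" — items processed in order
});
```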
The createAsync() database method returns a promise that's resolved when the event is saved. It will throw an exception if the event already exists in the database, which stops the promise chain so the rest of the file isn't parsed. That exception is caught and ignored by the catch() handler at the end of the function. I'm using the Bluebird 3.0 promise library in Node.js.
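Bluebird 3's catch() accepts a predicate before the handler, so only errors whose properties match the predicate object are swallowed; anything else still propagates. Spelled out with native promises and a plain catch, the filtering looks roughly like this (the createAsync stand-in below is a hypothetical mock, mirroring the errorType property used above):

```javascript
// Hypothetical stand-in for Event.createAsync(): rejects with an error
// tagged errorType "uniqueViolated" when the event is already stored.
function createAsync(line) {
    if (line === "already saved") {
        var err = new Error("unique constraint violated");
        err.errorType = "uniqueViolated";
        return Promise.reject(err);
    }
    return Promise.resolve(line);
}

// What Bluebird's .catch({ errorType: "uniqueViolated" }, handler) does,
// written as a plain catch: handle the expected error, rethrow the rest.
createAsync("already saved")
    .catch(function(err) {
        if (err.errorType !== "uniqueViolated") {
            throw err;   // unrelated errors still reject the chain
        }
        console.log("duplicate ignored");
    });
```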
This function does loop through each line serially and correctly stops when it hits an already-saved event. But I'm wondering if this is the best way to break out of a loop while dealing with promises. Swallowing the thrown exception at the end of the function seems a bit kludgy.
Any suggestions for improving the loop handling are welcome.
Solution?
Building on jib's answer, and taking into account Bergi's comment that maybe I should've just tried his non-reduce answer to the question I linked to :), I came up with this solution:
function parse(file)
{
    var lines = file.split("\n"),
        latestDate;

    return promiseEach(lines, function(line) {
        if (/* line matches date pattern */) {
            latestDate = line;
        } else if (/* line matches event pattern */) {
            return Event.createAsync(line, latestDate)
                .catch({ errorType: "uniqueViolated" }, function() { return false; });
        }
    });
}
The loop recursion is moved into a generic function, promiseEach(), that loops over every item in an array. If the iterator function returns a promise, the next item isn't processed until that promise resolves. If the iterator returns false, the loop ends, Lo-dash style:
function promiseEach(list, iterator, index)
{
    index = index || 0;

    if (list && index < list.length) {
        return Promise.resolve(iterator(list[index])).then(function(result) {
            if (result !== false) {
                return promiseEach(list, iterator, ++index);
            }
        });
    } else {
        return Promise.resolve();
    }
}
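To illustrate the early-exit behaviour, here's a small standalone run (promiseEach copied from above; the iterator and its "STOP" sentinel are made-up examples, and native promises are used in place of Bluebird for brevity):

```javascript
// promiseEach as defined above; works with native promises too.
function promiseEach(list, iterator, index) {
    index = index || 0;
    if (list && index < list.length) {
        return Promise.resolve(iterator(list[index])).then(function(result) {
            if (result !== false) {
                return promiseEach(list, iterator, ++index);
            }
        });
    }
    return Promise.resolve();
}

// Example iterator: record each item, end the loop at "STOP".
var seen = [];

promiseEach(["a", "b", "STOP", "c"], function(item) {
    seen.push(item);
    if (item === "STOP") {
        return false;   // ends the loop; "c" is never visited
    }
}).then(function() {
    console.log(seen.join(","));   // "a,b,STOP"
});
```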
I think that does what I want, but I'm wondering if there'll be call stack issues if I run it over a 4000-line file.