3

I have a function that takes an array and a number. It scans the array for the two numbers that appear earliest in the array that add up to the number. I would like to know, performance-wise, what could help this function run faster. It has to process a list of like 10,000,000 items in under 6 seconds. I have refactored it a few times now, but still not getting there.

What is the best array iteration method for speed? I assumed for loops would be the slowest so I chose map. Is there a faster way? every()?

NOTE: the provided array could have duplicate, positive, or negative numbers (let's say up to 1000000...for now).

var low_pair = function (ints, s) {
    var lowNum = ints.length, lowMatch, highNum, clone = [], i;

    for (i = 0; i < ints.length; i++) {
        clone[i] = ints.map(function (n, ind) {
            if (ind !== i && ints[i] + n == s) {
                i > ind ? highNum = i : highNum = ind;
                if (highNum < lowNum) {
                    lowNum = highNum;
                    lowMatch = [ints[i], ints[ind]];
                }
            }
        });
    }
    return lowMatch;
};
tpie
  • 6,021
  • 3
  • 22
  • 41
  • 3
    You should try the for loop, just in case ; ). – Teemu Apr 07 '15 at 18:17
  • Yea, I am timing different methods now... – tpie Apr 07 '15 at 18:17
  • I might be mistaken, but aren't you returning the latest pair with this piece of code? You never break your `for` loop, shouldn't `lowMatch` be returned right where you assign a value to it? (In this case, maybe `map` should be replaced with a loop to allow it to return something in the scope of the `low_pair` function) – blex Apr 07 '15 at 18:20
  • 5
    This also depends on the nature of your data. Are there duplicates? Are they well-distributed? Are they all positive? Etc. Many possible optimizations are data-dependent. – ryanyuyu Apr 07 '15 at 18:21
  • Not in this case. In this version of the function, it is possible for a lower match to appear later on. For example: s = 10 ints = [2, 3, 4, 5, 5, 7, 8] The first match found would be [2, 8], but the earliest occurring match would technically be [5, 5] – tpie Apr 07 '15 at 18:23
  • How exactly do you want to search for the pair that add up to the number? Do you check entry 1+2, and if that doesn't add up, 2+3, then 3+4, etc... Or do you check entry 1 with every other entry to see if it adds up, and then 2 with every other entry (but 1)? – myfunkyside Apr 07 '15 at 18:24
  • That is how this version does it. The challenge arises because you have to return the earliest occurring pair. It isn't efficient. I just ran a timer on 10000 item list and its already over 10 seconds. Something major needs to change... – tpie Apr 07 '15 at 18:28
  • 1
    You answered @myfunkyside question of 'a' or 'b' with a yes. Is it 1+2, then 2+3, then 3+4, or 1+2, 1+3, 1+4 etc.... – ganders Apr 07 '15 at 18:44
  • iterating through the second digit up to array position of the first digit, then increasing the first digit by one and resetting the second digit position to zero seems to be the fastest way I have found. The code in my OP is probably one of the most inefficient ways of doing it because you have to iterate through the entire array once for each array item. – tpie Apr 07 '15 at 18:48
  • This is how I'm seeing your problem: http://jsfiddle.net/hr46zzqk/ Finding the earliest 2 digits that add up to `s`. What did I not understand? – blex Apr 07 '15 at 18:52
  • @tpie: Uh, do you really have 10m values that are *digits* (from 0 to 9) only? If so, that drastically changes the answer. – Bergi Apr 07 '15 at 19:04
  • actually it didn't...see my function and comment below. – tpie Apr 07 '15 at 19:12
  • @tpie: You should adjust your question then. The term "digit" usually refers to letters, and implies that your array contains only single-digit integers. – Bergi Apr 07 '15 at 19:25

4 Answers4

2

We are going to create a function that returns the earliest pair of elements that add up to the needed sum:

function find_pair(l, s) {
    var min_indexes = {};
    for (var i = 0; i < l.length; i++) {
        var a = l[i];
        if ((s - a) in min_indexes)
            return [s - a, a];
        if (!(a in min_indexes))
            min_indexes[a] = i;
    }
}

For this purpose, for every number we process, we store its minimum index. If we currently process number a, we check if s - a has its minimum index set. If yes, this means we found our wanted sum and we return both elements.

For example:

> var l = [2, 3, 4, 5, 5, 7, 8]
> find_pair(l, 10)
[5, 5]
> find_pair(l, 6)
[2, 4]
> find_pair(l, 5)
[2, 3]
> find_pair(l, 15)
[7, 8]
> find_pair([5, 9, 13, -3], 10)
[13, -3]
JuniorCompressor
  • 19,631
  • 4
  • 30
  • 57
1

What is the best array iteration method for speed?

See What's the fastest way to loop through an array in JavaScript? for that. But notice the answers there might be deprecated, and current engines are better at optimising different things. You should always benchmark yourself, in your own target environment.

However, instead of looking for raw speed and microoptimisations, you should try to improve your algorithm. In your case, you can double the speed of your function by simply starting the inner loop at i so you don't visit all combinations twice. Also, by returning early from the function you can speed up the average case (depending on your data). To find the "earliest pair" you don't have to loop through the entire array and calculate a minimum, you just have to iterate the pairs in the chosen order. If the data is ordered (or at least skewed to some distribution) you could take advantage of that as well.

I'd use

function firstPair(ints, s) {
    var len = ints.length;
    for (var end = 0; end < len; end++)
        for (var i=0, j=end; i<end; i++)
            if (i != --j && ints[i]+ints[j] == s)
                return [i, j];
    for (var start = 0; start < len; start++)
        for (var i=start, j=len; i<len; i++)
            if (i != --j && ints[i]+ints[j] == s)
                return [i, j];
    return null;
}

As suggested by the other answers, if the range of the values in your array is limited, you could drastically reduce the complexity of your algorithm by using a lookup table - trading memory for performance. Using a bitmap for already-occured integers, it could look like this:

function firstPair(ints, s) {
    var map = []; // or, if domain is known and typed arrays are supported, use
//  var map = new Uint16Array(Math.ceil(domainSize / 16));
    for (var i=0; i<ints.length; i++) {
        var x = ints[i],
            r = s - x;
        if (map[r >> 4] & (1 << (r & 0xF))) // get
            return [r, x];
        map[x >> 4] |= 1 << (x & 0xF); // set
    }
    return null;
}
Community
  • 1
  • 1
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
0

The main problem is that your current function complexity is O(n^2) which is way too high for a 10000000 element array. The map function iterates through the entire array. So you make 10000000 * 10000000 = 100 trillion "operations". The complexity needs to be decreased. My best guess -> use a hash table within a linear loop. Below is my example code with a worst case test of 10 million elements that runs in around 8 seconds on my old machine. It makes only 10 million runs instead of 100 trillion.

<!DOCTYPE html>
<html>
<head>
<script type="text/javascript">
var low_pair = function (ints, s) {
    var found = {};
    var lowMatch;
    for (var i = 0; i < ints.length; i++) {
        var num = ints[i];
        var prevNum = s-num;
        if (found[prevNum] === true){
            if (prevNum>num){
                lowMatch = [num, prevNum];
            } else {
                lowMatch = [prevNum, num];
            }
            break;
        } else {
            found[num] = true;
        }
    }
    return lowMatch;
};
var test_array_size = 10000000;
var test_array = new Array(test_array_size);
for (var i=0;i<test_array_size;++i){
    test_array[i] = test_array_size-i;
}
console.log("Array initialized");
var start = new Date().getTime();
console.log(low_pair(test_array, 12));
var end = new Date().getTime();
console.log("Running time: "+(end-start)+" ms");
</script>
<head>
<body>
</body>
</html>
Kostadin
  • 16
  • 3
0

This function consistently runs a 50,000,000 items array in 0ms:

var low_pair = function (s) {
var ints = [];
for(var i = 0; i < 50000000; i++) {
    ints.push(Math.floor(Math.random() * 9));
}
console.time('pair');
var counter = 1;
for (var i = 0; i < ints.length; i++) {
    var sum = ints[i] + ints[counter];
    if (i !== counter) {
        if (sum === s) {
            console.timeEnd('pair');
            return console.log([ints[i], ints[counter]]);
        }
    }
    if (i == counter) {
        counter++;
        i = -1;
    }
}
console.time('pair');
console.log( undefined);
};
tpie
  • 6,021
  • 3
  • 22
  • 41
  • 1
    Why doesn't your array contain nines? – Bergi Apr 07 '15 at 19:06
  • good catch. It's just a random number I entered. I did retest the function with random numbers up to 99999 and it runs 50mil at 1ms. Random numbers up to 1mil at less than 100ms. When the random numbers get up to 10mil the performance starts to go down. I had results of 6 and 8 seconds. – tpie Apr 07 '15 at 19:11