16

I wanted to write a function to find a contiguous subarray within a given array from a given starting index and return the index of the subarray within the array if it's found, and -1 if it's not found. This is similar to String.indexOf, but for arrays and subarrays instead of strings and substrings.

This is my working code:

var find_csa = function (arr, subarr, from_index) {
    if (typeof from_index === 'undefined') {
        from_index = 0;
    }

    var i, found, j;
    for (i = from_index; i < 1 + (arr.length - subarr.length); ++i) {
        found = true;
        for (j = 0; j < subarr.length; ++j) {
            if (arr[i + j] !== subarr[j]) {
                found = false;
                break;
            }
        }
        if (found) return i;
    }
    return -1;
};

And these are my tests and their expected values:

console.log(find_csa([1, 2, 3, 4, 5], [2, 3, 4]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], [5]) === 4);
console.log(find_csa([1, 2, 3, 4, 5], [1, 3]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], [42]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], []) === 0);
console.log(find_csa([3, 4, 3, 4, 3, 4], [3, 4, 3], 1) === 2);
console.log(find_csa([6, 6, 6, 7], [6, 6, 7]) === 1);
console.log(find_csa([12, 9, 16, 42, 7, 866, 3], [16, 42, 7, 866]) === 2);

My code passes the tests, but as you can see, it uses a boolean value found in the inner loop which is just my messy, ad-hoc way of continuing an outer loop from a nested loop. is there a cleaner way of writing it? I looked into Array.prototype.findIndex but it is an experimental technology at the moment so I can't use it. I want a method that works in most browsers. I know there is a "polyfill" code snippet written on the Mozilla page, but that is even longer than my current code and it will be slower due to the function calls, so I'd rather avoid it.

My primary goal for this function is performance (the subarrays will be very small, so I believe that using Boyer-Moore string search algorithm or tries is a tad overkill), and then my secondary goal is elegance of my implementation. With those two goals in mind, I would like to know if there is a better way of writing this code, or if there are any JavaScript features or functions that I'm missing that could help me avoid the found boolean.

JSFiddle if it helps anyone: http://jsfiddle.net/qc4zxq2p/

Shashank
  • 13,713
  • 5
  • 37
  • 63
  • What about 1. `JSON.stringify` 2. `array.prototype.slice` – zerkms Apr 03 '15 at 03:52
  • Don't join with `''` but join with `,` instead. String.indexOf would still work but it will avoid interpreting `[2,11,5]` and `[21,15]` as the same thing – slebetman Apr 03 '15 at 04:00
  • @zerkms, I could see JSON.stringify working to find the existence of a contiguous subarray, but do you know how you would retrieve its index? I'm guessing you would have to split on commas or something..but then you would be back to square 1. At your number 2 method, I fail to see how `Array.prototype.slice` would help. I understand you can use it to convert Strings into arrays using `call` but that is not the problem. – Shashank Apr 03 '15 at 04:01
  • @Shashank I've come up with a different approach. See if it satisfies your need – mohamedrias Apr 03 '15 at 04:03
  • @slebetman Good point...but I want the index of the subarray within the original array. Not the index of the substring within the string... `[12, 44, 65].join(',')` yields `'12,44,65'` but then searching for `[44,65].join(',')` using `String.indexOf` gives you 3 since it starts at the first character, not 1. I've removed the `Array.prototype.join` method from my question because I don't think it will work for values that translate to different length strings like 9999 for example. – Shashank Apr 03 '15 at 04:04
  • Is performance really a requirement? If it is don't do it this way. You can do this much faster - I doubt string.indexOf runs at `O(n^2)` - it probably uses a trie. – Benjamin Gruenbaum Apr 03 '15 at 12:39
  • @BenjaminGruenbaum Well the algorithm is `O(n*m)` where `n` is the length of the haystack and `m` is the length of the needle. As stated in the question, the "needles" will be very small, 4-5 elements at most and the haystacks will not be too large either. That's why I'm not bothering with anything advanced for now, as it might actually make things slower with such a small `m`, though I may consider it in the future. Thank you for the suggestion though, I'm interested in learning more about tries and how they can be used for search. – Shashank Apr 03 '15 at 16:40

7 Answers7

15

Are there any JavaScript features or functions that I'm missing that could help me avoid the found boolean

Yes, you can use a label on your outer loop:

function find_csa(arr, subarr, from_index) {
    var i = from_index >>> 0,
        sl = subarr.length,
        l = arr.length + 1 - sl;

    loop: for (; i<l; i++) {
        for (var j=0; j<sl; j++)
            if (arr[i+j] !== subarr[j])
                continue loop;
        return i;
    }
    return -1;
}
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • Nice, but I have two questions, what is the `>>>` syntax that you're using? Is that a standard way to set default argument values? Also is the use of labels considered bad practice? The reason I ask is because it says to avoid using them here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label – Shashank Apr 03 '15 at 04:16
  • 3
    Not standard, `>>> 0` is a hacky way of saying "convert to integer". `x >>> y` is "shift `x` to the right `y` bits"; if you shift 0 bits, it's the same thing you had before, but since `>>>` only works on integers you get the sideeffect of coercion to an integral value. It's probably the fastest way of doing it, along with `x|0` (which does the same thing, AFAIK). Of course, assuming that your arguments are already integers is even faster, if you know the caller will obey the contract. :) – Amadan Apr 03 '15 at 04:17
  • 1
    @Shashank: The `>>>` unsigned bitshift is used to [cast the `from_index` to an array-length integer](http://stackoverflow.com/q/3081987/1048572). Labels are totally safe, it seems some people don't understand how they work and try to avoid this feature. – Bergi Apr 03 '15 at 04:19
  • 1
    @Bergi I don't think it's because people don't understand them. It's probably because code that *has* to be written using labels to be readable contains already too much loop nesting and other control flow to be unreadable by itself, so they are suggesting to refactor the code, possibly splitting it in functions. Like, in C goto is a perfectly valid statement... in the very *very* few cases where it improves the code, while in the rest of the cases is a signal for refactoring spaghetti code. – Bakuriu Apr 03 '15 at 15:17
  • @Bakuriu: Yeah, maybe. To be really clear, it should probably be written like in Radiodef's answer. – Bergi Apr 03 '15 at 18:19
8

This is the same as yours, just prettified a bit (at least to my aesthetics):

var find_csa = function (arr, subarr, from_index) {
    from_index = from_index || 0;

    var i, found, j;
    var last_check_index = arr.length - subarr.length;
    var subarr_length = subarr.length;

    position_loop:
    for (i = from_index; i <= last_check_index; ++i) {
        for (j = 0; j < subarr_length; ++j) {
            if (arr[i + j] !== subarr[j]) {
                continue position_loop;
            }
        }
        return i;
    }
    return -1;
};
Amadan
  • 191,408
  • 23
  • 240
  • 301
  • 2
    I like this answer. I did not know there were labels in JS. That certainly does clean up my code a bit. Though on the [Mozilla documentation for labels](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label) it warns against using them. Is it considered bad practice? – Shashank Apr 03 '15 at 04:11
  • 2
    http://stackoverflow.com/questions/46496/should-i-avoid-using-java-label-statements - tl;dr: if the program is more understandable and more concise when using a label, use a label (but it often isn't). – Amadan Apr 03 '15 at 04:14
4

The inner loop can be reduced to a single line using the array method every:

if(subarr.every(function(e, j) { return (e === arr[i + j]); })
    return i;

or (ES6 proposal):

if(subarr.every( (e, j) => (e === arr[i + j]) ))
    return i;

But this may be just a curiosity or educational example, unless you don't care about performance.

Radiodef
  • 37,180
  • 14
  • 90
  • 125
1

Inside your loop, you can eliminate the found variable and avoid continue like this:

for (j = 0; j < subarr.length; ++j) {
    if (arr[i + j] !== subarr[j]) break;
}
/*
 * the above loop breaks in two cases:
 * normally: j === subarr.length
 * prematurely: array items did not match
 * we are interested in kowing if loop terminated normally
 */
if (j === subarr.length) return i;

Having said that, here is my solution using Array.join and String.indexOf. This is only good for array of Numbers:

function find_csa(arr, subarr, from_index) {
    from_index |= 0;
    if (subarr.length === 0) {
        return from_index;
    }
    var haystack = "," + arr.slice(from_index).join(",") + ",",
        needle = "," + subarr.join(",") + ",",
        pos = haystack.indexOf(needle);
    if (pos > 0) {
        pos = haystack.substring(1, pos).split(",").length + from_index;
    }
    return pos;
}
console.log("All tests should return true");
console.log(find_csa([1, 2, 3, 4, 5], [1, 2, 3]) === 0);
console.log(find_csa([1, 2, 3, 4, 5], [2, 3, 4]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], [5]) === 4);
console.log(find_csa([1, 2, 3, 4, 5], [6]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], [1, 3]) === -1);
console.log(find_csa([6, 6, 6, 7], [6, 6, 7]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], []) === 0);
console.log(find_csa([3, 4, 3, 4, 3, 4], [3, 4, 3], 1) === 2);
console.log(find_csa([1, 2, 3, 4, 5], [], 1) === 1);
Salman A
  • 262,204
  • 82
  • 430
  • 521
  • If the assumption is made that the array elements are all Numbers, this approach will work very well. +1 – Shashank Apr 03 '15 at 20:38
1

Reading initial discussion based on zerkms proposition, I was interested to try a solution using JSON.stringify, despite the unfavorable opinions.

Then I finally got a solution, which passes all tests properly.
Probably not the faster method, but surely the shortest one:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      needle=JSON.stringify(subarr),
      matches=JSON.stringify(arr.slice(start)).
      match(new RegExp('^\\[(.*?),?'+
        needle.substr(1,needle.length-2).replace(/([\[\]])/g,'\\$1')
      ));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}

The above code accepts arrays of arrays, as suggested by Shashank, but fails to process items containing commas.

So I developped another solution which also accepts commas (thanks to Steven Levithan for the elegant tip about while(str!=(str=str.replace(regexp,replacement)));).

But it is only for fun, since:

  • the code is not so short, now... Sigh!
  • it probably consumes a lot of CPU time
  • it doesn't properly work with empty items (they are ignored)
  • I suspect (and didn't dig deeper :-) it might fail with complex objects as items

Anyway, here is it:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      commas=new RegExp('(?:(\')([^,\']+),([^\']+)\'|(")([^,"]+),([^"]+))"'),
      strip_commas='$1$2$3$1$4$5$6$4',
      haystack=JSON.stringify(arr.slice(start)),
      needle=JSON.stringify(subarr).replace(/^\[(.*)\]$/,'$1');
  while(haystack!=(haystack=haystack.replace(commas,strip_commas)));
  while(needle!=(needle=needle.replace(commas,strip_commas)));
  matches=haystack.match(new RegExp('^\\[(.*?),?'+needle.replace(/([\[\]])/g,'\\$1')));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}
cFreed
  • 4,404
  • 1
  • 23
  • 33
  • This is pretty impressive and unique as a solution. It fails when you're testing arrays of strings with commas in them, or arrays of arrays, for obvious reasons, but for arrays of numbers it seems to be perfect. – Shashank Apr 03 '15 at 20:35
  • @Shashank: you're right! I didn't pay attention to more than the test examples. I edited my solution to accept arrays of arrays. Hope to come back later for commas :-) – cFreed Apr 03 '15 at 21:34
  • @Shashank: Well, it now accepts commas too... but code became quite dirty. Not a real solution, actually. – cFreed Apr 04 '15 at 13:44
  • Yeah...that's a painful way to do it, but it seems like a good learning experience. – Shashank Apr 04 '15 at 17:22
0

i fixed this question with this code:

 getCount(arr)
{
  const chunked = [];

  for(let i=0; i<arr.length; i++) chunked[i] = [];
  let sub = 0;
  for (let i = 1;  i < arr.length; i++) {
    if (arr[i]>arr[i-1]) {
      chunked[sub].push(...[arr[i-1],arr[i]]);
    } else {
      sub++
    }
  }
  const chunked2 = [...chunked.filter(k=>k.length).map(k => [...new Set(k)])];

  for (let i = 0; i < chunked2.length; i++) {
    if (chunked2[i+1])
    if( chunked2[i][chunked2[i].length - 1] > chunked2[i+1][0]) {
      chunked2[i+1].shift();
    }
  }
  return [].concat.apply([], chunked2).lenght;
}
Ehsan
  • 308
  • 6
  • 17
0

I handle the outlier instances at the top and only looped once, checking to see if the current element was the same as the first element of the subarr.

If it was, I created a subarray to match the length of subarr, starting with the current index. I then used stringify to check if my created subarray matched subarr. If it did or the length was only 1 (which was already confirmed to match above), I returned the current index.

If no match was found -1 was returned.

function find_csa(arr, subarr, from_index) {
    
    if (typeof from_index === 'undefined') {
        from_index = 0;
    }
    if (subarr.length == 0) {
            return 0
    }

    
    for (let i = from_index; i < arr.length; i++) {
        if (arr[i] == subarr[0]) {
            if (subarr.length == 1 
                    || JSON.stringify(arr.slice(i, i + subarr.length))===JSON.stringify(subarr)) {
                return i
            }
        }
    }
    return -1
}
Meg
  • 11
  • 1