This came as a huge surprise to me, and I'd like to understand the result. I made a test in jsPerf that takes a string (part of a URL that I'd like to check) and checks for the presence of 4 items (all of which are, in fact, present in the string).

It checks in 5 ways:

  1. Plain indexOf;
  2. Split the string, then indexOf;
  3. Regex search;
  4. Regex match;
  5. Split the string, loop through the array of items, and check whether any of them matches one of the things it's supposed to match.
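For readers without access to the jsPerf page, the five variants can be sketched roughly as follows. This is only an illustration: the input string, the item names, the `/` separator and all function names are assumptions, since the original test code is not shown in the post.

```javascript
// Hypothetical input and items; the real jsPerf test data is not shown.
const path = "shop/books/fiction/sale";
const items = ["shop", "books", "fiction", "sale"];

// 1. Plain String.prototype.indexOf: one substring search per item.
function viaIndexOf(str) {
  return items.map((item) => str.indexOf(item) !== -1);
}

// 2. Split once, then Array.prototype.indexOf: one array scan per item.
function viaSplitIndexOf(str) {
  const parts = str.split("/");
  return items.map((item) => parts.indexOf(item) !== -1);
}

// 3. Regex search: returns the match position, or -1.
function viaSearch(str) {
  return items.map((item) => str.search(new RegExp(item)) !== -1);
}

// 4. Regex match: returns a match array, or null.
function viaMatch(str) {
  return items.map((item) => str.match(new RegExp(item)) !== null);
}

// 5. Split once, then a single loop that compares each part with ===.
function viaSplitLoop(str) {
  const parts = str.split("/");
  const found = [false, false, false, false];
  let i = parts.length;
  while (i--) {
    const part = parts[i];
    if (part === items[0]) found[0] = true;
    else if (part === items[1]) found[1] = true;
    else if (part === items[2]) found[2] = true;
    else if (part === items[3]) found[3] = true;
  }
  return found;
}
```

Note that variants 2 and 5 test for exact segments, while 1, 3 and 4 test for substrings, so the variants are only interchangeable when the items cannot appear inside a longer segment.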

To my huge surprise, number 5 is the fastest in Chrome 21. This is what I can't explain.

In Firefox 14, plain indexOf is the fastest; that one I can believe.

João Pinto Jerónimo
  • Makes sense to me actually... you are iterating over the string only once instead of four times. Of course `indexOf` breaks out earlier, but I could imagine that the string is just too short for this to have an impact (that does not explain the results for Firefox though :-/). – Felix Kling Aug 02 '12 at 09:21
  • Great question. +1, especially for making a test case. – starbeamrainbowlabs Aug 02 '12 at 09:53
  • @FelixKling what do you mean by iterating over the string once instead of 4 times? It does go through the string 4 times to evaluate each of the 4 items... – João Pinto Jerónimo Aug 02 '12 at 10:00
  • Actually I later realised that I meant iterating over the array... I was thinking about your test cases 4 and 5 (4 being the one where you iterate over the array and 5 where you use `indexOf`). `indexOf` always has to iterate over the array until it finds a match, and you do this four times, so you potentially have to iterate over the whole array four times compared to just once with your `while` loop. The same applies to the first test case (iterating over the string four times). – Felix Kling Aug 02 '12 at 10:56
  • I created http://jsperf.com/finding-components-of-a-url/4, which adds some details and provides consistent boolean results. – slevithan Aug 02 '12 at 17:56

2 Answers

I'm also surprised, but Chrome uses V8, a highly optimized JavaScript engine that pulls all kinds of tricks. And the guys at Google probably have the largest set of JavaScript to run to test the performance of their implementation. So my guess is that this happens:

  1. The compiler notices that the array is a string array (the type can be determined at compile time, so no runtime checks are necessary).
  2. In the loop, since you use ===, built-in CPU opcodes for comparing strings (repe cmpsb) can be used. So no functions are being called (unlike in any other test case).
  3. After the first loop, everything important (the array, the strings to compare against) is in CPU caches. Locality rulez them all.

All the other approaches need to invoke functions, and locality might be an issue for the regexp versions because they build a parse tree.
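A guess like this is hard to confirm from the outside, but the two array-based variants can at least be timed side by side. The following is a minimal sketch, assuming hypothetical input data and a hypothetical `timeIt` helper; it relies on `performance.now()`, which is available in browsers and in recent Node.js versions.

```javascript
// Hypothetical inputs; the original jsPerf strings are not shown in the post.
const segments = "shop/books/fiction/sale".split("/");
const wanted = ["shop", "books", "fiction", "sale"];

// Variant A: one Array.prototype.indexOf call (one method call) per item.
function fourIndexOfCalls() {
  let hits = 0;
  for (let i = 0; i < wanted.length; i++) {
    if (segments.indexOf(wanted[i]) !== -1) hits++;
  }
  return hits;
}

// Variant B: a single pass over the array, using only === comparisons.
function singlePass() {
  let hits = 0;
  for (let i = 0; i < segments.length; i++) {
    const s = segments[i];
    if (s === wanted[0] || s === wanted[1] || s === wanted[2] || s === wanted[3]) {
      hits++;
    }
  }
  return hits;
}

// Hypothetical helper: runs fn repeatedly and reports the elapsed milliseconds.
// The results are summed into `sink` so the calls are not trivially dead code.
function timeIt(fn, iterations) {
  let sink = 0;
  const start = performance.now();
  for (let i = 0; i < iterations; i++) sink += fn();
  const elapsedMs = performance.now() - start;
  return { elapsedMs, sink };
}
```

Calling `timeIt(fourIndexOfCalls, 1e6)` and `timeIt(singlePass, 1e6)` gives a rough comparison. Micro-benchmarks like this are sensitive to JIT warm-up and engine version, so the numbers only hint at what the engine is doing.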

Aaron Digulla

I have added two more tests: http://jsperf.com/finding-components-of-a-url/2

The single RegExp is now the fastest (on Chrome). Also, RegExp literals are faster than string literals converted to RegExp.
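The difference between the forms mentioned here can be sketched as follows. The input string and patterns are assumptions, and the exact expressions used in the linked test may differ.

```javascript
// Hypothetical input; the linked test's actual string is not shown here.
const path = "shop/books/fiction/sale";

// String argument: search() converts the string to a RegExp internally.
const viaString = path.search("books") !== -1;

// RegExp literal: the pattern is already a RegExp when search() is called.
const viaLiteral = path.search(/books/) !== -1;

// Single combined RegExp: one alternation instead of four separate patterns.
const combined = /shop|books|fiction|sale/;
const anyPresent = combined.test(path);
```

The combined pattern only reports whether at least one alternative matched, not which ones.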

HBP
  • Interesting fact about regexp literals, but the combined regexp does not serve as an alternative, as I wanted to end up with 4 variables that evaluate to true or false (or 0 or 1) depending on whether the strings were found or not. – João Pinto Jerónimo Aug 02 '12 at 10:03