
I discovered a weird behavior in Node.js/Chrome/V8. It seems this code:

var x = str.charCodeAt(5);
x = str.charCodeAt(5);

is faster than this:

var x = str.charCodeAt(5); // x is not greater than 170
if (x > 170) {
  x = str.charCodeAt(5);
}

At first I thought maybe the comparison is more expensive than the actual second call, but when the body of the `if` block does not call `str.charCodeAt(5)`, the performance is the same as with a single call.

Why is this? My best guess is that V8 is optimizing/deoptimizing something, but I have no idea how to figure this out exactly or how to prevent it from happening.

Here is a link to a jsperf that demonstrates this behavior pretty well, at least on my machine: https://jsperf.com/charcodeat-single-vs-ifstatment/1
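
For local testing outside of jsperf, the comparison boils down to something like this standalone script (a minimal sketch; the test string and iteration count are arbitrary, and only the relative timings matter):

```js
// Standalone version of the comparison (sketch). Returning x makes it
// slightly harder for V8 to drop the calls as dead code.
var str = "Hello world, this is a test string!";

function twice(s) {
  var x = s.charCodeAt(5);
  x = s.charCodeAt(5);
  return x;
}

function withIf(s) {
  var x = s.charCodeAt(5); // 32 here, i.e. not greater than 170
  if (x > 170) {
    x = s.charCodeAt(5);
  }
  return x;
}

console.time("twice");
for (var i = 0; i < 1e8; i++) twice(str);
console.timeEnd("twice");

console.time("withIf");
for (var i = 0; i < 1e8; i++) withIf(str);
console.timeEnd("withIf");
```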



Background: I discovered this because I was trying to optimize token reading inside babel-parser.

I tested and found that `str.charCodeAt()` is twice as fast as `str.codePointAt()`, so I thought I could replace this code:

var x = str.codePointAt(index);

with

var x = str.charCodeAt(index);
if (x >= 0xaa) {
  x = str.codePointAt(index);
}

But the second version does not give me any performance advantage, because of the behavior described above.
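
For comparison, a version that computes the code point by hand from the UTF-16 code units, instead of calling codePointAt at all, would look roughly like this (a sketch; the range checks are the standard UTF-16 surrogate-pair rules, and the function name is just for illustration):

```js
// Sketch: decode a code point manually, avoiding codePointAt() entirely.
function myCodePointAt(s, index) {
  var first = s.charCodeAt(index);
  // Lead surrogate followed by a trail surrogate? Combine the pair.
  if (first >= 0xd800 && first <= 0xdbff && index + 1 < s.length) {
    var second = s.charCodeAt(index + 1);
    if (second >= 0xdc00 && second <= 0xdfff) {
      return (first - 0xd800) * 0x400 + (second - 0xdc00) + 0x10000;
    }
  }
  return first; // BMP character (or unpaired surrogate)
}
```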

danez
  • In Firefox your second option is faster than the first one (for Latin characters). Looks like something is buggy in V8. – Dmitry Jan 13 '19 at 08:24
  • Is it `var` or `const`? Is it `charCodeAt` or `codePointAt`? The snippets in your question and those in your jsperf screenshot do totally different things. – Bergi Jan 13 '19 at 09:57
  • Yes, `var y = str;` (where `y` is never used afterwards) can trivially be recognised as dead code, and so is the side-effect-free comparison in the condition of an empty `if` statement. – Bergi Jan 13 '19 at 10:00
  • If you are optimising babel-parser, then use that as a proper benchmark, instead of [trying to do micro-benchmarks](https://mrale.ph/blog/2012/12/15/microbenchmarks-fairy-tale.html). – Bergi Jan 13 '19 at 10:01
  • Your first code block is a TypeError in any complying environment. You can't assign to a `const` other than in the (required) initializer. – T.J. Crowder Jan 13 '19 at 10:35
  • Re your underlying question of whether to micro-optimize looking at the next code point: If your stream is overwhelmingly code unit = code point, it does seem there's a *slight* advantage. `charCodeAt` + `codePointAt` only if the unit was a surrogate is faster in the case where it isn't one (of course), and slower in the case where it is. Faster still is doing the calc yourself (since you already have the first unit). Oddly, even wrapping that in a function is faster. (In a [synthetic test](https://jsperf.com/second-guessing-codepointat)...) Which suggests I'm not allowing for something... :-) – T.J. Crowder Jan 13 '19 at 11:34

1 Answer


V8 developer here. As Bergi points out, don't use microbenchmarks to inform such decisions, because they will mislead you.

Seeing a result of hundreds of millions of operations per second usually means that the optimizing compiler was able to eliminate all your code, and you're measuring empty loops. You'll have to look at the generated machine code to see if that's what's happening.
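
On Node, one way to start looking is with V8 flags (a sketch; the flags are real V8 flags, but their output and availability vary across versions):

```js
// bench.js -- sketch for inspecting what V8 does with a function.
// Run with:
//   node --trace-opt --trace-deopt bench.js   # log (de)optimization events
//   node --print-opt-code bench.js            # dump optimized machine code
// (--print-opt-code only produces output if the V8 build has the
// disassembler enabled.)
function withIf(s) {
  var x = s.charCodeAt(5);
  if (x > 170) {
    x = s.charCodeAt(5);
  }
  return x;
}

for (var i = 0; i < 1e7; i++) withIf("Hello world");
```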

When I copy the four snippets into a small stand-alone file for local investigation, I see vastly different performance results. Which of the two results is closer to your real-world use case? No idea. And that kind of makes any further analysis of what's happening here meaningless.

As a general rule of thumb, branches are slower than straight-line code (on all CPUs, and with all programming languages). So (dead code elimination and other microbenchmarking pitfalls aside) I wouldn't be surprised if the "twice" case actually were faster than either of the two "if" cases. That said, calling String.charCodeAt could well be heavyweight enough to offset this effect.

jmrk
  • Thanks for your answer and the details. I wasn't blindly trying to micro-optimize, but trying to see if the tokenizer can be improved, as it is one of the slowest parts. It is scanning over every single character with `.codePointAt`, whereas most files won't contain much Unicode. I will keep investigating. – danez Jan 15 '19 at 07:02
  • Micro*optimizing* is fine and appropriate for performance-critical code; it's micro*benchmarking* that tends to be misleading (unless you've verified that you're actually measuring what you think you're measuring). I'd recommend using some real-world case as a benchmark (i.e. take a bunch of real code and run it through your real tokenizer), and using *that* framework to evaluate different implementation strategies inside the tokenizer. If you can't see a difference under those circumstances, then it doesn't make a difference. – jmrk Jan 15 '19 at 18:57
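
For illustration, the kind of real-world benchmark suggested in that last comment could be as small as this (a sketch; `parse` is `@babel/parser`'s actual entry point, but the file path and iteration count are placeholders):

```js
// Sketch: benchmark the actual parser on actual input instead of a
// micro-benchmark. The file path stands in for a real module.
const fs = require("fs");
const { parse } = require("@babel/parser");

const source = fs.readFileSync("./some-real-world-module.js", "utf8");

console.time("parse x100");
for (let i = 0; i < 100; i++) {
  parse(source, { sourceType: "module" });
}
console.timeEnd("parse x100");
```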