31

I'd like to write a little library for JavaScript enums. For me to do that, I need to decide how to store the enum values. Therefore, I'd like to use the fastest way when comparing, but I also want something that is debuggable, so I'm torn between using strings or numbers. I know I could use objects too, but that would be another question

For example

// I don't want this because when debugging, you'd see just the value 0
var Planets = {Earth:0, Mars:1, Venus: 2}

// I'd prefer this so that Planets.Earth gives me a nice readable value ("Earth")
var Planets = {Earth: 'Earth', Mars: 'Mars'}

But I'm afraid that when I compare them using if (myPlanet === Planet.Earth), the string comparison could take a lot longer (say if it were in a tight loop). This should be the case because http://ecma-international.org/ecma-262/5.1/#sec-11.9.6 says

If Type(x) is String, then return true if x and y are exactly the same sequence of characters (same length and same characters in corresponding positions); otherwise, return false.

But when I wrote a test case, I found that they take the same amount of time http://jsperf.com/string-comparison-versus-number-comparison/2 so it doesn't seem like it's scanning the whole string.

I know this could be a micro optimization, but my question is: is string equality comparison done using pointers and therefore just as fast as number equality comparison?

Ruan Mendes
  • 90,375
  • 31
  • 153
  • 217
  • 3
    It's like the spec says: two different strings are the same if the sequence of characters is the same. Because strings are immutable, it's *possible* for the runtime system to collapse separate string instances when they're identical, so a straight pointer comparison could be done before a character-by-character comparison. – Pointy May 23 '14 at 19:12
  • That's exactly my question, it seems to be optimized that way, I'm looking to find when is that not true? When will it scan the string? – Ruan Mendes May 23 '14 at 19:13
  • 1
    You'd probably learn more with a jsperf test using *inequality* instead of equality. – Pointy May 23 '14 at 19:13
  • Let me create that too and add that to the jsperf – Ruan Mendes May 23 '14 at 19:14
  • Well the runtime system would have to do a lot of expensive lookups whenever a string concatenation occurs in order to *guarantee* that each string instance is unique, and that all like-valued references point to the same instance. I doubt it does that. It could be that the parser does it for constants in any single block of code of course. – Pointy May 23 '14 at 19:15
  • I guess you could introduce a planet object with Id and Name if you are worried about string vs number comparisons – TGH May 23 '14 at 19:15
  • @TGH I can, but that makes serialization (on the other end) much harder, the question is not about enums, it's about string comparison – Ruan Mendes May 23 '14 at 19:16
  • Serialization of an object would be solved by overriding `.toString()` and `valueOf()`, no? – cookie monster May 23 '14 at 19:19
  • ...as to your question, even if comparison is done with pointers, would you really want to trust it? Who knows if there's some optimization or other operation done at some unexpected time that would cause the memory location to be altered. – cookie monster May 23 '14 at 19:22
  • @cookiemonster `JSON.stringify({a: {valueOf: function() {return 2}, toString: function() {return '2'} }})` outputs `"{"a":{}}"` You would also need a reviver when deserializing. Again, that's not the point of the question, but thanks for the input. I do understand that I'd be relying of internals of the JavaScript engine and this may change. – Ruan Mendes May 23 '14 at 19:22
  • FWIW: `JSON.stringify({a: {toJSON:function() {return 2; }}}); // {"a":2}` But good point about deserializing. – cookie monster May 23 '14 at 19:24
  • http://jsben.ch/O7bNS – EscapeNetscape Nov 13 '17 at 12:19
  • @Pointy Regarding "a lot of expensive lookups": It could compute a small cheap hash from each new string and compare that against stored hashes first. Only if the hash matches does it have to check the rest of the characters, or (if it is a long string) queue the string as a candidate for later checking/collapsing upon some conditions, such as more hits on the same hash. – Elias Hasle Jul 28 '19 at 23:07

3 Answers3

28

String comparison could be "just as fast" (depending on implementation and values) - or it could be "much slower".

The ECMAScript specification describes the semantics, not the implementation. The only way to Know for Certain is to create an applicable performance benchmark on run it on a particular implementation.

Trivially, and I expect this is the case1, the effects of string interning for a particular implementation are being observed.

That is, all string values (not String Objects) from literals can be trivially interned into a pool such that implIdentityEq("foo", "foo") is true - that is, there need only one string object. Such interning can be done after constant folding, such that "f" + "oo" -> "foo" - again, per a particular implementation as long as it upholds the ECMAScript semantics.

If such interning is done, then for implStringEq the first check could be to evaluate implIdentityEq(x,y) and, if true, the comparison is trivially-true and performed in O(1). If false, then a normal string character-wise comparison would need to be done which is O(min(n,m)).

(Immediate falseness can also be determined with x.length != y.length, but that seems less relevant here.)


1 While in the above I argue for string interning being a likely cause, modern JavaScript implementations perform a lot of optimizations - as such, interning is only a small part of the various optimizations and code hoistings that can (and are) done!

I've created an "intern breaker" jsperf. The numbers agree with the hypothesis presented above.

  1. If a string is interned then comparison is approximate in performance to testing for "identity" - while it is slower than a numeric comparison, this is still much faster than a character-by-character string comparison.

  2. Holding the above assertion, IE10 does not appear to consider object-identity for pass-fast string comparisons although it does use a fast-fail length check.

  3. In Chrome and Firefox, two intern'ed strings which are not equal are also compared as quickly as two that are - there is likely a special case for comparing between two different interned strings.

  4. Even for small strings (length = 8), interning can be much faster. IE10 again shows it doesn't have this "optimization" even though it appears to have an efficient string comparison implementation.

  5. The string comparison can fail as soon as the first different character is encountered: even comparing long strings of equal length might only compare the first few characters.


  • Do common JavaScript implementations use string interning? (but no references given)

    Yes. In general any literal string, identifier, or other constant string in JS source is interned. However implementation details (exactly what is interned for instance) varies, as well as when the interning occurs

  • See JS_InternString (FF does have string interning, although where/how the strings are implicitly interened from JavaScript, I know not)

Community
  • 1
  • 1
user2864740
  • 60,010
  • 15
  • 145
  • 220
  • I did create tests :) From my tests, it seems that Firefox interns even dynamically generated strings, at least in some cases and Chrome doesn't. Also IE is just the slowest with sting comparisons whether they're literals or not. – Ruan Mendes May 23 '14 at 20:11
  • @JuanMendes After much failing, I've finally fixed [my jsperf](http://jsperf.com/internbreaker) such that it can demonstrate the situation. – user2864740 May 23 '14 at 22:46
3

There are cases when string comparison can be much slower (comparing dynamically generated strings)

The following is 77% slower (in chrome and IE) than all the other tests

var StringEarth = 'Ear' + 'th';
for (var i = 0; i < ITERATIONS; i++) {
  x = StringPlanets.Venus === StringEarth;
}

The flaw in the tests mentioned in the question is the fact that we are testing against literal strings. It seems that JavaScript is optimized so that string comparison for string literals is done just by testing a pointer. This can be observed by creating the strings dynamically. My best guess is that strings from the literal string pool are marked so that they can be compared using addresses only.

Note that string comparison seems just as fast in FF even for dynamic strings. Also, that it's just as slow for even literal strings.

Conclusion All browsers behave differently so string comparison may or may not be slower.

Ruan Mendes
  • 90,375
  • 31
  • 153
  • 217
2

In general, at best String interning (making a string with a given value into a unique reference or a O(1) comparable symbol) is going to take O(n) time, as it can't do that effectively without looking at all the characters involved.

The question of relative efficiency then amounts to over how many comparisons the interning is going to be amortized.

In the limit, a very clever optimizer can pull out static expressions which build up strings and intern them once.

Some of the tests above, use strings which will have been interned in which case the comparison is potentially O(1). In the case where enums are based on mapping to integers, it will be O(1) in any implementation.

The expensive comparison cases arise when at least one of the operands is a truly dynamic string. In this case it is impossible to compare equality against it in less than O(n).

As applied to the original question, if the desire is to create something akin to an enum in a different language, the only lookout is to ensure that the interning can is done in only a few places. As pointed out above, different Browser use different implementations, so that can be tricky, and as pointed out in IE10 maybe impossible.

Caveat lacking string interning (in which case you need the integer version of the enum implementation give), @JuanMendes' string-based enum implementations will be essentially O(1) if he arranges for the value of the myPlanet variable to be set in O(1) time. If that is set using Planets.value where value is an established planet it will be O(1).

marc meyer
  • 530
  • 6
  • 6