3

Consider an illustrative example:

function addSafely(weakset, value) {
    if (canBeAddedToWeakSet(value)) {
        weakset.add(value);
    }
} 

So, what do you think is a concise and correct way to implement canBeAddedToWeakSet function in this case? Performance is a nice to have too.

I use the following implementation right now but I'm not sure if it is a complete one (i.e. covers all cases):

function canBeAddedToWeakSet(value) {
    return !!value && (typeof value === 'object' || typeof value === 'function');
}

UPD: So, here are some benchmark results for the ones who are curious about the performance.

noomorph
  • 837
  • 1
  • 6
  • 14
  • Is there a particular reason you don't simply try it, catch the exception and handle the error? "Look before you leap" vs. "ask for forgiveness"… – deceze Sep 21 '17 at 09:41
  • You see, this is a very-very frequent operation in my case and I'd prefer to care about performance on various browsers.But, generally, `try-catch` is also fine, thanks for mentioning. Also, I'm asking out of a theoretical interest. – noomorph Sep 21 '17 at 09:50
  • 1
    @deceze while that would seem to be the easier option, using `try` `catch` destroys most forms of possible optimization that the JIT compiler could do in that function scope, so in this case "look before you leap" would have the benefit of being more performant. – Patrick Roberts Sep 21 '17 at 09:58
  • Fair enough… :) – deceze Sep 21 '17 at 09:59
  • 3
    @deceze, I've checked in the benchmarks above (see the results) and try-catch is really-really bad there (~50x slower). To be honest, I never suspected it hits performance so much... I thought this is a myth nowadays and that the try-catch performance is fixed in the latest versions. But no. Still slow :) – noomorph Sep 21 '17 at 11:43
  • 1
    @noomorph in terms of performance, modifying the es6 weakset approach with short circuiting instead of the ternary operator seems to be the fastest, if you don't care about the two edge cases mentioned in my answer: https://measurethat.net/Benchmarks/ShowResult/6567 – Patrick Roberts Sep 21 '17 at 14:53
  • @PatrickRoberts, thanks. I've added your short circuiting approach to the benchmark results in the question details. By the way, it is not so fast on Firefox. So, from what I see the most balanced are: *canBeAddedToWeakSet_from_es6_weakset* and *canBeAddedToWeakSet_3eq_typeof_x2*. And special thanks for looking into `document.all` thing! I appreciate that very much! – noomorph Sep 28 '17 at 07:18

3 Answers3

2

Refer to ECMAScript 2015 (6th Edition, ECMA-262) WeakSet.add()

23.4.3.1 WeakSet.prototype.add ( value )

The following steps are taken:

  1. Let S be the this value.

  2. If Type(S) is not Object, throw a TypeError exception.

  3. If S does not have a [[WeakSetData]] internal slot, throw a TypeError exception.

  4. If Type(value) is not Object, throw a TypeError exception.

  5. Let entries be the List that is the value of S’s [[WeakSetData]] internal slot.

  6. Repeat for each e that is an element of entries,

    • If e is not empty and SameValue(e, value) is true, then
      • Return S.
  7. Append value as the last element of entries.

  8. Return S.

As item 4 said, it checks the type (ECMAScript Language Types) of value is Object or not, because typeof null is 'object' too, so function canBeAddedToWeakSet should be:

function canBeAddedToWeakSet(value) {
    return value !== null && (typeof value === 'object' || typeof value === 'function');
}
kite.js.org
  • 1,599
  • 10
  • 11
2

According to Table 35 of the ECMAScript Specification, Section 12.5.5.1, the typeof operator is allowed to return implementation-dependent strings for Host objects, which are Type Object, and although implementations are encouraged to return "object" in those cases, they are not required to do so.

Implementations are discouraged from defining new typeof result values for non-standard exotic objects. If possible "object" should be used for such objects.

An interesting exception to Table 35 is the case of document.all. The willful violation of the specification was documented with this reasoning:

This violation is motivated by a desire for compatibility with two classes of legacy content: one that uses the presence of document.all as a way to detect legacy user agents, and one that only supports those legacy user agents and uses the document.all object without testing for its presence first.

My recommendation would be to verify that the typeof value does not return the subset of strings that are guaranteed to not satisfy Type(value) of Object, and optionally check for the existence of document.all:

function canBeAddedToWeakSet(value) {
  switch (typeof value) {
    case 'undefined':
      return value !== undefined
    case 'boolean':
    case 'number':
    case 'string':
    case 'symbol':
      return false
    case 'object':
      return value !== null
    default:
      return true
  }
}
Patrick Roberts
  • 49,224
  • 10
  • 102
  • 153
  • 1
    Thanks for bringing up this version. Looks really good too. Nice remark regarding `document.all`, by the way. – noomorph Sep 21 '17 at 10:31
2

Disclaimer: This is micro-optimization. Performance is the enemy to every other quality. In some situations it needs to be prioritized, but most of the time, you should prioritize simplicity + readability.

If you really need to sanity check before inserting into WeakSet...

The specs for WeakSet say that only an Object can be inserted into a WeakSet. So the question is really: "How do you check if a value is an Object or not?".

Following the implementation for es6.weak-set.js in the core-js library (which babel-polyfill uses), there's a transitive reference to _is-object.js, which checks if a value is an object or not:

module.exports = function (it) {
    return typeof it === 'object' ? it !== null : typeof it === 'function';
};

So, rest assured, your canBeAddedToWeakSet is just about as good as it gets (although, I'd consider this ternary optimization and renaming it to isObject).

Update: You may also want to consider the logical optimization mentioned in the comments (high-five @Patrick Roberts!).

When should you not sanity check before inserting into [any collection]?

If you have a strict real-time performance requirement, you may want to consider avoiding this check on every insert. If this is your case, here are your options:

1) Check on every insert

Pros: Protects the caller form handling the erroneous value

Cons: Performance cost of doing the check and branch prediction

2) Try / Catch

I would be concerned about setting up the overhead of exception handling with a function that needs high performance...

Pros: Protects the caller from handling the erroneous value

Cons: High performance cost of try/catch overhead (worse than if statement)

3) Let the caller handle the erroneous value

This tactic is about reducing the responsibility of the performance critical function. For some callers, this check may be totally unnecessary. For instance, in one domain, I may know for a fact that I will only ever have objects or nulls. In that case calling add(myWeakset, value || {}) is my best performance option.

Nevertheless, some domains may need the ability to addSafely, so it might be beneficial to expose both methods...

Pros: Caller gets to maximize performance for their domain

Cons: Risk of irresponsible caller and increase in complexity

souldzin
  • 1,428
  • 11
  • 23
  • 1
    Not that it would make any difference, but an interesting observation is that you could change the ternary `? ... :` to `&& ... ||` and the logical short circuiting would implement the same branching condition as the ternary operator. – Patrick Roberts Sep 21 '17 at 14:13
  • Actually, to my surprise, using short circuiting instead of the ternary operator resulted in an approximately 17% increase in performance: https://measurethat.net/Benchmarks/ShowResult/6567 – Patrick Roberts Sep 21 '17 at 14:51
  • @PatrickRoberts, nice catch + thanks for running the benchmark! – souldzin Sep 21 '17 at 14:56