-1

Problems with comparator functions

  1. Behavior is undefined when the function is not transitive, or if it's not antisymmetric.

  2. The expected return value, an int in the set {-1, 0, 1}, is awkward

  3. It's longer than the keyfunc approach and uses imperative style when declarative style would do nicely

In JavaScript

Isn't the underscore function _.sortby(myArray, keyFunc) always better than myArray.sort(comparatorFunc)?

Well, if only there weren't these two problems in JS:

  1. Unfortunately _.sortBy has no "reverse" argument when you want to sort in reverse order of the key.

  2. _.sortBy doesn't treat array keys as having lexicographical-ordering semantics, it just uses the native JS < operator which converts them to a string, so there's no good way to do arbitrary lexicographic orderings.

In Python

The more recently implemented mylist.sort(key=keyfunc, reverse=True/False) seems always better than the older mylist.sort(cmp=comparatorfunc) approach.

Liron Shapira
  • 1,967
  • 1
  • 14
  • 17
  • 3
    If you think including a library is better than a native method, sure it is! Underscore probably does the same thing internally anyway, it just wraps it neatly in a function and adds 5kb. – adeneo Aug 02 '14 at 23:08
  • 2
    No; the Underscore sort by key function is a terrible API mistake. Consider for example how you might use that to sort by a string key in *reverse* order. – Pointy Aug 02 '14 at 23:16
  • How would you use `_.sortBy` with two sort keys? `_.sortBy` cannot possible be "always better" because it doesn't even support all the functionality of `Array.prototype.sort`. – mu is too short Aug 02 '14 at 23:44
  • @user2864740 the burden of proof is on the comparator function because it has a nonstandard return type {-1, 0, 1}, it has undefined behavior when not transitive or not antisymmetric. Just seems like a historical accident to me. – Liron Shapira Aug 02 '14 at 23:44
  • @Pointy It is not a "terrible mistake" at all - while it should have a *supporting* full comparison version, a key-function selector is often sufficient and more clear/concise. Also, arguably the sort order should/could be *separate* from the comparator function. – user2864740 Aug 02 '14 at 23:45
  • @muistooshort at least in Python this is not an issue because you don't really want "two sort keys" you want a lexicographic ordering, so you just have your key function return a tuple of (key1, key2). This is the most elegant way to declaratively express what you want. – Liron Shapira Aug 02 '14 at 23:46
  • @LironShapira Thanks! It's a much better question now. – user2864740 Aug 02 '14 at 23:54
  • 1
    The Python developers agree that key functions beat comparators, which is why `cmp` is gone in Python 3. Underscore's `sortBy` looks significantly less powerful, though, since Javascript doesn't have nice things like lexicographically-compared tuples. – user2357112 Aug 03 '14 at 00:04
  • But this isn't Python so why does it matter how Python does thing? You can't `array1 < array2` in any sensible fashion in JavaScript so `_.sortBy` is very limited. – mu is too short Aug 03 '14 at 00:11
  • @muistooshort Ok yes, if you're hacking up something real quick in JavaScript, you have no built-in support for a keyfunc that returns an array with lexicographic-ordering semantics. But if you're writing anything more than that, I say write/import a library function with a keyfunc, a reverse argument and good lexicographical ordering support, rather than writing a single comparator function in your app-level code. – Liron Shapira Aug 03 '14 at 00:25
  • @user2357112 I know right? I think it needs to be better publicized that comparator functions were a design mistake. – Liron Shapira Aug 03 '14 at 00:26
  • 2
    @LironShapira I don't hold that to be true - and "be better publicized that comparator functions were a design mistake" is an opinion stated too much like a fact. A comparator function provides a stable ordering between two values. This provides an ordering independent of any other tricks or niceties (which I *do* use and appreciate); that is, a comparator function is more general than a key-selector and works across ADTs needing an ordering (how are tuples themselves ordered?). – user2864740 Aug 03 '14 at 01:12
  • @LironShapira In my work, C#, I use *both* comparator's and key-selectors. The form used is based on which is most appropriate (I am thankful there is both!), and the world keeps running – user2864740 Aug 03 '14 at 01:19
  • @user2864740 Show me one real example of when you chose to use a comparator over a keyfunc. – Liron Shapira Aug 03 '14 at 14:37

1 Answers1

1

A sort using a comparator1 is more universal and flexible than one which merely uses a key-selector. However, this does not mean that one approach is necessarily "better" and it comes down to implementation and use.

First off, a comparator has one purpose - to establish ordering between to arbitrary values. This operation is used by the sort function, and such a means of ordering values must exist at some level.

Consider that a key-selector or merely a handy convenience to define such a comparator, where <=> defines some natural ordering (note that <=> is itself a comparator!). Thus, the key-selector is actually an input transformation to a built-in comparator.

def comparator (a, b): 
    return a.x <=> b.x              -- explicit value selection

def comparatorForKeySelector (keyFn, a, b):
    return keyFn(a) <=> keyFn(b)    -- keys selected by function

As stated at the start, the selection comes down to implementation and use.

The first candidate, underscore.js's sortBy happens to be a limited implementation; because the underlying comparator does not understand how to order returned arrays/sequences it is not possible to sort on multiple values without hacks such as string munging. It also fails to provide the ability to reverse the sort. These are implementation issues which could both be addressed with an appropriately modified built-in comparator.

The second key-selector candidate, Python's keyed-sort, can trivially sort on multiple keys, thanks to ordering between tuples. It also allows the sort direction to be specified making it a much better implementation. However, it does not allow sort direction to change between the keys (e.g. sort the first key ascending the second descending), which is also due to the ordering between tuples.

Another key-selector candidate, C#'s LINQ which has neither of the limits above because of chaining with explicit sort directions: eg. people.OrderBy(p => p.Height).ThenByDescending(p => p.Weight). I consider this the "ideal" key-selector implementation, albeit one that relies on some pretty nifty underlying support.

After looking at the three different implementations, noting that they can solve a common class of problems with varying degrees of success and flexibility, I refer back to my introduction: a comparator is more universal than a key-selector and I think the "better" solution is to support both [well], such that a particular problem can be solved in the cleanest way possible.


1 I am using the term comparator to consistent refer to a function - possibly an operator - that establishes ordering between two values. At the primitive level this might be implemented merely with a relational and equality operator pair.


A la carte:

Behavior is undefined when the function is not transitive, or if it's not antisymmetric.

Then the function is broken and should be fixed. Broken code is broken code and cannot be used as an argument against using a comparator.

The expected return value, an int in the set {-1, 0, 1}, is awkward

This is a pretty natural way to express this; many sorts accept "less than zero", "zero" and "more than zero" values. These could be typed more explicitly - eg. with constants - but have no bearing on the semantics of such a function.

It's longer than the keyfunc approach and uses imperative style when declarative style would do nicely

A comparator is not implicitly an "imperative style". In a functional language it will be a "function style". The key-selector is often more terse and clear; but this can be argued for most specialized API.

user2864740
  • 60,010
  • 15
  • 145
  • 220
  • 2
    Strictly speaking, comparators aren't more universal or flexible than key functions, since you can turn any comparator into a key function that returns a wrapper object compared by the given comparator. It's about as conceptually complex as the transformation from a key function to a comparator that extracts the key and compares by it. – user2357112 Aug 03 '14 at 07:57
  • @user2357112 Hmm, I'm not quite sure I follow. My argument is that a key-function *must use* a comparator, and that such a comparator used can be supplied manually - the key-selector being a nice transformation for such. (So there is no reason to go comparator -> key-selector -> comparator.) – user2864740 Aug 03 '14 at 08:01
  • 1
    I'm just pointing out that there's nothing you can do by supplying a comparator to the sort that you can't do by supplying a key function. You can define a `keySelectorForComparator` analogous to your `comparatorForKeySelector`; Python actually [comes with that built-in](https://docs.python.org/2/library/functools.html#functools.cmp_to_key). – user2357112 Aug 03 '14 at 08:14
  • (Well, you've got a keyed comparator there rather than a comparator factory, so it's not quite analogous to `functools.cmp_to_key`. If it was `comparatorForKeyFunction(keyFn)(a, b)` rather than `comparatorForKeyFunction(keyFn, a, b)`, it'd be analogous.) – user2357112 Aug 03 '14 at 08:24
  • 1
    @user2357112 In Python, with a key-selector, how is it possible to sort ascending on one key and descending on another? [This is best I could find](http://stackoverflow.com/questions/11476371/sort-by-multiple-keys-using-different-orderings), but I don't regularly use Python these days. – user2864740 Aug 03 '14 at 08:48
  • @user2357112 Oh, hmm (looking at cmp_to_key). Still, it seems roundabout *and* it uses a comparator: comparator -> key-selector -> comparator (built-in). I can see your point about the general-ness, but it doesn't really use the key-selector except as a "proxy" of the original comparator. – user2864740 Aug 03 '14 at 08:50
  • Generally, you'd use two sorts. If there were a built-in key function for reversed ordering, you'd use `lambda x: (key1(x), reverseorder(key2(x)))`, but we don't have such a built-in, probably for one or both of the following reasons: no one wanted it enough, or it duplicates what you can already do with multiple sorts. – user2357112 Aug 03 '14 at 08:53
  • @user2357112 Yes the correct way to specify your desired ordering by composing the concept of "lexicographically-ordered tuples" with the concept of a tuple-element-specific "reverseorder" flag. However you end up implementing "reverseorder" will be cleaner than any attempt that begins with the clunky comparator abstraction. – Liron Shapira Aug 03 '14 at 14:40
  • @user2864740 My original question is what's the best approach for specifying a desired order relation on an array of elements in code. If you don't agree that keyfuncs are strictly superior to comparators, all you have to do is show the code for a single realistic counterexample. Your current answer, especially the incorrect (imprecise at best) statement about comparators being more "universal and flexible" and the mention of the rock-paper-scissors (which falls outside the documented intended behavior of array-sort functions because it's not a valid ordering relation) aren't helpful. – Liron Shapira Aug 03 '14 at 14:52
  • @LironShapira I've removed the bit about the RPS, I agree that bit was unnecessary. – user2864740 Aug 04 '14 at 00:56