5

Let's say I have a comparator that uses both values of the comparison to decide a sort.

For example, in this problem we have to use a comparator function that uses both elements:

def comparator(a, b): 
    ab = str(a) + str(b) 
    ba = str(b) + str(a) 
    return ((int(ba) > int(ab)) - (int(ba) < int(ab))) 

Can this be written in the key=lambda x: ... format when sorting?

I'm aware that a cmp_to_key function exists to convert cmp functions to key functions. My question is whether we can write it as a key function without having to convert it that way.

kaya3
  • 47,440
  • 4
  • 68
  • 97
m0etaz
  • 953
  • 1
  • 8
  • 21
  • 1
    That's all `cmp_to_key` does: it doesn't really "convert" anything; it just defines a new class that implements all of the comparison operators in terms of the given comparator. – chepner Feb 10 '20 at 21:06
  • Well, I believe you can convert this code into a lambda function, but the code would be nowhere as readable. – Anis R. Feb 10 '20 at 21:18
  • @GreenCloakGuy No, because the only answer is specific to python2, and this question specifies python3 (where `sorted` no longer has a `cmp` argument). – ekhumoro Feb 10 '20 at 21:48
  • This one, then? https://stackoverflow.com/questions/2531952/how-to-use-a-custom-comparison-function-in-python-3 – Green Cloak Guy Feb 10 '20 at 21:48
  • @GreenCloakGuy There's only one answer shown there that doesn't use `cmp_to_key`, and that can't be used to solve the OPs problem. – ekhumoro Feb 10 '20 at 21:58
  • Does this answer your question? [Sort a list to form the largest possible number](https://stackoverflow.com/questions/30140796/sort-a-list-to-form-the-largest-possible-number) – Kelly Bundy Feb 11 '20 at 00:20
  • If `cmp` cannot be replaced by `key`, then why are they deprecated in python3. Why is it hard to custom sort in python as compared to other languages. – Vipul Apr 17 '22 at 16:03

1 Answers1

3

Since you've ruled out solutions that work like functools.cmp_to_key (i.e. declaring a class with dunder comparison methods), I infer that you want a way to do this where the key function only uses built-in types.

The problem is that logically, the key for a number like 25 is the infinite sequence (2, 5, 2, 5, ...). The reason you need an infinite sequence is that you could always come across a number like 252525252525253, where the result of the comparison depends on the last digit because the rest of the digits are that repeated sequence from the other number.

If you have a bound on the size of the input numbers (say we know they're all at most 10 digits), then the sequence only needs to be repeated up to 10 digits length, and then lexicographic comparison (e.g. of strings or tuples) will work:

def key_func(n, digits=10):
    s = str(n)
    return (s * digits)[:digits]

As Stefan Pochmann points out, however, there is a built-in type Fraction which can be used to represent an infinite repeating sequence of digits: for example, (2, 5, 2, 5, ...) is represented as Fraction(25, 99) since its decimal expansion is 0.2525.... Comparing fractions is equivalent to comparing their decimal expansions lexicographically:

from fractions import Fraction

def key_func(n):
    k = len(str(n))
    return Fraction(n, 10**k - 1)

Examples (same for both key functions):

>>> sorted([54, 546, 548, 60], key=key_func)
[54, 546, 548, 60]
>>> sorted([1, 34, 3, 98, 9, 76, 45, 4], key=key_func)
[1, 3, 34, 4, 45, 76, 98, 9]
>>> sorted([25, 252525251], key=key_func)
[252525251, 25]
>>> sorted([25, 252525253], key=key_func)
[25, 252525253]

The resulting lists should be joined in reverse order to produce the largest possible number by concatenation, so e.g. [54, 546, 548, 60] maps to 6054854654.

kaya3
  • 47,440
  • 4
  • 68
  • 97
  • You could pre-compute the default argument for `digits` with `max(lst, key=lambda i: len(str(i)))`. – ekhumoro Feb 11 '20 at 00:01
  • 1
    You can easily represent that infinite sequence as `Fraction(25, 99)`. (Only [partially](https://stackoverflow.com/questions/30140796/sort-a-list-to-form-the-largest-possible-number#comment48455555_30152085) my idea, though) – Stefan Pochmann Feb 11 '20 at 00:26
  • @StefanPochmann Interesting; I had considered `[2, [5, [...]]]` with a self-reference but that overflows the stack if you compare two structurally-equal but different-identity lists. Very clever to use fractions. – kaya3 Feb 11 '20 at 00:31
  • "The resulting lists should be joined in reverse order", you mean like `sorted(..., reverse=True)`? You might as well add it to your code snippet. – Tadhg McDonald-Jensen Feb 11 '20 at 01:05
  • The fractions solution is cute, but it doesn't answer the original question, which was: "Can any comparator function be written in key format?". So the OP was asking about the **general case**, and then goes on to give *one possible example* to illustrate it. The subsequent edits don't properly reflect the OPs original intention. So I think it would be good to explain exactly why, in general, key functions don't always work well as comparators. – ekhumoro Feb 11 '20 at 01:11
  • @ekhumoro I am confused; in the comments under the question, a duplicate was proposed, and you commented *"There's only one answer shown there that doesn't use cmp_to_key, and that can't be used to solve the OPs problem"*. So I edited to clarify your point about it not being a duplicate because a solution to the specific problem is required. As I see it, the question was not clearly asking about the general case in the first place anyway, but we can see what the OP says; I'm happy for my edit to be reverted if I missed the point. – kaya3 Feb 11 '20 at 01:19
  • @kaya3 Neither of those dups can be used to solve the OPs problem, because they are both specific to one particular use-case. I may have read the OP wrong (in which case this is all moot), but I think they are trying to understand why key functions can't always be used as comparators. Key functions work by transforming their inputs into proxy objects that compare in the desired way. There are many built-in types (such as `Fraction`) that can act as proxies for specific use-cases. But for the general case, a *universal proxy* is required - and `cmp_to_key` provides exactly that. – ekhumoro Feb 11 '20 at 02:19