5

Hi I was solving this question on leetcode [Given a list of non-negative integers, arrange them such that they form the largest number.] I saw this solution. I'm unable to understand how is the class LargerNumKey is working? Also, what is the purpose lt . and what are the variables x and y

class LargerNumKey(str):
    def __lt__(x, y):
        return x+y > y+x
        
class Solution:
    def largestNumber(self, nums):
        largest_num = ''.join(sorted(map(str, nums), key=LargerNumKey))
        return '0' if largest_num[0] == '0' else largest_num
Atharv Sharma
  • 125
  • 2
  • 8
  • 3
    Here are the [Python docs for `__lt__`](https://docs.python.org/3/reference/datamodel.html#object.__lt__). – jkr Jul 29 '20 at 23:55
  • 2
    What don't you understand from the documentation and the code here? Please be specific enough to make this a viable Stack Overflow question. – Prune Jul 29 '20 at 23:56

3 Answers3

9

The __lt__ "dunder" method is what allows you to use the < less-than sign for an object. It might make more sense written as follows:

class LargerNumKey(str):
    def __lt__(self, other):
        return self+other > other+self

# This calls into LargerNumKey.__lt__(LargerNumKey('0'), LargerNumKey('1'))
LargerNumKey('0') < LargerNumKey('1')

Behind the scenes when str is subclassed, adding self+other actually generates a str object rather than a LargerNumKey object, so you don't have infinite recursion problems defining an inequality on a type in terms of its own inequality operator.

The reason this works is perhaps more interesting:

  1. The first fact we need is that for any positive integers we actually have (x>y) == (str(x)>str(y)), so when the custom __lt__ is operating it's actually asking whether the integers represented by those string concatenations are greater or less than each other.
  2. The second interesting fact is that the new inequality defined as such is actually transitive -- if s<t and t<u then s<u, so the sorted() method is able to place all the numbers in the correct order by just getting the correct answer for each possible pair.
Hans Musgrave
  • 6,613
  • 1
  • 18
  • 37
  • 2
    Yeah, *definitely* should be called `self` and `other`. Really helps to show what's going on here. – superb rain Jul 30 '20 at 00:20
  • can you please explain the reason 1 again. I'm unable to understand it. with input [1,6,4,2,6,9] and changing __lt__ to __gt__ gives o/p 124669. with __lt__ it is giving correct o/p 966421. why does this happens? – Atharv Sharma Jul 30 '20 at 04:45
  • Swapping lt with gt reverses the order if the implementation stays the same -- that's a general property of inequalities and not specific to python. Explaining point 1 in more detail, string comparison for two strings of digits of equal length checks one letter at a time if the ascii index of the digit in question is greater. Ascii was designed so that this corresponds exactly to whether two digits are greater. Hence, `'923' > '458'` is exactly the same as asking `923 > 458`. That's all I was trying to say with point (1). – Hans Musgrave Jul 30 '20 at 04:56
  • thnx for the explanation. the other thing which I am unable to understand is that why only __lt__ is used here. and how __lt__ and return self+other > other+self are working together? – Atharv Sharma Jul 30 '20 at 05:16
  • Imagine your inputs are `'3' and '4'`. When `sorted()` compares those, it calls `__lt__('3', '4')`, which checks if `'34' > '43'`. The answer is false, so `sorted()` concludes that `'3'` is NOT less than `'4'` and instead places `'4'` before `'3'`. I.e., `sorted(['3', '4']) == ['4', '3']` using the `LargerNumKey` subclass. The whole point is to for any pair of numbers figure out if they're bigger left-to-right or right-to-left. It takes a little more work to show that sorting ALL the inputs based on that idea gives the right answer, but that's also true. – Hans Musgrave Jul 30 '20 at 05:21
6

__lt__ is a magic method that lets you change the behavior of the < operator. sorted uses the < operator to compare values. So when python is comparing two values with < it checks to see if those objects have the magic method __lt__ defined. If they do, then it uses that method for the comparison. The variables x and y in the example are the two variables being compared. So if you had a line of code like x < y, then x and y would be passed as arguments to __lt__. Sorted presumably does have that line of code. But you don't have to call them 'x' and 'y', you can call them whatever you want. Often you will see them called self and other.

sorted works by comparing two items at a time. For example, let's call them x and y. So somewhere sorted has to compare them, probably with a line that looks like:

if x < y:

However, if you pass sorted a key argument, then it instead compares them more like this:

if key(x) < key(y):

Since the example passes LargerNumKey as the key, it ends up looking like this after python looks up key:

if LargerNumKey(x) < LargerNumKey(y):

When python then sees the < operator, it looks for the __lt__ method, and because it finds it turns the statement into basically:

if LargerNumKey(x).__lt__(LargerNumKey(y)):

Because __lt__ is a method on an object, the object itself becomes the first argument (x in this case). Also, because LargerNumKey is a subclass of str it behaves exactly like a regular string, except fo the __lt__ method that you overrode.

This is a useful technique when you want things to be sortable. You can use __lt__ to allow your objects to be sorted in any way you wish. And if the objects you are sorting have the __lt__ method defined, then you don't have to even pass key. But since we are working with different types of objects and don't want to use the default __lt__ method, we use key instead.

References:

Note that while my example pretends that sorted is python code, it is in fact usually c code. However, since python is "pseudo code that runs", I think it conveys the idea accurately.

Garrett Motzner
  • 3,021
  • 1
  • 13
  • 30
1

This'd also pass without __lt__:

from functools import cmp_to_key


class Solution:
    def largestNumber(self, nums):
        nums = list(map(str, nums))
        nums.sort(key=cmp_to_key(lambda a, b: 1 if a + b > b +
                                 a else -1 if a + b < b + a else 0), reverse=1)
        return str(int(''.join(nums)))


print(Solution().largestNumber([10, 2]))
print(Solution().largestNumber([3, 30, 34, 5, 9]))

Outputs

210
9534330

References

  • For additional details, you can see the Discussion Board. There are plenty of accepted solutions with a variety of languages and explanations, efficient algorithms, as well as asymptotic time/space complexity analysis1, 2 in there.
Emma
  • 27,428
  • 11
  • 44
  • 69
  • Actually, this is the correct way. The solution in the description will be a headache if there are sorted() needed in multiple places with a different implementation. – Ijaz Sep 26 '20 at 19:55
  • @Ijaz The question though wanted clarity, not improvement. However, could you explain a bit more why you prefer this solution? Pros/cons? – Garrett Motzner Sep 30 '21 at 19:45