0

I was solving a hackerrank challenge.(in Python 3) At one point I needed to sort a list and work with it. (2 <= len(array) <= 10^5) The list is containing unique integers. I used arr.sort(), but I was getting "Time limit exceeded" message Then I found if I use set(arr), it runs fast and passes all test cases. here's is my function for more clarification

def pairs(k, ar):
  ar.sort()
  c=0
  for i in ar:
    if i+k in ar:
      c+=1
  return c

This gets me "Time limit exceeded"

def pairs(k, ar):
  ar=set(ar)
  c=0
  for i in ar:
    if i+k in ar:
      c+=1
  return c

This one with set() runs faster. Why is this set() is faster than sort(). Don't they use same sorting algorithm?

  • 1
    Because the one with set is linear time, the one without it is quadratic – juanpa.arrivillaga Sep 15 '21 at 04:46
  • Until Python 3.6, sets were not sorted. Since then, they are. – Tim Roberts Sep 15 '21 at 04:49
  • 4
    @TimRoberts `set` objects are absolutely not sorted in Python, not even as an implementation detail in any of the major implementations – juanpa.arrivillaga Sep 15 '21 at 04:51
  • 4
    "Why is this set() is faster than sort(). " It's not either of these calls that is fast or slow here; it's *the `in` operator*. `set` doesn't use a sorting algorithm at all, and produces something completely different. That said, the reason that sorting the list gives you a much slower result than using a set is that *sorting the list doesn't help `in` work at all*; the `in` operator has no way to know that the list has been sorted, and therefore every element is still checked every time. – Karl Knechtel Sep 15 '21 at 04:51
  • @Loocid I'm new in python and I used set so many times to get sorted items and it worked. Also `set([54, 23]) = {23,54}` here in my ide – Rifat Bhuiyan Sep 15 '21 at 04:52
  • 1
    @TimRoberts are you perhaps thinking of dicts (which *preserve their insertion order* - not the same thing)? – Karl Knechtel Sep 15 '21 at 04:52
  • 1
    "Don't they use same sorting algorithm?" set objects don't use any sorting algorithm at all – juanpa.arrivillaga Sep 15 '21 at 04:52
  • 3
    @RifatBhuiyan yeah, that's completely incorrect. What you are seeing is an artifact of the fact that `int` objects (mostly) hash to themselves – juanpa.arrivillaga Sep 15 '21 at 04:53
  • @KarlKnechtel yea and important to not `set` objects dont' maintain insertion order – juanpa.arrivillaga Sep 15 '21 at 04:54
  • 2
    @KarlKnechtel: Note that a presorted `list` *could* gain some speed on containment checks (going from `O(n)` to `O(log n)`) by using the `bisect` module functions. But sorting (`O(n log n)`) plus repeated binary search (`O(log n)`) is almost certainly going to be much slower than conversion to `set` (`O(n)`) plus repeated `set` lookup (average case `O(1)`). But yeah, `in` on a sorted `list` and `in` on an unsorted `list` have identical performance (only way they'd differ in practice is if the thing being searched for was likely to be bigger or smaller than most elements of the `list`). – ShadowRanger Sep 15 '21 at 04:59
  • 2
    @RifatBhuiyan: `==` on `set`s compares them w/o regard to iteration order, so that would be true even if the `set`s in question had the same values but iterated in different orders. Try it with `{-2, -1}` and `{-1, -2}`. They iterate in different orders (because I chose two values that, as an implementation detail of CPython, have identical hashes, so the first seen occupies its desired bucket, while the second hops around to find an open one; other values do this too, they're just a little harder to find), but they're still equal to each other. `set` doesn't sort, and has no useful ordering. – ShadowRanger Sep 15 '21 at 05:03

0 Answers0