8

I am trying to write a function to check whether a list is sorted (returning True or False). How can I avoid multiple variables pointing to the same thing?

def is_sorted(t):
    a = t
    a.sort()

When I do that, it sorts both a and t. How can I avoid this?

Johnny
  • 233
  • 3
  • 4
  • 6
  • 1
    "How can I avoid this?" By finding a better design. This is -- perhaps -- the slowest way to do this. The only thing which could be slower would be implementing your own version of `sort`. – S.Lott Jan 06 '11 at 21:52
  • There's something epistemological about the answers that involve sorting the list: it is completely unlikely that the program will have any use for a result different than `True` after the list has already been sorted. Only in very rare cases would the answer to the original `is_sorted()` query still be relevant. Besides, sorting an already sorted list is close enough to O(n), so Why bother querying, instead of just going ahead and calling `sort()` or `sorted()`? – Apalala Jan 07 '11 at 01:43

7 Answers7

9

Here is the O(n) way to do it

>>> from itertools import islice, izip
>>> def is_sorted(L):
...     return all(i<=j for i,j in izip(L, islice(L,1,None)))
... 
>>> is_sorted(range(50))
True
>>> is_sorted(range(50)+[20])
False

It shortcircuits, so if the list is unsorted right near the beginning it will be very fast

Here is a simple program to compare some of the alternatives

import random
import time
from itertools import islice, izip

def is_sorted1(L):  # 0.0006s
    return all(i<=j for i,j in izip(L, islice(L,1,None)))

def is_sorted2(L):  # 0.117s
    return all(L[i] < L[i+1] for i in range(len(L)-1) )

def is_sorted3(L):  # 2.344s
    return L == sorted(L)

def is_sorted4(L):  # 0.0002s
    return all(L[i] < L[i+1] for i in xrange(len(L)-1) )

A = [range(random.randrange(100000)) for i in range(100)]
for a in A:
    random.shuffle(a)

for f in is_sorted1, is_sorted2, is_sorted3, is_sorted4:
    s=time.time()
    for a in A:
        f(a)
    print time.time() - s
John La Rooy
  • 295,403
  • 53
  • 369
  • 502
  • Well, I doubt Johnny understands what this does ;p Maybe this helps: `izip(L, islice(L,1,None))` slides a window over `L`. It gives a sequence like `[(L[0],L[1]), (L[1],L[2]), ...]` – Jochen Ritzel Jan 06 '11 at 21:11
  • It may be just nit-picking, but indexing a Python list should be much cheaper than slicing and zipping it. – Apalala Jan 06 '11 at 21:14
  • 2
    @Apalala: "should be much cheaper"? Please use `timeit` and post a comparison showing what really happened, not what *should* happen. – S.Lott Jan 06 '11 at 21:50
  • @Apalala: indexing is faster using _CPython2.6_ on _this_ computer, but note that it was essential to use `xrange` intead of `range` – John La Rooy Jan 06 '11 at 22:19
  • @S.Lott One shouldn't need to measure everything. The performance characteristics of the Python interpreter and standard libraries are well documented, and Big O tells the rest. My using `range` instead of `xrange` was a novice's mistake (xrange has been deprecated in Python 3). – Apalala Jan 06 '11 at 23:05
  • 3
    @Apalala: `range` vs. `xrange` is well known. "indexing a Python list should be much cheaper than slicing and zipping it" is not as well known. Without actual numbers, it's -- in effect -- conjecture. With actual numbers it's theory + evidence == science. – S.Lott Jan 07 '11 at 00:45
  • @S.Lott Much better known is the fact that verifying if a sequence is sorted should be much cheaper than sorting it. ;-) – Apalala Jan 07 '11 at 18:56
6

EDIT: Please see this answer for the proper way to do it. I'll leave my answer here for posterity (what's the correct way of handling this situation?), but it should not be considered the best answer to the question, even if it is a correct answer to the question.


To answer your specific question, you can use copy.copy (or use the slice syntax [:]) to create a copy of the original list:

import copy

def is_sorted(t):
    a = copy.copy(t) # could also do: a = t[:]
    a.sort()
    return a == t

However, a better method would be to use the sorted function to return a sorted copy of the list:

def is_sorted(t):
    return sorted(t) == t

Or: is_sorted = lambda t: sorted(t) == t

Community
  • 1
  • 1
Alex Vidal
  • 4,080
  • 20
  • 23
  • `deepcopy` is overkill given the specs. `copy` or `[:]` will suffice as the items in the list are not being modified. – aaronasterling Jan 06 '11 at 20:21
  • @aaronasterling: Good point. I'll modify my answer to remove the deepcopy references. – Alex Vidal Jan 06 '11 at 20:22
  • 1
    Testing if a sequence is sorted is O(n) comparisons and O(0) swaps. Sorting for the sane test is O(n log(n)). – Apalala Jan 06 '11 at 20:31
  • 1
    @Apalala, While python code remains about 100 times slower than C code, it is often faster to abuse sort this way in practice because the log(n) doesn't catch the 100 until n is very large. If the list is very unsorted, a python loop should be much faster though – John La Rooy Jan 06 '11 at 20:44
  • @gnibbler A swap costs orders of magnitude more than a comparison, and the O(n) is only for totally sorted sequences. Theres a big constant that multiplies the n log(n) when sorting. The solution you gave in your answer is the correct one. – Apalala Jan 06 '11 at 21:12
3

by creating a copy of your list with something like this:

def is_sorted(t):  
    a = t[:]  # create a copy
    a.sort()
mouad
  • 67,571
  • 18
  • 114
  • 106
3

Credit for this minimal answer belongs to @gnibbler

is_sorted = all(q[i] < q[i+1] for i in xrange(len(q)-1) )
Apalala
  • 9,017
  • 3
  • 30
  • 48
  • 1
    this iterates the entire list even if the first element is larger than the second. You can use `all()` here instead of reduce and the code will be clearer too. `all(q[i] < q[i+1] for i in range(len(q)-1))` – John La Rooy Jan 06 '11 at 20:52
  • @gnibbler It doesn't iterate the whole list because it is a generator expression (not a list generator), and the `and_` operator evaluates in short-circuit fashion: it will consume the input iterator **while** it finds `True`. A solution with `all` is better. – Apalala Jan 06 '11 at 20:57
  • 1
    `reduce()` does not know that it can stop when it finds True, so the generator expression would need to complete to the end. – John La Rooy Jan 06 '11 at 21:04
  • +1: Don't fool around sorting and comparing. Just evaluate the necessary predicate. Much faster. And simpler, too. – S.Lott Jan 06 '11 at 21:46
  • @gnibbler You're right about `reduce()`. My answer had already been upvoted, so I corrected it. The comments and your answer contain the trace of the changes, except for my original `reduce(operator.and_, q[i] < q[i+1] for i in range(len(q)-1))` – Apalala Jan 06 '11 at 23:09
2

You can either make a copy of t and then sort, like so: a = t[:], or use sorted, which returns a new sorted list.

Alternatively, you could make use of the sortedlist structure from blist, then your list will always be sorted.

Seth
  • 45,033
  • 10
  • 85
  • 120
0

This is a very poor way, in terms of performance, to check that a list is ordered. You should instead iterate over it checking that each element is greater or equal to the previous.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • 1
    Heffeman, theoretically, you're right. However Python's sort implementation is so fast compared to actual python code that just comparing a list to a sorted version of itself is oftentimes faster. – aaronasterling Jan 06 '11 at 20:25
  • @aaron Point taken. There ought to be built-in function that does this that would avoid the overhead of creating a copy of the list just to check that it was monotone! Oh, it makes me feel dirty just thinking about it. – David Heffernan Jan 06 '11 at 20:28
  • It might be nice in some cases but I understand why it's not there. Most of the time, if one cares to know if a sequence is sorted, it's because one wants the sequence sorted in which case, one can just use `list.sort` or `sorted`. If it's a sort only if unsorted situation, then TimSort (the sorting algorithm used by Python) is O(n) on a sorted list which is what checking for monotonicity is to begin with. So again, just sort the list. – aaronasterling Jan 06 '11 at 20:38
0

This is the easiest way to do it... as far as I know

def is_sorted(t):
    a = list(t)
    a.sort()