6

In Python 3, int(50) < '2' raises a TypeError, as well it should. In Python 2.x, however, int(50) < '2' returns True (the same holds for the other numeric types, but int exists in both Python 2 and Python 3). My question, then, has several parts:

  1. Why does Python 2.x (< 3?) allow this behavior?
    • (And who thought it was a good idea to allow this to begin with???)
  2. What does it mean that an int is less than a str?
    • Is it referring to ord / chr?
    • Is there some binary format which is less obvious?
  3. Is there a difference between '5' and u'5' in this regard?
  • SilentGhost
    cwallenpoole
    • Related question: http://stackoverflow.com/questions/3270680/how-does-python-compare-string-and-int – unutbu Nov 24 '10 at 12:49
    • Python doc -> http://docs.python.org/reference/expressions.html#comparisons – SubniC Nov 24 '10 at 12:51
    • With dictionaries, if you need a comparison function for implementing O(log N) search and you want to be able to mix key types, then the obvious answer is that you need comparisons to work for any combination of values. I wouldn't be surprised if they just figured that using `id` would be better in Python 3. Props to whoever finds the PEP (probably in the 3000 range) for this change. – Mike DeSimone Nov 24 '10 at 13:01
    • 3
      @Mike: Python's dictionaries use hash tables internally, not trees, so that's not the reason. – intgr Nov 24 '10 at 13:08
    • "Why"!?! It's allowed in Python 2 because the developers thought it was a good idea, and it turned out it wasn't, so it's removed in Python 3. That's why. :) – Lennart Regebro Dec 07 '10 at 09:53

    4 Answers

    8

    It works like this [1].

    >>> float() == long() == int() < dict() < list() < str() < tuple()
    True
    

    Numbers compare as less than containers. Numeric types are converted to a common type and compared by their numeric value. Containers of different types are compared alphabetically by their type names. [2]

    From the docs:

    CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.

    Objects of different non-numeric builtin types compare alphabetically by the name of their type, and numbers sort before everything else, so any int is less than any str.
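The ordering described above can be approximated in Python 3 with a sort key. This is only a sketch: py2_style_key is a hypothetical helper, not CPython's actual implementation, and it assumes that values sharing a non-numeric type can be compared with each other.

```python
from numbers import Real


def py2_style_key(value):
    """Hypothetical helper approximating CPython 2's mixed-type ordering."""
    if isinstance(value, Real):
        # Numbers sort before everything else and compare by numeric value.
        return (0, "", value)
    # Everything else is grouped by type name, then compared within its type.
    return (1, type(value).__name__, value)


mixed = ["2", (0,), 50, [1], 3.5, {"a": 1}]
ordered = sorted(mixed, key=py2_style_key)
print(ordered)
```

The numbers come out first in numeric order, followed by the dict, list, str and tuple values in alphabetical order of their type names.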

    1. I have no idea.
      • A drunken master.
    2. It means that a formal order has been introduced on the builtin types.
      • It's referring to an arbitrary order.
      • No.
    3. No. strings and unicode objects are considered the same for this purpose. Try it out.

    In response to the comment about long < int:

    >>> int < long
    True
    

    You probably meant values of those types though, in which case the numeric comparison applies.

    [1] This is all on Python 2.6.5.

    [2] Thanks to kRON for clearing this up for me. I'd never thought to compare a number to a dict before, and comparison of numbers is one of those things that's so obvious that it's easy to overlook.

    aaronasterling
    • 2
      ...Yeah, I mean, that's completely logical, isn't it? – Humphrey Bogart Nov 24 '10 at 12:52
    • Re: `No. strings and unicode objects are considered the same for this purpose. Try it out.` I had, but there is always the chance that my tests had missed something – cwallenpoole Nov 24 '10 at 13:09
    • 1
    • @kRON, Right, and I corrected it. But `(int < long)==False` isn't correct either. – aaronasterling Nov 24 '10 at 13:46
    • @kRON, Also, the general statement was _almost_ right in that it just has to be broken down into two categories where it's inapplicable to one and applies to the other. Namely, it's inapplicable to numeric types but applies perfectly well to the builtin containers. So I really wasn't _that_ wrong. – aaronasterling Nov 24 '10 at 13:49
    • Well, seriously now. Your original answer was way incorrect and your answer is still incorrect. The only reason why `float()==int()==long()` is because numbers are compared arithmetically, or did you expect this not to be true `float(0)==int(0)` and that has nothing to do with type comparisons, `(intfloat)`, so types are ordered arbitrarily in comparisons, but consistently. – Filip Dupanović Nov 24 '10 at 14:10
    • @kRON, I specifically say that numeric types compare based on numerical value. I didn't say that it had anything to do with type comparisons, you're the one that brought that up. Type comparison has _nothing_ to do with instance comparison. I don't know why you think that that's important. Also, thanks for pointing out that my answer was still wrong. I needed to extend it by one more sentence. – aaronasterling Nov 24 '10 at 14:16
    • You're revising too fast for me. Last revision was `True < dict() < list() < str() < tuple()`; It's completely arbitrary within *one* execution block. And you'll have to revise again because `0<0` is false, and that's what you get from arithmetic comparison. – Filip Dupanović Nov 24 '10 at 14:24
    • 1
      @kRON. What do you mean "last revision was `True < dict() < ...`?" I never put that up. Are you on drugs? Anybody can look at my edit history. In fact, I never compared `True` to any of those because it's an `int`. I've run many execution blocks and the alphabetical ordering of the container types is a pretty well known phenomenon. – aaronasterling Nov 24 '10 at 14:27
    • 1
      Sorry, missing my morning coffee. Well you know `float() – Filip Dupanović Nov 24 '10 at 14:34
    • But if it ever turns out you were incorrect, I **SWEAR** I'll be coming back :P – Filip Dupanović Nov 24 '10 at 14:35
    • @kRON I'm an addict myself. I totally understand. It's 4:38am here and I'm finishing my last cup before bed. :) FWIW, I tracked down the documentation on the ordering of instances and edited it into my post. – aaronasterling Nov 24 '10 at 14:39
    6

    The reason these comparisons are allowed is sorting. Python 2.x can sort lists containing mixed types, including strings and integers, and integers always appear first. Python 3.x does not allow this, for the exact reasons you pointed out.

    Python 2.x:

    >>> sorted([1, '1'])
    [1, '1']
    >>> sorted([1, '1', 2, '2'])
    [1, 2, '1', '2']
    

    Python 3.x:

    >>> sorted([1, '1'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unorderable types: str() < int()
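
In Python 3 the same grouping can still be requested explicitly through key=. A minimal sketch, assuming the list holds only ints and strs; ints_first is a hypothetical name chosen for illustration:

```python
def ints_first(value):
    # False sorts before True, so non-strings (here: ints) come first;
    # within each group the values are compared with each other as usual.
    return (isinstance(value, str), value)


result = sorted([1, '1', 2, '2'], key=ints_first)
print(result)
```

This reproduces the Python 2 result shown above, but the ordering rule is now spelled out by the caller instead of being an implicit language default.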
    
    intgr
    • 2
      uhm, it's sorting because it allows comparison, not the other way around. – SilentGhost Nov 24 '10 at 13:12
    • OP asked why does Python allow this. I'm saying that the reason, why these comparisons were allowed in the first place, is sorting. – intgr Nov 24 '10 at 13:15
    • 1
      a fair point that it might have motivated it in the beginning. Nowadays you can use key= of course to sort heterogeneous lists. – u0b34a0f6ae Nov 24 '10 at 13:32
    1

    (And who thought it was a good idea to allow this to begin with???)

    I can imagine that the reason might be to allow objects of different types to be stored in tree-like structures, which use comparisons internally.

    dmeister
    • Correct me if I'm wrong, but AFAIK Python doesn't have any built-in datatypes that use trees internally. dict, set, OrderedDict, etc are all implemented as hash tables. – intgr Nov 24 '10 at 13:10
    • 1
      As Python allows heterogeneous collections, it might have seemed easier to allow sorting them too. While Python 3 simplifies the comparison rules, conceptually I think it complicates things to allow sorting of one instance of a list, but not another. – Ben James Nov 24 '10 at 13:14
    1

    As Aaron said. Breaking it up into your points:

    1. Because it makes sort do something halfway usable where it otherwise would make no sense at all (mixed lists). It's not a good idea generally, but much in Python is designed for convenience over strictness.
    2. Ordered by type name. This means things of the same type group together, where they can be sorted. They should probably be grouped by type class, such as numbers together, but there's no proper type class framework. There may be a few more specific rules in there (there probably is one for numeric types); I'd have to check the source.
    3. One is a string and the other is unicode. They may have a direct comparison operation, but it's conceivable a non-comparable type would get grouped between them, causing a mess. I don't know if there's code to avoid this.

    So, it doesn't make sense in the general case, but occasionally it's helpful.

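    # Python 2 only: range() returns a list and print is a statement.
    from random import shuffle
    letters = list('abcdefgh')
    ints = range(8)
    both = ints + letters
    # Shuffle each list independently so the two sorts start from different orders.
    shuffle(ints)
    shuffle(letters)
    shuffle(both)
    print sorted(ints + letters)
    print sorted(both)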
    

    Both print the ints first, then the letters.

    As a rule, you don't want to mix types randomly within a program, and apparently Python 3 prevents it where Python 2 tries to make vague sense where none exists. You could still sort by lambda a,b: cmp(repr(a),repr(b)) (or something better) if you really want to, but it appears the language developers agreed it's an impractical default behaviour. Which of the two gives the least surprise probably varies, but a problem is a lot harder to detect under the Python 2 behaviour.
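Python 3 removed both cmp() and the comparison-function argument of sorted(), so the repr-based comparison mentioned above has to go through functools.cmp_to_key (or, more simply, key=repr). A sketch under that assumption, with cmp_by_repr as a hypothetical name:

```python
from functools import cmp_to_key


def cmp_by_repr(a, b):
    # Stand-in for Python 2's cmp(repr(a), repr(b)):
    # returns -1, 0 or 1 depending on how the two repr strings compare.
    ra, rb = repr(a), repr(b)
    return (ra > rb) - (ra < rb)


both = [3, 'b', 1, 'a', 2]
by_cmp = sorted(both, key=cmp_to_key(cmp_by_repr))
by_key = sorted(both, key=repr)  # equivalent and simpler
print(by_cmp)
```

Note that this ordering puts the strings before the ints, because the repr of a str starts with a quote character, which sorts before the digits. It is a different order from the Python 2 default, which is one reason "or something better" may be worth the effort.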

    Yann Vernier
    • Interesting vote shuffle on this one. It's true it didn't add much, but I'm amused it gets down and up votes at an equal rate. – Yann Vernier Nov 28 '10 at 19:38