5
>>> "spam" < "bacon"
False
>>> "spam" < "SPAM"
False
>>> "spam" < "spamalot"
True
>>> "Spam" < "eggs"
True

How are equal length strings compared? Why is "Spam" less than "eggs"? What if the strings are not the same length?

Karl Knechtel
user1827019

4 Answers

3

Lexicographically.

The first bytes of the two strings are compared. If the ordinal value of the first string's byte is less than that of the second string's, the first string is lesser; if it is greater, the first string is greater. If they are the same, the next pair of bytes is compared. If every pair ties and one string is longer, the shorter one is lesser.

>>> "a" < "zzz"
True
>>> "aaa" < "z"
True
>>> "b" < "a"
False
>>> "abc" < "abcd"
True
>>> "abcd" < "abce"
True
>>> "A" < "a"
True
>>> ord("A")
65
>>> ord("a")
97
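
The rule described above can be sketched as a small function. This is only an illustration, not how Python implements `<` internally; the name `lexicographic_less_than` is hypothetical, and the sketch assumes Python 3, where the items being compared are Unicode code points rather than bytes.

```python
def lexicographic_less_than(left, right):
    """Return True if `left` sorts before `right` (illustrative sketch only)."""
    # Walk both strings in step and stop at the first differing position.
    for left_char, right_char in zip(left, right):
        if left_char != right_char:
            # The first mismatch alone decides the result.
            return ord(left_char) < ord(right_char)
    # Every shared position tied, so the shorter string is the lesser one.
    return len(left) < len(right)


pairs = [("a", "zzz"), ("aaa", "z"), ("b", "a"),
         ("abc", "abcd"), ("abcd", "abce"), ("A", "a")]
results = [lexicographic_less_than(left, right) for left, right in pairs]
print(results)  # [True, True, False, True, True, True]
```

The results match what the built-in `<` gives for the same pairs in the session above.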
Mike Graham
2

Since A comes before a in the ASCII table, S in Spam is considered smaller than e in eggs.

>>> "A" < "a"
True
>>> "S" < "e"
True
>>> "S" < "eggs"
True

Note that string length is not what decides the comparison. Rather, the ordinal values of the bytes are compared starting from the first byte, as rightly pointed out by @MikeGraham in the comments below. As soon as a mismatch is found, the comparison stops and its result is returned, as in the last example. Length only matters when one string is a prefix of the other.

From the docs, Comparing Sequences and Other Types:

The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.

Also, further in the same paragraph:

Lexicographical ordering for strings uses the ASCII ordering for individual characters
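
One way to see the quoted rule in action is to turn each string into a list of ordinal values and compare the lists, since lists use the same lexicographical ordering. This is a rough sketch assuming Python 3; the helper name `ordinals` is made up for the illustration.

```python
def ordinals(text):
    """Return the ordinal value of each character in `text` (hypothetical helper)."""
    return [ord(ch) for ch in text]


print(ordinals("Spam"))  # [83, 112, 97, 109]
print(ordinals("eggs"))  # [101, 103, 103, 115]

examples = [("spam", "bacon"), ("spam", "SPAM"),
            ("spam", "spamalot"), ("Spam", "eggs")]
for left, right in examples:
    # Comparing the strings and comparing their ordinal lists give the same answer.
    print(left, right, left < right, ordinals(left) < ordinals(right))
```

For "Spam" and "eggs" the very first items, 83 and 101, already differ, so nothing after them is examined.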

Rohit Jain
  • Good point, but presumably it's not limited to the ASCII encoding. – Brian Cain Nov 15 '12 at 15:01
  • It depends entirely on the byte values in the str object (it's not inherently textual; it's just bytes). Python does not depend on any encoding for the `<` operation, though you might have used some encoding to make your str objects. – Mike Graham Nov 15 '12 at 15:12
  • @MikeGraham.. Well, you are right. Didn't go into that precision earlier. Updated answer now. thanks :) – Rohit Jain Nov 15 '12 at 15:21
2

Strings in Python are lexicographically ordered so that they can be logically sorted:

>>> print(sorted(['spam','bacon','SPAM','spamalot','Spam','eggs']))
['SPAM', 'Spam', 'bacon', 'eggs', 'spam', 'spamalot']

There are compromises with this, primarily with Unicode. For example, the letter é is sorted after the letter z:

>>> 'e' < 'z'
True
>>> 'é' < 'z'
False

Luckily, you can use a custom sort function, use locale, or use a subclass of string to have strings sorted any way you wish.
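
As a rough sketch of the custom sort function option, assuming Python 3: the helper `accent_insensitive_key` below is a hypothetical name, and stripping accents with `unicodedata` is just one simple approach, not proper locale-aware collation.

```python
import unicodedata


def accent_insensitive_key(text):
    """Hypothetical sort key: ignore accents and letter case."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# "\u00e9" is the letter é; its code point (233) is above that of "z" (122).
words = ["spam", "bacon", "SPAM", "spamalot", "Spam", "eggs", "\u00e9t\u00e9", "zebra"]

default_order = sorted(words)
custom_order = sorted(words, key=accent_insensitive_key)

print(default_order)  # the accented word lands after 'zebra'
print(custom_order)   # the accented word lands between 'eggs' and 'spam'
```

Because `sorted` is stable, words whose keys are equal ("spam", "SPAM", "Spam") keep their original relative order in `custom_order`.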

Community
1

It is a lexicographical comparison.

DonCallisto