Note: I use >>> to demonstrate actual Python code and ?-> to demonstrate a hypothetical shell.
There are a few possible scenarios for how this could be parsed implicitly:
3 < "two"
Python 2 defines an ordering across all objects so that a list of completely arbitrary objects can still be sorted, and under that ordering every str compares as greater than any number:
>>> 3 < "two" #this is the actual result in python 2
True
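For contrast, Python 3 no longer orders unrelated types, so the same comparison raises an error instead of returning an arbitrary answer. The snippet below is a minimal sketch of that behaviour; the helper name compare_int_to_str is made up for illustration, and the exact wording of the error message differs between Python 3 versions.

```python
def compare_int_to_str():
    """Try 3 < "two" and return the error message, or None if no error occurred."""
    try:
        3 < "two"  # int and str have no defined ordering in Python 3
    except TypeError as exc:
        return str(exc)
    return None


# On Python 3 this prints the TypeError message rather than True or False.
print(compare_int_to_str())
```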
An alternative is to convert the number to the equivalent word string, but then the two are compared alphabetically, which gives the wrong answer:
>>> "three" < "two"
True
A third way is to attempt to parse the string into a number, but since there are so many different notations and languages that a number can be written in, it is nearly impossible to get it right every time (see this question).
Let's say we implemented this in Python for English:
?-> 1 < "two"
True
?-> 1 < "cent" #100 in french
(Traceback)
...
This is not much good, since many programmers may not speak English, and building a number-word parser into the language for every possible human language is all but impossible. It would be especially confusing if 1000 > "cent" evaluated as true when you meant the English word "cent".
Now let's pretend that we have implemented the above-mentioned parser for English and the developers decide to discriminate against all languages other than English. How would strings compare to each other?
If the behaviour of comparing strings in Python were unchanged, it would create huge inconsistencies in comparisons:
>>> "five" < "three"
True
?-> "three" == 3
True
?-> "five" > 3 == "three"
True
?-> "five" < "three" == 3
True
?-> "ONE" == "one"
False
?-> "one" == 1 == "ONE"
True
Or let's say you tried to convert both strings to numbers and compared them as numbers. Then sorting strings would break:
?-> words = "please give me five apples".split()
?-> words.sort()
?-> words
['five', 'apples', 'give', 'me', 'please']
So basically, any way you look at it, adding this functionality implicitly would completely break a lot of other perfectly good functionality.
Edit
I was curious how the sorting would actually work, so I created a class that does this kind of comparison:
from functools import total_ordering

@total_ordering
class number_word:
    # maps 0..10 to their English words, and the words back to the numbers
    key = dict(enumerate(
        "zero one two three four five six seven eight nine ten".split()
    ))
    key.update({v: k for k, v in key.items()})

    def __init__(self, value):
        alt = self.key.get(value, None)
        if isinstance(value, str):
            self.word = value
            self.num = alt
        elif isinstance(value, int):
            self.num = value
            self.word = alt
        else:
            raise TypeError("must be str or int")

    def __repr__(self):
        return "nw(%r)" % self.word

    def __eq__(self, other):
        if not isinstance(other, number_word):
            other = number_word(other)
        if self.num is None and other.num is None:
            # neither is a valid number, compare as strings
            return self.word == other.word
        else:
            return self.num == other.num

    def __lt__(self, other):
        if not isinstance(other, number_word):
            other = number_word(other)
        if self.num is None or other.num is None:
            # at least one is not a number word, fall back to string order
            return self.word < other.word
        else:
            return self.num < other.num
so that number_word(2) < number_word("five") will evaluate as true. Now take a look at the sorting of strings:
words = "range(1,6) goes from one to five".split()
correct = sorted(words)
num_sort = sorted(words,key=number_word)
backward = sorted(words,key=number_word, reverse=True)
print(correct)
print(num_sort)
print(backward[::-1])
Theoretically all three should be the same, especially num_sort == backward[::-1], but this is the result:
['five', 'from', 'goes', 'one', 'range(1,6)', 'to']
['from', 'goes', 'one', 'five', 'range(1,6)', 'to']
['one', 'five', 'from', 'goes', 'range(1,6)', 'to']
So yes, it does break string sorting.
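One way to see why no sort can give a consistent answer here is that the mixed comparison contains a cycle. The sketch below restates the same rule as a plain function; hybrid_lt and WORD_TO_NUM are hypothetical names used only for this illustration, not part of the class above.

```python
# Hypothetical lookup table, mirroring the small English word list used above.
WORD_TO_NUM = {
    word: number
    for number, word in enumerate(
        "zero one two three four five six seven eight nine ten".split()
    )
}


def hybrid_lt(a, b):
    """Compare numerically when both strings are number words, else as plain strings."""
    if a in WORD_TO_NUM and b in WORD_TO_NUM:
        return WORD_TO_NUM[a] < WORD_TO_NUM[b]
    return a < b


# Each comparison holds on its own, yet together they form a loop:
# one < five < from < one
print(hybrid_lt("one", "five"))   # numeric comparison: 1 < 5
print(hybrid_lt("five", "from"))  # string comparison
print(hybrid_lt("from", "one"))   # string comparison
```

Since all three comparisons are true, there is no arrangement of "one", "five" and "from" that satisfies every pair, so the result depends on which pairs the sort algorithm happens to compare.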