Why can't Python understand integers in word form?

Question

I wasn't able to find an answer to my question anywhere, and I am still fairly new to Python. The aim of this question is mainly to find out how Python works and what its limits are. This answer provides a module that converts an integer to its word form. However, if I run code such as this, without any module that does the reverse of the one in the link,

a = "five"
b = 2
if b < a:
    print("a is higher than b")

I receive the TypeError: unorderable types: int() < str().

So, why won't Python recognize this string as a number in the word form? Is it something along the lines of "Python has not been built to recognize word forms of numbers"?

Yes: *"Python has not been built to recognize word forms of numbers"* — Alex K., Mar 24 '16 at 12:18
[The Zen of Python](https://www.python.org/dev/peps/pep-0020/): Special cases aren't special enough to break the rules. In the face of ambiguity, refuse the temptation to guess. There should be one-- and preferably only one --obvious way to do it. Explicit is better than implicit. — Tadhg McDonald-Jensen, Mar 24 '16 at 14:02

score 1 · Accepted Answer · edited May 23 '17 at 10:28

Note: I use >>> to show actual Python code and ?-> to show a hypothetical shell.

There are a few possible scenarios for how this could be parsed implicitly:

3 < "two"

Python 2 defines an ordering across all objects so that a list of completely arbitrary objects can still be sorted. Under that ordering every str compares as greater than any int:

 >>> 3 < "two" #this is the actual result in python 2
 True
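Python 3 dropped that arbitrary cross-type ordering, which is why the code in the question raises TypeError. A minimal sketch of the difference (try_compare is a hypothetical helper written only for this illustration):

```python
# Python 3 refuses to order an int against a str instead of guessing.
def try_compare(left, right):
    """Return left < right, or the name of the exception that was raised."""
    try:
        return left < right
    except TypeError as exc:
        return type(exc).__name__

print(try_compare(3, "two"))   # ordering int against str raises TypeError
print(try_compare(3, 2))       # ordinary int comparison still works
print(3 == "three")            # equality is still defined, it just never matches
```

Only the ordering operators raise; == between an int and a str simply evaluates to False.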

An alternative is to convert the number to its equivalent word string, but then the two are compared alphabetically:

 >>> "three" < "two"
 True
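The reason "three" sorts before "two" is that strings are compared character by character, and the first differing character decides the result. A small sketch that finds that character (the variable names are only for illustration):

```python
left, right = "three", "two"

# Find the first position where the two strings differ.
for index, (a, b) in enumerate(zip(left, right)):
    if a != b:
        first_difference = (index, a, b)
        break

print(first_difference)      # (1, 'h', 'w')
print(ord("h"), ord("w"))    # 104 119
print(left < right)          # True, because 'h' sorts before 'w'
```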

A third way is to try to parse the string into a number, but there are so many notations and languages a number can be written in that it is nearly impossible to get right every time (see this question).

Let's say we implemented this in Python for English:

?-> 1 < "two"
True
?-> 1 < "cent" #100 in french
(Traceback)
   ...

This is not much good, since many programmers do not speak English, and building a number-word parser into the language for every possible language is practically impossible. It would be especially confusing if 1000 > "cent" evaluated as True when you meant the English word "cent".

Now let's pretend that we have implemented the above-mentioned parser for English and the developers decide to discriminate against all languages other than English. How would strings compare to each other?

If the behaviour of comparing strings in Python were left unchanged, it would create huge inconsistencies in comparisons:

>>> "five" < "three"
True
?-> "three" == 3
True
?-> "five" > 3 == "three"
True
?-> "five" < "three" == 3
True
?-> "ONE" == "one"
False
?-> "one" == 1 == "ONE"
True
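The chained expressions above follow Python's normal rule for chained comparisons: a > b == c means (a > b) and (b == c), with b evaluated only once. A quick check with ordinary integers (the values are chosen purely for illustration):

```python
a, b, c = 5, 3, 3

chained = a > b == c              # how Python reads a chain
expanded = (a > b) and (b == c)   # the equivalent spelled out
grouped = (a > b) == c            # NOT equivalent: compares True with 3

print(chained, expanded)   # True True
print(grouped)             # False
```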

Or let's say both strings were converted to numbers and compared as numbers; then sorting strings would break:

?-> words = "please give me five apples".split()
?-> words.sort()
?-> words
['five', 'apples', 'give', 'me', 'please']

So basically, any way you look at it, adding this functionality implicitly would break a lot of other perfectly good functionality.
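For contrast, an explicit conversion leaves ordinary string comparison untouched. This is only a minimal sketch: WORD_TO_INT and to_int are hypothetical names, and the table only knows the English words zero to ten.

```python
# Hypothetical lookup table covering only the English words zero..ten.
WORD_TO_INT = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten".split())}

def to_int(value):
    """Return value as an int; accepts an int or a known English number word."""
    if isinstance(value, int):
        return value
    try:
        return WORD_TO_INT[value.lower()]
    except KeyError:
        raise ValueError("not a recognised number word: %r" % (value,)) from None

a = "five"
b = 2
if b < to_int(a):
    print("a is higher than b")
```

Because the conversion is requested by name, "five" < "three" keeps its usual alphabetical meaning everywhere else.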

Edit

I was curious how the sorting would actually work, so I created a class that does this kind of comparison:

from functools import total_ordering

@total_ordering
class number_word:
    key = dict(enumerate(
               ("zero one two three four five six seven eight nine ten".split())
              ))
    key.update({v:k for k,v in key.items()})

    def __init__(self,value):
        alt = self.key.get(value,None)
        if isinstance(value,str):
            self.word = value
            self.num = alt

        elif isinstance(value,int):
            self.num = value
            self.word = alt
        else:
            raise TypeError("must be str or int")

    def __repr__(self):
        return "nw(%r)"%self.word

    def __eq__(self,other):
        if not isinstance(other,number_word):
            other = number_word(other)  #wrap a raw str/int before comparing

        if self.num is None and other.num is None:
            #neither are valid numbers, compare as strings
            return self.word == other.word
        else:
            return self.num == other.num

    def __lt__(self,other):
        if not isinstance(other,number_word):
            other = number_word(other)  #wrap a raw str/int before comparing

        if self.num is None or other.num is None:
            #at least one is not a valid number, compare as strings
            return self.word < other.word
        else:
            return self.num < other.num

With this, number_word(2) < number_word("five") evaluates as True. Now take a look at the sorting of strings:

words = "range(1,6) goes from  one to five".split()
correct = sorted(words)
num_sort = sorted(words,key=number_word)
backward = sorted(words,key=number_word, reverse=True)

print(correct)
print(num_sort)
print(backward[::-1])

Theoretically all three should be the same, especially num_sort == backward[::-1], but this is the result:

['five', 'from', 'goes', 'one', 'range(1,6)', 'to']
['from', 'goes', 'one', 'five', 'range(1,6)', 'to']
['one', 'five', 'from', 'goes', 'range(1,6)', 'to']

So yes, it does break string sorting.
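One way to see why: sorting relies on "less than" being transitive, and the mixed rule is not. The sketch below uses a hypothetical mixed_less_than function that mirrors the __lt__ rule of the class above (numeric when both are number words, alphabetical otherwise); the names and the lookup table are only for illustration.

```python
# Hypothetical lookup table covering only the English words zero..ten.
WORD_TO_INT = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten".split())}

def mixed_less_than(left, right):
    """Compare numerically if both are number words, alphabetically otherwise."""
    left_num = WORD_TO_INT.get(left)
    right_num = WORD_TO_INT.get(right)
    if left_num is None or right_num is None:
        return left < right
    return left_num < right_num

# Each pair claims "smaller than", yet the chain loops back to its start.
cycle = [("one", "five"), ("five", "from"), ("from", "one")]
for left, right in cycle:
    print(left, "<", right, "->", mixed_less_than(left, right))
```

All three comparisons come out True, so no arrangement of "one", "five" and "from" can satisfy every pair at once, and the order a sort produces depends on which pairs it happens to compare.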
