Word to digit function not working as expected using regex

Question

I am trying to write a function that converts number words to digits. It works for some test cases, but when I want the word "done" to become "d1" (that is, for the "one" inside it to be recognized as a number word), my test case fails.

import re

NUM_WORDS = {
    'zero':  '0',
    'one':   '1',
    'two':   '2',
    'three': '3',
    'four':  '4',
    'five':  '5',
    'six':   '6',
    'seven': '7',
    'eight': '8',
    'nine':  '9',
}

def word_to_digit(sentence):
    for word, digit in NUM_WORDS.items():
        sentence = re.sub(f"\\b{word}\\b", digit, sentence)

    return sentence

print(word_to_digit('one one one'))
# "1 1 1"

print(word_to_digit('done'))      # expected "d1", but this test case doesn't work

I thought that the word boundary anchor \b should work, but it's not working as expected.

I understand it is not a whole word. Would it be possible to use a certain regex expression to make this work? — SrdjaNo1, Aug 30 '23 at 12:26
This question is not answered by another one; each question is unique. Why do people close questions without giving any advice? — Franz Kurt, Sep 01 '23 at 13:40

score 2 · Answer 1 · answered Aug 30 '23 at 12:25

\b ensures that "done" is not matched: it anchors the pattern at word boundaries, so only number words that stand alone in your input string are replaced. That is exactly the opposite of what you want. If you want every occurrence of a number word to be replaced, even inside another word, remove the \b anchors and it will work.

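import re

# NUM_WORDS is the dictionary defined in the question.
def word_to_digit(sentence):
    for word, digit in NUM_WORDS.items():
        # No \b anchors, so number words inside other words are replaced too.
        sentence = re.sub(word, digit, sentence)

    return sentence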

print(word_to_digit('one one one'))
# 1 1 1

print(word_to_digit('done'))
# d1
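
As a side note, here is a minimal sketch of a single-pass variant, in case it helps. It assumes the same NUM_WORDS mapping as in the question; the names NUM_PATTERN and word_to_digit_single_pass are made up for illustration. It also shows why \b blocked the match in "done", and that dropping \b affects any word containing a number word, such as "money".

```python
import re

# Same mapping as NUM_WORDS in the question.
NUM_WORDS = {
    'zero': '0', 'one': '1', 'two': '2', 'three': '3', 'four': '4',
    'five': '5', 'six': '6', 'seven': '7', 'eight': '8', 'nine': '9',
}

# Hypothetical helper: one alternation (zero|one|two|...), compiled once.
NUM_PATTERN = re.compile('|'.join(re.escape(word) for word in NUM_WORDS))

def word_to_digit_single_pass(sentence):
    # Look up each matched number word in the mapping.
    return NUM_PATTERN.sub(lambda m: NUM_WORDS[m.group(0)], sentence)

# With \b there is no boundary between "d" and "o", so nothing matches.
print(re.search(r'\bone\b', 'done'))              # None

print(word_to_digit_single_pass('one one one'))   # 1 1 1
print(word_to_digit_single_pass('done'))          # d1
print(word_to_digit_single_pass('money'))         # m1y
```

Note that when number words overlap in the input (for example "twone"), a single left-to-right pass and the loop over the dictionary can pick different matches, so the two versions are not guaranteed to give identical output.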
