
I am trying to write a function that converts number words to digits. It works for some test cases, but when I try to convert the word "done" to "d1" (so that the "one" inside it is recognized as a digit), my test case fails.

import re

NUM_WORDS = {
    'zero':  '0',
    'one':   '1',
    'two':   '2',
    'three': '3',
    'four':  '4',
    'five':  '5',
    'six':   '6',
    'seven': '7',
    'eight': '8',
    'nine':  '9',
}

def word_to_digit(sentence):
    for word, digit in NUM_WORDS.items():
        sentence = re.sub(f"\\b{word}\\b", digit, sentence)

    return sentence

print(word_to_digit('one one one'))
# 1 1 1

print(word_to_digit('done'))      # "d1" This test case doesn't work

I thought that the word boundary anchor \b should work, but it's not working as expected.

SrdjaNo1

1 Answer


\b matches only at word boundaries, so it ensures that the "one" in "done" is not recognized: the pattern finds only number words that stand alone in your input string. That is exactly the opposite of what you want. If you want number words to be replaced wherever they occur, even inside other words, remove the \b anchors and it will work.

def word_to_digit(sentence):
    for word, digit in NUM_WORDS.items():
        sentence = re.sub(word, digit, sentence)

    return sentence

print(word_to_digit('one one one'))
# 1 1 1

print(word_to_digit('done'))
# d1
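
To see why the anchored pattern skips "done", here is a minimal sketch: \b needs a transition between a word character and a non-word character, and there is none between "d" and "o". The sketch also shows a single-pass variant that builds one alternation pattern from the dictionary keys. The names NUM_PATTERN and word_to_digit_anywhere are made up for this illustration and are not part of the question's code.

```python
import re

NUM_WORDS = {
    'zero': '0', 'one': '1', 'two': '2', 'three': '3', 'four': '4',
    'five': '5', 'six': '6', 'seven': '7', 'eight': '8', 'nine': '9',
}

# \b requires a word/non-word transition; "d" followed by "o" is not one,
# so the anchored pattern finds nothing inside "done".
assert re.search(r"\bone\b", "done") is None
# Without the anchors, the substring is found.
assert re.search(r"one", "done") is not None

# Hypothetical single-pass variant: one pattern that matches any number word.
NUM_PATTERN = re.compile("|".join(map(re.escape, NUM_WORDS)))

def word_to_digit_anywhere(sentence):
    # Look up each matched number word in the dictionary.
    return NUM_PATTERN.sub(lambda m: NUM_WORDS[m.group(0)], sentence)

print(word_to_digit_anywhere('one one one'))  # 1 1 1
print(word_to_digit_anywhere('done'))         # d1
print(word_to_digit_anywhere('someone'))      # some1
```

Note that dropping the anchors also rewrites ordinary words that happen to contain a number word, as the "someone" case shows. The single-pass version resolves overlapping matches left to right, which may differ from the loop above on inputs where number words overlap.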
Patrick F