
I just started toying around with regex. I've looked at Google's Python regex howto and Python's regex howto, as well as other similar questions like "Convert a string containing a roman numeral to integer equivalent" and "How do you match only valid roman numerals with a regular expression?", but I am still confused.

My code:

user = str(input("Input the Roman numeral: "))
characters = "I", "V" "X", "L", "C", "D", "M"
values = 1, 5, 10, 50, 100, 500, 1000

def numerals(match):
    return str(user(match.group(0)))

s = str(input("Input the Roman numeral: "))
regex = re.compile(r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?    I{0,3})\b')
print regex.sub(numerals, s)

The last two lines are from the first link. I don't fully understand the regex = re.compile(...) line and am wondering whether it actually converts the user's Roman numerals to integers. Thanks in advance.

AJ19

1 Answer


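There are some issues in your code. First, your regular expression captures more than it needs to. Parentheses create capturing groups, so functions such as re.findall return the captured pieces instead of the whole numeral. When you only need parentheses for grouping, use non-capturing groups, (?:...). The line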

regex = re.compile(r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')

creates a pattern that finds Roman numerals inside a text. Compiling is only worthwhile if you are going to reuse the pattern many times (for example, in a for loop). If you use it once, you don't need to compile it first. Your script also asks for the same input twice, once for user and once for s. Finally, numerals does not convert anything: it calls user as if it were a function, but user is the string from the first input, so the following line raises a TypeError as soon as the pattern finds a match:

print regex.sub(numerals, s)
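
To see the difference that capturing groups make, here is a small sketch comparing what re.findall returns for the two styles of pattern. The sample text and the variable names are made up for illustration; the patterns are the ones discussed in this answer, without the trailing \b.

```python
import re

sample = "CHAPTER XIV"  # hypothetical sample text

# Capturing groups: findall returns one tuple of group contents per match.
capturing = r'(?=\b[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})'
# Non-capturing groups: findall returns the whole matched text.
non_capturing = r'(?=\b[MDCLXVI]+\b)M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})'

with_groups = re.findall(capturing, sample)
without_groups = re.findall(non_capturing, sample)

print(with_groups)     # [('', 'X', 'IV')]
print(without_groups)  # ['XIV']
```

The first result is split into hundreds, tens and units pieces, which is rarely what you want when you are only looking for the numeral itself.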

Converting Roman numerals to decimal numbers takes more than a regular expression: the regex only finds the numerals, and you still need a small algorithm to compute their values. I made a few changes to your code just to point it in the right direction:

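import re

# Ask once for a phrase that may contain Roman numerals.
text = input("Input a phrase with some Roman numerals: ")
# The lookahead only lets a match start on a whole word made of Roman symbols;
# the non-capturing groups (?:...) make findall return the full numeral.
matches = re.findall(r'(?=\b[MDCLXVI]+\b)M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})', text)
for match in matches:
    print('Match: {}'.format(match))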

Output:

Input a phrase with some Roman numerals: I LIVE IN III PLACES
Match: I
Match: III
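
As for the conversion itself, something still has to compute the value of each numeral the regex finds. Below is a minimal sketch, assuming the usual subtractive rule (a symbol that is smaller than the one after it is subtracted, otherwise it is added). The names VALUES, ROMAN, roman_to_int and replace_numerals are my own and do not come from your code or from the linked answers; the pattern is the same one used in the code above.

```python
import re

# Value of each Roman symbol.
VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

# Compiled once because it is reused for every call to replace_numerals.
ROMAN = re.compile(
    r'(?=\b[MDCLXVI]+\b)M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})'
)

def roman_to_int(numeral):
    """Convert a well-formed Roman numeral string to an int."""
    total = 0
    previous = 0
    # Walk from right to left: a symbol smaller than the one to its
    # right is subtracted (the I in IV), otherwise it is added.
    for symbol in reversed(numeral):
        value = VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total

def replace_numerals(text):
    """Replace every Roman numeral found in text with its decimal value."""
    return ROMAN.sub(lambda m: str(roman_to_int(m.group(0))), text)

print(replace_numerals("I LIVE IN III PLACES"))  # 1 LIVE IN 3 PLACES
```

This is roughly what your numerals callback was trying to do: re.sub passes each match object to the function and uses the returned string as the replacement. Note that the pattern does not reject malformed numerals such as IIII (it would match only the leading III), so treat this as a starting point.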