I am trying to clean some data that includes text like '6cm*8cm', '6cmx8cm', and '6*8'. I want to normalize these so that they all have the same form. Note that the numbers vary, so the data may also contain '3cm*4cm', etc.
# Input strings
strings = [
    "6cm*8cm",
    "12mmx15mm",
    "Device stemmer 2mm*8mm",
    "Device stemming 2mmx8mm",
]
# My desired output would be:
desired_strings = [
    "6*8",
    "12*15",
    "Device stemmer 2*8",
    "Device stemming 2*8",
]
I am using Python's 're' module. My preference is to convert them to a simple '6*8' (i.e., number*number). Note that some entries contain additional words, such as 'Device stemmer 2mm*8mm', and I do not want to change those other words.
Is there a Pythonic way to use a regex to handle all the possible combinations of numbers and units paired with each other?
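
For reference, here is a minimal sketch of one possible approach using 're.sub' with two capture groups. It rests on assumptions that go beyond the examples above: the unit list (mm, cm, m, in) is a guess and would need extending for other data, decimals such as '2.5cm' are allowed, and the separator is either '*' or 'x'. The names UNIT, DIMENSION, and normalize_dimensions are made up for illustration.

```python
import re

# Assumption: only these units occur; extend the alternation as needed.
UNIT = r"(?:mm|cm|m|in)?"

# number[unit] (* or x) number[unit]; the numbers are captured, the units are not.
# The trailing \b prevents a match when the second number runs into other letters.
DIMENSION = re.compile(
    rf"(\d+(?:\.\d+)?){UNIT}\s*[*x]\s*(\d+(?:\.\d+)?){UNIT}\b",
    re.IGNORECASE,
)


def normalize_dimensions(text):
    """Rewrite every 'number[unit] sep number[unit]' in text as 'number*number'."""
    return DIMENSION.sub(r"\1*\2", text)


strings = [
    "6cm*8cm",
    "12mmx15mm",
    "Device stemmer 2mm*8mm",
    "Device stemming 2mmx8mm",
]

cleaned = [normalize_dimensions(s) for s in strings]
print(cleaned)
# ['6*8', '12*15', 'Device stemmer 2*8', 'Device stemming 2*8']
```

Because the pattern requires a digit on both sides of the separator, words such as 'stemmer' or 'stemming' are left alone even though they contain 'mm'. An explicit unit list is used instead of a generic letter class so that ordinary words following a number are not mistaken for units.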