import pandas as pd

dataframe = pd.DataFrame({
    'Data': ['The **ALI**1929 for 90 days but not 77731929 ',
             'For all **ALI**1952 28A 177945 ',
             'But the **ALI**1914 and **ALI**1903 1912'],
    'ID': [1, 2, 3],
})
   Data                                          ID
0  The **ALI**1929 for 90 days but not 77731929  1
1  For all **ALI**1952 28A 177945                2
2  But the **ALI**1914 and **ALI**1903 1912      3
My dataframe looks like what I have above. My goal is to replace any number at or under 1929 that is associated with **ALI** with the word OLDER. So **ALI**1929 would become **ALI**OLDER and **ALI**1903 would also become **ALI**OLDER, but **ALI**1952 would remain the same. From "How to extract certain length of numbers from a string in python?" I have tried:
dataframe['older'] = dataframe['Data'].str.replace(r'(?<!\d)(\d{3})(?!\d)', 'OLDER')
But this doesn't work too well for what I want. I would like something like this as output:
Data ID older
0 The **ALI**OLDER for 90 days but not 77731929
1 For all **ALI**1952 28A 177945
2 But the **ALI**OLDER and **ALI**OLDER 1912
How do I change my regex r'(?<!\d)(\d{3})(?!\d)' in str.replace to do so?
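One possible direction, offered as a minimal sketch rather than a definitive answer: a regex alone cannot easily express "at or under 1929", so the pattern can capture the number after the literal **ALI** prefix and a replacement function can do the numeric comparison. The names replace_older, ALI_NUMBER and CUTOFF are hypothetical, and the sketch assumes the number always follows **ALI** directly with nothing in between.

```python
import re

# Assumption: the number always follows the literal prefix **ALI** directly.
# The asterisks are escaped because * is a regex metacharacter.
ALI_NUMBER = re.compile(r'(\*\*ALI\*\*)(\d+)')
CUTOFF = 1929  # threshold taken from the question


def replace_older(text, cutoff=CUTOFF):
    """Replace numbers <= cutoff that directly follow **ALI** with OLDER."""
    def _swap(match):
        prefix, number = match.group(1), int(match.group(2))
        # Leave the match untouched when the number is above the cutoff.
        return prefix + 'OLDER' if number <= cutoff else match.group(0)
    return ALI_NUMBER.sub(_swap, text)


# The three sample strings from the question.
samples = [
    'The **ALI**1929 for 90 days but not 77731929 ',
    'For all **ALI**1952 28A 177945 ',
    'But the **ALI**1914 and **ALI**1903 1912',
]
for s in samples:
    print(replace_older(s))
```

With the dataframe from the question, this could be applied as dataframe['older'] = dataframe['Data'].apply(replace_older). Series.str.replace also accepts a callable replacement when regex=True is passed, so the same pattern and comparison could be used there instead.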