I want to use a regex to replace a substring of a matched string in a df series. I have looked through the documentation (e.g. HERE) and found a pattern that captures the specific type of string I want to match. However, the replacement does not replace only that substring.
I have cases such as
data
initthe problem
nationthe airline
radicthe groups
professionthe experience
the cat in the hat
In this particular case, I am interested in substituting "the" with "al" only in those cases where "the" is not a standalone word (a standalone word being one that is preceded and followed by whitespace).
I have tried the following solution:
import re
patt = re.compile(r'(?:[a-z])(the)')
df['data'].str.replace(patt, r'al', regex=True)
However, it also replaces the non-whitespace character preceding the "the".
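To make the problem reproducible, here is a minimal sketch that uses plain re.sub on a list instead of the df column. I am assuming the regex path of str.replace performs the same substitution on each element. The names samples, observed, and desired are only for illustration.

```python
import re

# Sample values copied from the "data" column above.
samples = [
    "initthe problem",
    "nationthe airline",
    "radicthe groups",
    "professionthe experience",
    "the cat in the hat",
]

# Same pattern as in my attempt. The non-capturing group still consumes
# the letter before "the", so that letter is part of the replaced match.
patt = re.compile(r'(?:[a-z])(the)')

# What I currently get: the preceding letter disappears as well.
observed = [patt.sub('al', s) for s in samples]

# What I would like to get: only "the" swapped for "al".
desired = [
    "inital problem",
    "national airline",
    "radical groups",
    "professional experience",
    "the cat in the hat",
]

for before, got, want in zip(samples, observed, desired):
    print(f"{before!r}: got {got!r}, want {want!r}")
```

The standalone "the" in the last row is left alone as intended, so the only issue is the extra character that gets consumed in the other rows.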
Any suggestions on what I can do to replace just the substring in those specific cases?