Replace specific substring in match python

Question

I want to use a regex to replace a substring of a matched string in a DataFrame series. I have looked through the documentation and found a pattern that captures the specific type of string I want to match. However, the replacement does not replace only the substring.

I have cases such as

data
initthe problem
nationthe airline
radicthe groups
professionthe experience
the cat in the hat

In this particular case, I am interested in substituting "the" with "al" in those cases where "the" is not a standalone word (a standalone word being one that is preceded and followed by whitespace).

I have tried the following solution:

import re

patt = re.compile(r'(?:[a-z])(the)')
df['data'].str.replace(patt, r'al')

However, it also replaces the non-whitespace character preceding the "the".
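The behaviour can be reproduced with plain `re`, outside pandas. The snippet below is only a minimal sketch using one of the sample strings: it shows that the non-capturing group `(?:[a-z])` still consumes the letter, so that letter is part of the overall match and is replaced along with "the".

```python
import re

# Pattern from the question: (?:...) does not capture, but it still consumes
# the letter it matches, so the letter belongs to the overall match.
patt = re.compile(r'(?:[a-z])(the)')

m = patt.search('nationthe airline')
whole_match = m.group(0)   # the full match, including the preceding letter
captured = m.group(1)      # only the text inside the capturing group

replaced = patt.sub('al', 'nationthe airline')

print(whole_match)  # nthe
print(captured)     # the
print(replaced)     # natioal airline
```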

Any suggestions on what I can do to replace only the substring in those specific cases?

But `inithe` will turn into `inial`, I guess you need `initial`? Even if you fix it to `df['data'].str.replace(r'(?<=[a-z])the', r'al')` — Wiktor Stribiżew, Oct 08 '18 at 10:34

Tim Biegeleisen · Accepted Answer · 2018-10-08T10:37:16.023

Try using a lookbehind, which asserts that a lowercase letter precedes "the" but does not actually consume that letter:

import re

text = "data\ninitthe problem\nnationthe airline\nradicthe groups\nprofessionthe experience\nthe cat in the hat"

# (?<=[a-z]) asserts a preceding lowercase letter without making it part of the match
output = re.sub(r'(?<=[a-z])the', 'al', text)
print(output)

data
inital problem
national airline
radical groups
professional experience
the cat in the hat
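Since the question is about a DataFrame column, the same pattern can presumably be passed to `Series.str.replace`. The following is a sketch rather than a tested recipe for any particular pandas version: the DataFrame is rebuilt here from the sample data, and `regex=True` is passed explicitly because newer pandas releases treat the pattern as a literal string by default. The second call shows an alternative that avoids the lookbehind by capturing the preceding letter and putting it back with `\1`.

```python
import pandas as pd

# Sample data from the question, rebuilt here for illustration
df = pd.DataFrame({'data': [
    'initthe problem',
    'nationthe airline',
    'radicthe groups',
    'professionthe experience',
    'the cat in the hat',
]})

# Lookbehind: the preceding letter is checked but is not part of the match
with_lookbehind = df['data'].str.replace(r'(?<=[a-z])the', 'al', regex=True)

# Alternative: capture the preceding letter and re-insert it via \1
with_backref = df['data'].str.replace(r'([a-z])the', r'\1al', regex=True)

print(with_lookbehind.tolist())
```

Both calls return a new Series; assign the result back to `df['data']` to keep it.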

edited Oct 08 '18 at 10:37

answered Oct 08 '18 at 10:34

Though it is what OP tries to use, the result will probably not be "final" since `inithe` will turn into `inial`. – Wiktor Stribiżew Oct 08 '18 at 10:36
@WiktorStribiżew I interpret this as bad sample data, not a bad regex solution. – Tim Biegeleisen Oct 08 '18 at 10:36
Well, another dupe anyway. – Wiktor Stribiżew Oct 08 '18 at 10:36
Yes, sorry. There was an error in the sample data; I updated it. – owwoow14 Oct 08 '18 at 10:37
