What does this pattern (?<=\w)\W+(?=\w) mean in a Python regular expression?

Question

What does this pattern (?<=\w)\W+(?=\w) mean in a Python regular expression?

#l is a list 
print(re.sub("(?<=\w)\W+(?=\w)", " ", l))

Are you *sure* `l` is a list? If so, that call can't possibly work. The third argument to `re.sub` is supposed to be a single string, which is what the substitution works on. — Blckknght, Aug 08 '21 at 00:36

sj95126 · Accepted Answer · 2021-08-08T00:41:02.207

4

Here's a breakdown of the elements:

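\w means a "word" character: a letter, a digit, or an underscore
\W is the opposite of \w; with the + it means one or more non-word characters
(?<=...) is a "lookbehind assertion": the text immediately before the current position must match, but it is not part of the match
(?=...) is a "lookahead assertion": the text immediately after must match, but it is not part of the match either

So this re.sub call means "wherever a run of one or more non-word characters has a word character immediately before and immediately after it, replace that run with a single space". The word characters on either side are only checked, not replaced.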
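As a rough illustration of that breakdown, here is a small sketch. The sample strings are made up for this example, and the pattern is written as a raw string so the backslashes reach the regex engine unchanged.

```python
import re

pattern = r"(?<=\w)\W+(?=\w)"

# A run of non-word characters between two word characters becomes one space.
inner = re.sub(pattern, " ", "foo--bar...baz")

# Leading and trailing runs lack a word character on one side, so they stay.
edges = re.sub(pattern, " ", "--foo--bar--")

# The underscore counts as a word character, so it is never replaced.
underscore = re.sub(pattern, " ", "snake_case!!name")

print(inner)       # foo bar baz
print(edges)       # --foo bar--
print(underscore)  # snake_case name
```

The second case is the one the lookarounds exist for: without them, the runs at the start and end of the string would be replaced as well.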

And by the way, the third argument to re.sub must be a string (or bytes-like object); it can't be a list.
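If the data really were a list, one possible approach is to apply the substitution to each element in turn. The sketch below uses a hypothetical list named items; it first shows that passing the list itself raises a TypeError, then cleans each string separately.

```python
import re

pattern = r"(?<=\w)\W+(?=\w)"
items = ["a--b", "c..d"]  # hypothetical sample data, not from the question

# Passing the whole list is rejected, because re.sub expects a string.
try:
    re.sub(pattern, " ", items)
except TypeError as exc:
    error = exc
else:
    error = None

# Applying the substitution to each string works as expected.
cleaned = [re.sub(pattern, " ", s) for s in items]

print(type(error).__name__)  # TypeError
print(cleaned)               # ['a b', 'c d']
```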

edited Aug 08 '21 at 00:41

answered Aug 08 '21 at 00:38


Thank you very much for your explanation. Yes, it is a string; I was wrong. Thanks – Deyaa Aug 08 '21 at 13:20

MDR · Answer 2 · 2021-08-08T00:45:44.590

Just put it into a site like regex101.com and hover the cursor over the parts.

https://regex101.com/r/JtrWIw/1

It matches a run of non-word characters that sits between two word characters. For example, in the string below it matches everything between the final 'd' of the first 'word' and the initial 'w' of the second 'word':

word^&*((*&^%$%^&*& ^%$£%^&**&^%$£!"£$%^&*()word
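To check that without leaving Python, a quick sketch along these lines should do. The variable names are only illustrative; the string is the one shown above, written as a single-quoted raw string because it contains a double quote.

```python
import re

pattern = r"(?<=\w)\W+(?=\w)"
text = r'word^&*((*&^%$%^&*& ^%$£%^&**&^%$£!"£$%^&*()word'

# findall returns every run the pattern matches; here there is exactly one.
matches = re.findall(pattern, text)

# Replacing that single run with a space leaves only the two words.
result = re.sub(pattern, " ", text)

print(len(matches))  # 1
print(result)        # word word
```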

Example:

import re

# Suppose l really is a list of strings
l = ['John Smith', 'This%^&*(string', 'Never!£$Mind^&*I$?/Solved{}][]It']

# re.sub needs a single string, so pass one element rather than the whole list
print(re.sub(r"(?<=\w)\W+(?=\w)", " ", l[2]))

Never Mind I Solved It
