I'm working with long strings and I need to replace with '' every run of adjacent full stops (.) and/or colons (:), but only when the run is not adjacent to any whitespace. Examples:
a.bcd should give abcd
a..::.:::.:bcde.....:fg should give abcdefg
a.b.c.d.e.f.g.h should give abcdefgh
a .b should give a .b, because the . here is adjacent to a whitespace on its left, so it must not be replaced
a..::.:::.:bcde.. ...:fg should give abcde.. ...:fg, for the same reason
Well, here is what I tried (without any success).
Attempt 1:
import re
s1 = r'a.b.c.d.e.f.g.h'
re.sub(re.search(r'[^\s.:]+([.:]+)[^\s.:]+', s1).group(1), r'', s1)
I would expect to get 'abcdefgh', but what I actually get is ''. I understand why: the code
re.search(r'[^\s.:]+([.:]+)[^\s.:]+', s1).group(1)
returns '.' instead of '\.', and thus re.sub treats '.' as the usual regex wildcard (any character) rather than as a literal full stop to replace.
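To make the diagnosis above concrete, here is a minimal sketch comparing the unescaped pattern with one passed through re.escape. The variable names (sep, unescaped, escaped) are only illustrative, and this demonstrates the escaping issue alone: it removes just the one separator found by re.search, so it is not a general solution to the whitespace rule.

```python
import re

s1 = 'a.b.c.d.e.f.g.h'

# The captured separator is the plain string '.', not an escaped pattern.
sep = re.search(r'[^\s.:]+([.:]+)[^\s.:]+', s1).group(1)

# Used directly as a pattern, '.' matches any character, so everything is removed.
unescaped = re.sub(sep, '', s1)

# re.escape turns '.' into a pattern that matches only a literal full stop.
escaped = re.sub(re.escape(sep), '', s1)

print(repr(unescaped))  # ''
print(repr(escaped))    # 'abcdefgh'
```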
Attempt 2:
s1 = r'a.b.c.d.e.f.g.h'
re.sub(r'([^\s.:]*\S)[.:]+(\S[^\s.:]*)', r'\g<1>\g<2>', s1)
This doesn't work, as it returns a.b.c.d.e.f.gh.
Attempt 3:
s1 = r'a.b.c.d.e.f.g.h'
re.sub(r'([^\s.:]*)[.:]+([^\s.:]*)', r'\g<1>\g<2>', s1)
This works on s1, but it doesn't solve my problem, because on s2 = r'a .b' it returns a b rather than a .b.
Any suggestion?
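One direction that might fit the examples above is to state the "not adjacent to whitespace" rule with lookarounds instead of consuming the neighbouring characters. The following is only a sketch under assumptions: the helper name strip_inner_punct is hypothetical, and a run at the very start or end of the string is treated as removable, which the examples do not cover. Including . and : in both lookarounds forces a match to span a whole run, so part of a run that touches whitespace can never be removed on its own.

```python
import re

# A run of '.'/':' whose left and right neighbours are neither whitespace
# nor further '.'/':' characters (so the match is always a complete run).
RUN = re.compile(r'(?<![\s.:])[.:]+(?![\s.:])')

def strip_inner_punct(text):
    """Remove runs of '.'/':' that do not touch whitespace (hypothetical helper)."""
    return RUN.sub('', text)

# Input/output pairs taken from the examples in the question.
cases = {
    'a.bcd': 'abcd',
    'a..::.:::.:bcde.....:fg': 'abcdefg',
    'a.b.c.d.e.f.g.h': 'abcdefgh',
    'a .b': 'a .b',
    'a..::.:::.:bcde.. ...:fg': 'abcde.. ...:fg',
}

for src, expected in cases.items():
    print(repr(src), '->', repr(strip_inner_punct(src)))
```

Because the lookarounds are zero-width, the characters around each run are not consumed, so consecutive separators such as those in a.b.c.d.e.f.g.h can each be matched in turn.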