I have this string: `azjf8ee7Ldoge \n Hmeqze= AZ12D Fs \nsdfz14eZe148r`.
I want to match all lowercase characters except when it is an `e` followed by a digit (`e\d`) or when it is a backslash followed by `n` (`\\n`).
Based on the answers I found here:
How to negate specific word in regex?
Match everything except for specified strings
I managed to find a solution: `(?!(e\d|\\n))[a-z]`, which works well, except that it matches the `n` that comes after a backslash.
Link for a demo
How can I exclude an `n` that is preceded by a backslash from the match?
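A minimal reproduction of the problem, assuming the string is held in a Python raw string so that `\n` stays a literal backslash followed by `n` (the variable names are only illustrative). The lookahead is evaluated at the position of the letter itself, so at the `n` it sees `n `, not `\n`, and the `\\n` alternative never gets a chance to reject it:

```python
import re

# Raw string: "\n" here is two characters, a backslash and the letter n.
text = r"azjf8ee7Ldoge \n Hmeqze= AZ12D Fs \nsdfz14eZe148r"

# The pattern from the question.
pattern = re.compile(r"(?!(e\d|\\n))[a-z]")

# Collect every matched letter that sits directly after a backslash.
unwanted = [
    m.group()
    for m in pattern.finditer(text)
    if m.start() > 0 and text[m.start() - 1] == "\\"
]
print(unwanted)  # ['n', 'n']
```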

Wiktor Stribiżew

singrium
- `re.findall(r'e\d|\\n|([a-z])', text)`? Or are you replacing? Like `re.sub(r'(e\d|\\n)|[a-z]', r'\1', text)` ([demo](https://regex101.com/r/cU68PW/1))? – Wiktor Stribiżew Nov 18 '19 at 15:13
- @WiktorStribiżew, I am replacing, in fact. – singrium Nov 18 '19 at 15:13
- Like `re.sub(r'(e\d|\\n)|[a-z]', r'\1', text)` ([demo](https://regex101.com/r/cU68PW/1))? – Wiktor Stribiżew Nov 18 '19 at 15:14
- @WiktorStribiżew, thank you, that works. Please post it as an answer :-) – singrium Nov 18 '19 at 15:17
- Sorry, I added another lookaround-based solution following my logic to my answer. – Wiktor Stribiżew Nov 18 '19 at 15:27
3 Answers
To keep any `e` followed by a single digit and any `\n` two-char sequence, and to remove every lowercase ASCII letter in other contexts, you may use
re.sub(r'(e\d|\\n)|[a-z]', r'\1', text)
See the regex demo
Details
- `(e\d|\\n)` - matches and captures into Group 1 (referred to with the `\1` placeholder in the replacement pattern) an `e` and a single digit, or a `\` and an `n` char
- `|` - or
- `[a-z]` - a lowercase ASCII letter.
The `\1` restores the captured values in the result.
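As a rough sketch of the capture-and-restore approach on the string from the question (assuming it is stored as a raw string; `strip_lowercase` is a hypothetical helper name, not something from the thread):

```python
import re

# Raw string: "\n" is a literal backslash followed by "n".
text = r"azjf8ee7Ldoge \n Hmeqze= AZ12D Fs \nsdfz14eZe148r"

def strip_lowercase(s):
    """Remove lowercase ASCII letters, keeping e+digit and backslash+n pairs."""
    # Group 1 holds the sequences to keep; for a plain [a-z] match the group
    # does not participate, so \1 expands to an empty string (Python 3.5+).
    return re.sub(r"(e\d|\\n)|[a-z]", r"\1", s)

result = strip_lowercase(text)
print(result)  # 8e7L \n H= AZ12D F \n14Ze148
```

On Python versions older than 3.5, a reference to a group that did not participate in the match raises an error, so a callable replacement such as `lambda m: m.group(1) or ''` would be the safer choice there.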
If you want to play with lookarounds, you may use `[a-z](?<!e(?=\d))(?<!\\n)`:
re.sub(r'[a-z](?<!e(?=\d))(?<!\\n)', '', text)
The `[a-z](?<!e(?=\d))(?<!\\n)` pattern matches any ASCII lowercase letter (`[a-z]`) that is not an `e` followed by a digit (`(?<!e(?=\d))`) and is not an `n` preceded by a backslash (`(?<!\\n)`).
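A small check of the lookaround variant, again assuming the text is a raw string. Because this pattern has no capturing group, `re.findall` returns the matched letters directly, which makes it easy to see what gets removed:

```python
import re

text = r"azjf8ee7Ldoge \n Hmeqze= AZ12D Fs \nsdfz14eZe148r"

# Both lookbehinds run after [a-z] has consumed the letter:
#   (?<!e(?=\d))  rejects an "e" that is immediately followed by a digit
#   (?<!\\n)      rejects an "n" that is immediately preceded by a backslash
pattern = re.compile(r"[a-z](?<!e(?=\d))(?<!\\n)")

removed = "".join(pattern.findall(text))
cleaned = pattern.sub("", text)

print(removed)  # azjfedogemeqzessdfzer
print(cleaned)  # 8e7L \n H= AZ12D F \n14Ze148
```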

Wiktor Stribiżew
If you want to avoid matching the `n` of `\n`, you may add a negative lookbehind assertion to your regex:
(?!e\d|\\n)[a-z](?<!\\n)
`(?<!\\n)` is a negative lookbehind assertion that ensures the text ending at the current position, right after `[a-z]` has matched, is not `\n`.
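To see what the added lookbehind changes, here is a short comparison on a made-up sample string (the sample and the variable names are assumptions for illustration only):

```python
import re

# Hypothetical sample: an e+digit pair, a literal backslash+n, and a plain letter.
sample = r"e7 \n x"

# Lookahead only: the "e" of "e7" is skipped, but the "n" still matches.
only_lookahead = re.findall(r"(?!e\d|\\n)[a-z]", sample)

# Lookahead plus lookbehind: the "n" after the backslash is skipped as well.
with_lookbehind = re.findall(r"(?!e\d|\\n)[a-z](?<!\\n)", sample)

print(only_lookahead)   # ['n', 'x']
print(with_lookbehind)  # ['x']
```

Since `[a-z]` can never match at a position where a backslash comes next, the `\\n` alternative inside the lookahead has no effect on the result; it appears to be kept from the original pattern.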

anubhava
- I think the right answer should be: `(?!(e\d|\\n))[a-z](?<!\\n)` because the answer you proposed still matches **e followed by a digit** – singrium Nov 18 '19 at 15:46
You could match a char a-z and make use of lookarounds:
(?!e\d)[a-z](?<!\\[a-z])
In parts
- `(?!e\d)` Negative lookahead, assert what is on the right is not `e` followed by a digit
- `[a-z]` Match a char a-z
- `(?<!\\[a-z])` Negative lookbehind, assert what is on the left is not `\` followed by a char a-z
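One thing worth noting is that `(?<!\\[a-z])` protects every lowercase letter that directly follows a backslash, not only `n`. A small sketch on a hypothetical input that also contains `\t` shows the difference from a lookbehind restricted to `\n`:

```python
import re

# Hypothetical input with two backslash sequences: \t and \n (raw string).
sample = r"a\tb\nc"

# Lookbehind limited to backslash+n: the "t" after the backslash is removed.
narrow = re.sub(r"(?!e\d)[a-z](?<!\\n)", "", sample)

# Lookbehind for backslash+any lowercase letter: the "t" is kept too.
broad = re.sub(r"(?!e\d)[a-z](?<!\\[a-z])", "", sample)

print(narrow)  # \\n
print(broad)   # \t\n
```

On the string from the question both forms give the same result, since `n` is the only letter that follows a backslash there.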

The fourth bird