Subtract from a character class

Asked Jan 02 '21 at 23:00

Active Jan 02 '21 at 23:00

Viewed 18 times

Is there any way to subtract characters from an existing character class shorthand? For example, let's say I want [a-zA-Z0-9], and the closest shorthand is \w, which expands to [a-zA-Z0-9_]. Is there a way to do something like:

[\w-_]

Or do we just have to rewrite the class in full, i.e., [a-zA-Z0-9]?

asked Jan 02 '21 at 23:00

David542



`[^\W_]`. Or `(?!_)\w` – Wiktor Stribiżew Jan 02 '21 at 23:02
@WiktorStribiżew thanks. When writing a regex, do you prefer using `[a-zA-Z0-9]` or one of your suggestions? Or does it not matter to you? – David542 Jan 02 '21 at 23:04

In Python, it matters a lot. If I need my regex to match Unicode letters or digits, I'd use `[^\W_]`. If I only have to match ASCII letters and digits, I'd use `[A-Za-z0-9]`. Actually, you are wrong in saying "*`\w`... expands to `[a-zA-Z0-9_]`*". By default it matches many more characters (unless you use the `(?a)` inline modifier or the `re.ASCII` flag). See [this answer of mine](https://stackoverflow.com/a/64794482/3832970) for more details. – Wiktor Stribiżew Jan 02 '21 at 23:06
@WiktorStribiżew that's great, thank you very much, especially for the `\w` comment. – David542 Jan 02 '21 at 23:08
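
For anyone landing here later, the suggestions from the comments can be sketched in Python roughly as follows. This is only an illustration, not a definitive answer: the pattern names and the sample string are made up for the example, and only the standard `re` module is assumed.

```python
import re

# Hypothetical sample text: an underscore-joined identifier plus a word
# ending in a non-ASCII letter (U+00E9, "e" with acute accent).
sample = "foo_bar42 caf\u00e9"

# "\w minus underscore", option 1: negate the class of
# "non-word characters or underscore".
NEGATED_CLASS = re.compile(r"[^\W_]+")

# Option 2: a negative lookahead rejects "_" before each \w is consumed.
LOOKAHEAD = re.compile(r"(?:(?!_)\w)+")

# With re.ASCII, \w and \W are limited to ASCII, so the same pattern
# behaves like the explicit class below.
ASCII_ONLY = re.compile(r"[^\W_]+", re.ASCII)
EXPLICIT = re.compile(r"[A-Za-z0-9]+")

unicode_matches = NEGATED_CLASS.findall(sample)
lookahead_matches = LOOKAHEAD.findall(sample)
ascii_matches = ASCII_ONLY.findall(sample)
explicit_matches = EXPLICIT.findall(sample)

print(unicode_matches)    # Unicode-aware: keeps the accented letter
print(lookahead_matches)  # same result as the negated class
print(ascii_matches)      # ASCII-only: stops before the accented letter
print(explicit_matches)   # same result as the ASCII-only pattern
```

Which form to pick depends on whether Unicode letters and digits should count: the negated class follows whatever `\w` means under the current flags, while `[A-Za-z0-9]` is always ASCII-only.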


0 Answers