Is there any way to subtract characters from an existing shorthand character class? For example, let's say I want to get [a-zA-Z0-9], and the closest is \w, which expands to [a-zA-Z0-9_]. Is there a way to do something like:

  • [\w-_]

Or do we just have to rewrite the class, i.e., [a-zA-Z0-9]?

David542
  • `[^\W_]`. Or `(?!_)\w` – Wiktor Stribiżew Jan 02 '21 at 23:02
  • @WiktorStribiżew thanks. When writing a regex, do you prefer using `[a-zA-Z0-9]` or one of your suggestions? Or does it not matter to you? – David542 Jan 02 '21 at 23:04
  • In Python, it matters a lot. If I need my regex to match Unicode letters or digits, I'd use `[^\W_]`. If I only have to match ASCII letters and digits, I'd use `[A-Za-z0-9]`. Actually, you are wrong in saying "*`\w`... expands to `[a-zA-Z0-9_]`*". By default it matches many more characters (unless you use the `(?a)` inline modifier or the `re.ASCII` flag). See [this answer of mine](https://stackoverflow.com/a/64794482/3832970) for more details. – Wiktor Stribiżew Jan 02 '21 at 23:06
  • @WiktorStribiżew that's great, thank you very much, especially for the `\w` comment. – David542 Jan 02 '21 at 23:08
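
To make the comments above concrete, here is a minimal sketch in Python of the suggested patterns. The sample string and variable names are made up for illustration. It assumes Python 3 `str` patterns, where `\w` is Unicode-aware unless `re.ASCII` is set.

```python
import re

# Hypothetical sample text: underscores plus two non-ASCII letters (U+00DC, U+00EF).
text = "foo_bar \u00dcn\u00ef9 x_1"

# Double negation: "anything that is not a non-word char and not '_'",
# i.e. \w with the underscore removed.
no_underscore = re.compile(r"[^\W_]+")

# Lookahead variant: a word character that is not '_'.
lookahead = re.compile(r"(?:(?!_)\w)+")

# Explicit ASCII-only class.
ascii_only = re.compile(r"[A-Za-z0-9]+")

# Same double negation, but with \w and \W restricted to ASCII.
no_underscore_ascii = re.compile(r"[^\W_]+", re.ASCII)

unicode_matches = no_underscore.findall(text)
lookahead_matches = lookahead.findall(text)
ascii_matches = ascii_only.findall(text)
ascii_flag_matches = no_underscore_ascii.findall(text)

print(unicode_matches)
print(lookahead_matches)
print(ascii_matches)
print(ascii_flag_matches)
```

The first two patterns keep the accented word intact, while the ASCII-only class and the `re.ASCII` variant split it at the non-ASCII letters. Which one fits depends on whether Unicode letters and digits should count.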
