
The *, ?, and + characters all mean "match this character". Which character means "don't match this"? Examples would help.

null
Ali
  • For solution relating to not matching a word - See here http://stackoverflow.com/questions/406230 – null Jul 07 '15 at 06:26

4 Answers

114

You can use negated character classes to exclude certain characters: for example, [^abcde] will match any single character except a, b, c, d, or e.

Instead of specifying all the characters literally, you can use shorthands inside character classes: [\w] (lowercase) will match any "word character" (letters, digits, and underscore), while [\W] (uppercase) will match anything but word characters; similarly, [\d] will match the digits 0-9 while [\D] matches anything but the digits 0-9, and so on.
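To make the above concrete, here is a minimal Python sketch (Python is an assumption here; the sample string and variable names are made up for illustration). It shows a negated class next to the uppercase shorthands, which behave as the negations of their lowercase counterparts.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "cab-42_x!"

# [^abcde] matches any single character other than a, b, c, d, e.
not_a_to_e = re.findall(r"[^abcde]", text)

# \W is the negation of \w (note that underscore counts as a word character).
non_word = re.findall(r"\W", text)

# \D is the negation of \d.
non_digit = re.findall(r"\D", text)

print(not_a_to_e)  # ['-', '4', '2', '_', 'x', '!']
print(non_word)    # ['-', '!']
print(non_digit)   # ['c', 'a', 'b', '-', '_', 'x', '!']
```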

If you use PHP you can take a look at the regex character classes documentation.

Paolo Stefan
  • Any way to do this without using negated character classes? I dislike having to use an entire class when it is just for one character. – Cole Feb 22 '21 at 08:03
  • @Cole - you've probably long-since figured this out, but for the next person, to negate a single letter, you would just use that in place of the character class. e.g. `[^S]` would match anything that isn't an uppercase `S`. It's clumsy, but there it is. – ThatBlairGuy Jun 16 '22 at 15:27
113

There are two ways to say "don't match": negated character classes, and zero-width negative lookahead/lookbehind.

The former: don't match a, b, c or 0: [^a-c0]

The latter: match any three-letter string except foo and bar:

(?!foo|bar).{3}

or

.{3}(?<!foo|bar)
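A quick way to check both patterns is to run them over a few three-letter strings. The sketch below assumes Python's `re` module; the word list and variable names are illustrative only. Python requires lookbehind alternatives to have a fixed width, which holds here because `foo` and `bar` are both three characters long.

```python
import re

# Hypothetical test words, chosen only for illustration.
words = ["foo", "bar", "baz", "fox"]

# Negative lookahead: refuse to start matching where "foo" or "bar" begins.
lookahead = re.compile(r"(?!foo|bar).{3}")

# Negative lookbehind: after consuming three characters, refuse the match
# if those characters were "foo" or "bar".
lookbehind = re.compile(r".{3}(?<!foo|bar)")

ahead_hits = [w for w in words if lookahead.fullmatch(w)]
behind_hits = [w for w in words if lookbehind.fullmatch(w)]

print(ahead_hits)   # ['baz', 'fox']
print(behind_hits)  # ['baz', 'fox']
```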

Also, a correction for you: *, ? and + do not actually match anything. They are repetition operators, and always follow a matching operator. Thus, a+ means match one or more of a, [a-c0]+ means match one or more of a, b, c or 0, while [^a-c0]+ would match one or more of anything that wasn't a, b, c or 0.
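The three repetition examples can be tried side by side. This is a small sketch assuming Python's `re` module, with a made-up input string; in each pattern the `+` only repeats whatever matcher precedes it.

```python
import re

# Hypothetical input, chosen only for illustration.
text = "aaab0c-xyz-cab"

# a+ : one or more of the literal character a.
runs_of_a = re.findall(r"a+", text)

# [a-c0]+ : one or more characters drawn from a, b, c, or 0.
runs_in_class = re.findall(r"[a-c0]+", text)

# [^a-c0]+ : one or more characters that are none of a, b, c, or 0.
runs_outside_class = re.findall(r"[^a-c0]+", text)

print(runs_of_a)           # ['aaa', 'a']
print(runs_in_class)       # ['aaab0c', 'cab']
print(runs_outside_class)  # ['-xyz-']
```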

Amadan
67

A ^ as the first character inside [ ] negates the character class, whereas a ^ outside a class means "beginning of string".

[^a-z] matches any single character that is not in the range "a" to "z"

^[a-z] means the string starts with a character from "a" to "z"
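The difference between the two uses of ^ is easy to see in a short test. This sketch assumes Python's `re` module; the input strings and variable names are illustrative.

```python
import re

negated = re.compile(r"[^a-z]")   # ^ inside the brackets: negation
anchored = re.compile(r"^[a-z]")  # ^ outside the brackets: start of string

# The first character that is NOT a lowercase letter in "abc1" is "1".
first_non_lower = negated.search("abc1")

# "abc1" starts with a lowercase letter, so the anchored pattern matches "a".
starts_lower = anchored.search("abc1")

# "Abc1" starts with an uppercase letter, so the anchored pattern fails.
starts_upper = anchored.search("Abc1")

print(first_non_lower.group())  # 1
print(starts_lower.group())     # a
print(starts_upper)             # None
```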

Reference

xkeshav
11

Use ^ at the beginning of a character class, or use negative lookahead/lookbehind assertions.

>>> import re
>>> re.match('[^f]', 'foo')
>>> re.match('[^f]', 'bar')
<_sre.SRE_Match object at 0x7f8b102ad6b0>
>>> re.match('(?!foo)...', 'foo')
>>> re.match('(?!foo)...', 'bar')
<_sre.SRE_Match object at 0x7f8b0fe70780>
Ignacio Vazquez-Abrams
  • Do you have to use `?!` in the last 2 examples or can you just use `!` by itself? What does `?` do there? – Ali May 08 '11 at 05:19
  • Python needs the `?` in order to tell that it's an extension. Other regex engines may have their own rules. – Ignacio Vazquez-Abrams May 08 '11 at 05:21
  • @Click: It's pretty standard. http://www.regular-expressions.info/refadv.html, also most regexp engine manuals say the same thing. – Amadan May 08 '11 at 05:25