31

Ok, so this is something completely stupid, but it is something I simply never learned to do and it's a hassle.

How do I specify a string that does not contain a sequence of other characters? For example, I want to match all lines that do NOT end in '.config'.

I would think that I could just do

.*[^(\.config)]$

but this doesn't work (why not?)

I know I can do

.*[^\.][^c][^o][^n][^f][^i][^g]$

but please please please tell me that there is a better way

George Mauer

7 Answers

55

You can use negative lookbehind, e.g.:

.*(?<!\.config)$

This matches all strings except those that end with ".config"
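As a minimal sketch of how the lookbehind behaves, here is the same pattern applied in Python. The sample file names are made up for illustration; only the pattern comes from the answer.

```python
import re

# Negative lookbehind: at the end of the string ($), assert that the
# seven characters just before that point are not ".config".
NOT_CONFIG = re.compile(r".*(?<!\.config)$")

# Hypothetical sample names, chosen to cover the interesting cases.
names = ["app.config", "readme.txt", "config", "web.config.bak"]

kept = [n for n in names if NOT_CONFIG.match(n)]
print(kept)  # ['readme.txt', 'config', 'web.config.bak']
```

Note that "config" (no leading dot) and "web.config.bak" are kept: the lookbehind only rejects strings whose final seven characters are exactly ".config".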

Manu
  • This works but .*(?!=\.config)$ does not - I thought the two syntaxes were equivalent. Any clue? – George Mauer Dec 28 '09 at 22:01
  • 2
    They are NOT equivalent. (?<!) matches the preceding string (look behind), while (?!) matches the following string (look ahead) – Manu Dec 28 '09 at 22:05
  • 1
    No, they are not. Negative lookahead is `(?!matchthis)`, and your example can't work because you're looking ahead at a moment when you're already at the end of the string (`$`). – Tim Pietzcker Dec 28 '09 at 22:07
  • 1
    Also, [why negating regex is "difficult"](http://www.perlmonks.org/?node_id=588315#588368). – Lazer May 23 '10 at 07:08
  • Actually, that link overstates the difficulty. Regex matching already produces a DFA from the regex, so the scary exponential expansion step referred to is already occurring every time you use the original regex. Once you've paid that price, it's straightforward to (1) complement the set of states that are considered Accepting states, and (2) declare success if you ever "fall off" the automaton by encountering a symbol in a state that has no transition for that symbol. – Doradus Dec 27 '14 at 03:37
36

Your question contains two questions, so here are a few answers.

Match lines that don't contain a certain string (say .config) at all:

^(?:(?!\.config).)*$\r?\n?

Match lines that don't end in a certain string:

^.*(?<!\.config)$\r?\n?

and, as a bonus: Match lines that don't start with a certain string:

^(?!\.config).*$\r?\n?

(each time including newline characters, if present).
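A rough sketch of the three patterns side by side in Python may help. It assumes `re.MULTILINE` so that `^` and `$` match at line boundaries, drops the trailing `\r?\n?` so that `findall` returns the bare lines, and uses a made-up sample text with `\n` line endings and no trailing newline.

```python
import re

# Hypothetical sample text: four lines, no trailing newline.
text = "app.config\n.configrc\nnotes.txt\nmy.config.old"

# Lines that do not contain ".config" anywhere.
no_contain = re.compile(r"^(?:(?!\.config).)*$", re.MULTILINE)
# Lines that do not end in ".config".
no_suffix = re.compile(r"^.*(?<!\.config)$", re.MULTILINE)
# Lines that do not start with ".config".
no_prefix = re.compile(r"^(?!\.config).*$", re.MULTILINE)

print(no_contain.findall(text))  # ['notes.txt']
print(no_suffix.findall(text))   # ['.configrc', 'notes.txt', 'my.config.old']
print(no_prefix.findall(text))   # ['app.config', 'notes.txt', 'my.config.old']
```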

Oh, and to answer why your version doesn't work: [^abc] means "any one (1) character except a, b, or c". Your other solution would also fail on test.hg (because it also ends in the letter g): your regex looks at each character individually instead of at the entire .config string. That's why you need lookaround to handle this.
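The failure modes described above can be checked with a short Python sketch. The two patterns are the ones from the question; the test strings other than test.hg are invented for illustration.

```python
import re

# The two attempts from the question.
char_class = re.compile(r".*[^(\.config)]$")
per_char = re.compile(r".*[^\.][^c][^o][^n][^f][^i][^g]$")

results = {
    # Should match (does not end in ".config"), but the last char is "g".
    "class/test.hg": bool(char_class.match("test.hg")),
    "per_char/test.hg": bool(per_char.match("test.hg")),
    # Also rejected: the per-character version needs at least 7 characters.
    "per_char/a.txt": bool(per_char.match("a.txt")),
    # A case where both happen to work.
    "class/notes.txt": bool(char_class.match("notes.txt")),
    "per_char/notes.txt": bool(per_char.match("notes.txt")),
}
for name, matched in results.items():
    print(name, matched)
```

Both patterns wrongly reject test.hg, and the per-character version additionally rejects any string shorter than seven characters.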

Tim Pietzcker
4
(?<!\.config)$

:)

watain
2

Unless you are "grepping" ... since you are not using the result of a match, why not search for the strings that do end in .config and skip them? In Python:

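import re
# search() rather than match(): match() only looks at the start of the
# string, so a pattern anchored with $ would never hit a longer name.
isConfig = re.compile(r'\.config$')
# List lst is given
filteredList = [f.strip() for f in lst if not isConfig.search(f.strip())]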

I suspect that this will run faster than a more complex re.
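For a fixed suffix like this one, the same filtering can also be sketched without any regex, using `str.endswith`. This is an alternative to the code above, not part of it, and `lst` here is a hypothetical sample list.

```python
# Hypothetical input: raw lines with stray whitespace and newlines.
lst = ["web.config\n", "notes.txt\n", "  app.config  ", "config.py\n"]

# Strip each entry once, then drop the ones ending in ".config".
filtered = [s for s in (f.strip() for f in lst) if not s.endswith(".config")]
print(filtered)  # ['notes.txt', 'config.py']
```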

Hamish Grubijan
  • 1
    Unless you are grepping, why use regex at all? Python has `in` for a reason. Other languages I'm sure have similar solutions. – Instance Hunter Dec 28 '09 at 22:00
  • Yeah this is what I do now, but it is best to know how to do it both ways. I've run into situations where this has forced some awkward syntax. – George Mauer Dec 28 '09 at 22:03
2

By using the [^] construct, you have created a negated character class, which matches any single character except those you have named. The order of characters in the class does not matter, so [^(\.config)] is the same as [^)gi.fnoc(], and your pattern fails on any string whose last character is one of ( . c o n f i g ).

Use negative lookahead (with Perl regexes), anchored at the start of the string, like so: ^(?!.*\.config$). This matches any string that does not end in the literal ".config".
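A lookahead is tested at the current position, so where it sits relative to the anchors matters. The following Python sketch compares an unanchored lookahead with one anchored at the start of the string; the sample names are made up.

```python
import re

# Unanchored: succeeds at the first position that is not followed by
# ".config" up to the end of the string, which is usually position 0.
unanchored = re.compile(r"(?!\.config$)")
# Anchored: from the start, assert the whole string does not end in ".config".
anchored = re.compile(r"^(?!.*\.config$)")

names = ["app.config", "notes.txt", "app.config.bak"]

loose = [n for n in names if unanchored.search(n)]
strict = [n for n in names if anchored.search(n)]
print(loose)   # ['app.config', 'notes.txt', 'app.config.bak']
print(strict)  # ['notes.txt', 'app.config.bak']
```

The unanchored form accepts app.config because the engine finds a position (the very first one) where ".config" followed by end of string does not come next.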

Andrew
2

As you have asked for a "better way": I would try a "filtering" approach. I think it is quite easy to read and to understand:

#!/usr/bin/perl

while(<>) {
    next if /\.config$/; # ignore the line if it ends with ".config"
    print;
}

As you can see I have used perl code as an example. But I think you get the idea?

added: this approach can also chain several filter patterns while remaining readable and easy to understand:

    next if /\.config$/; # ignore the line if it ends with ".config"
    next if /\.ini$/;    # ignore the line if it ends with ".ini"
    next if /\.reg$/;    # ignore the line if it ends with ".reg"

    # now we have filtered out all the lines we want to skip
    ... process only the lines we want to use ...
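The same chained-filter idea can be sketched in Python for comparison. The function name `wanted` and the sample lines are hypothetical; the three suffix patterns are the ones used in the Perl code above.

```python
import re

# One compiled pattern per kind of line we want to skip.
SKIP_PATTERNS = [re.compile(p) for p in (r"\.config$", r"\.ini$", r"\.reg$")]

def wanted(lines):
    """Yield each line (without its newline) that matches no skip pattern."""
    for line in lines:
        line = line.rstrip("\n")
        if any(p.search(line) for p in SKIP_PATTERNS):
            continue  # filtered out, like "next if ..." in the Perl version
        yield line

# Hypothetical input lines.
sample = ["web.config\n", "setup.ini\n", "notes.txt\n", "keys.reg\n", "config.py\n"]
result = list(wanted(sample))
print(result)  # ['notes.txt', 'config.py']
```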
0

I used Regexpal before finding this page and came up with the following solution when I wanted to check that a string doesn't contain a file extension:

^(.(?!\.[a-zA-Z0-9]{3,}))*$

I used the m checkbox option so that I could present many lines and see which of them did or did not match.

So, to find a string that doesn't contain another: "^(.(?!" + expression you don't want + "))*$"
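A small Python sketch of building such a pattern from a literal string follows. The helper name `not_containing` is hypothetical, and `re.escape` is used on the assumption that the unwanted text is a literal rather than a regex. The sketch puts the lookahead before the dot, as in `^(?:(?!\.config).)*$`, because with the dot first an occurrence at the very start of the string is never checked; the last line demonstrates that difference.

```python
import re

def not_containing(unwanted):
    """Compile a pattern matching strings that do not contain the literal `unwanted`."""
    return re.compile(r"^(?:(?!" + re.escape(unwanted) + r").)*$")

pat = not_containing(".config")
checks = {
    "notes.txt": bool(pat.match("notes.txt")),
    "my.config.old": bool(pat.match("my.config.old")),
    ".configrc": bool(pat.match(".configrc")),
}
print(checks)

# Dot-first variant: the lookahead is only tested after each consumed
# character, so ".config" at position 0 slips through.
dot_first = re.compile(r"^(.(?!" + re.escape(".config") + r"))*$")
leading_case = bool(dot_first.match(".configrc"))
print(leading_case)
```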

My article on the uses of this particular regex

Maslow