452

I have a regular expression as follows:

^/[a-z0-9]+$

This matches strings such as /hello or /hello123.

However, I would like it to exclude a couple of string values such as /ignoreme and /ignoreme2.

I've tried a few variants but can't seem to get any to work!

My latest feeble attempt was

^/(((?!ignoreme)|(?!ignoreme2))[a-z0-9])+$
Francesco - FL
romiem

8 Answers

539

Here's yet another way (using a negative look-ahead):

^/(?!ignoreme|ignoreme2|ignoremeN)([a-z0-9]+)$ 

Note: There's only one capturing expression: ([a-z0-9]+).
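To see how this behaves, here is a minimal sketch in Python with the standard `re` module. The test paths are made up for illustration, and the placeholder `ignoremeN` alternative is left out. Note that the lookahead only looks at what comes right after the slash, so a path that merely starts with an ignored word is rejected too.

```python
import re

# The pattern from above, without the placeholder ignoremeN alternative.
pattern = re.compile(r"^/(?!ignoreme|ignoreme2)([a-z0-9]+)$")

# Hypothetical test paths, chosen for illustration only.
paths = ["/hello", "/hello123", "/ignoreme", "/ignoreme2", "/ignoremenot"]

results = {}
for path in paths:
    m = pattern.match(path)
    # group(1) is the single capturing group: the text after the slash.
    results[path] = m.group(1) if m else None

print(results)
# /hello and /hello123 are captured; the three /ignoreme... paths give None.
```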

Seth
  • 1
    Brilliant, that seems to have done the trick. I actually need this rule for url rewriting and I wanted to ignore the "images", "css" and "js" folder. So my rule is as follows: ^/(?!css|js|images)([a-z]+)/?(\?(.+))?$ and it rewrites to /Profile.aspx?id=$1&$3 Will this rule work correctly and propagate the query string too? So if someone visits http://mydomain.com/hello?abc=123 I'd like it to rewrite to http://mydomain.com/Profile.aspx?id=hello&abc=123 I'm also a bit unsure about the performance of (.+) at the end to capture the querystring in the original request. – romiem Jan 16 '10 at 21:32
  • Sounds like this is another question. The regexp that you have looks like it will capture the query string -- test and see if your query string comes along. Also - `(\?(.+))?$` should be fast. I wouldn't worry too much about speed. – Seth Jan 17 '10 at 20:25
  • 2
    This didn't work for me, while Alix Axel's solution did work. I'm using Java's `java.util.regex.Pattern` class. – Mark Jeronimus Jun 20 '13 at 18:27
  • 2
    I confirm Mark's reMark ;) - for example, Pycharm is Java-based, isn't it? So, considering regexes in Pycharm search Alix's solution works, the other does not. – fanny Sep 16 '16 at 14:13
  • Best answer. Worked even in the regexp_replace function in MySQL. – Abdullah Khawer Jul 08 '22 at 19:07
  • I get "pattern error" https://regex101.com/r/g5m7tZ/1 – Black Dec 13 '22 at 12:35
  • 1
    @Black, that site has a setting for treating `/` as a delimiter (though I don't know what it's delimiting, and am not sure what that feature is for). Here, the `/` is part of the expression, so the site is getting confused. Try setting the delimiter on the site to something else. Then try a test string that the pattern will match, like `/hello`. – Seth Dec 15 '22 at 02:04
58

This should do it:

^/\b([a-z0-9]+)\b(?<!ignoreme|ignoreme2|ignoreme3)

You can add as many ignored words as you like. Here is a simple PHP implementation:

$ignoredWords = array('ignoreme', 'ignoreme2', 'ignoreme...');

preg_match('~^/\b([a-z0-9]+)\b(?<!' . implode('|', array_map('preg_quote', $ignoredWords)) . ')~i', $string);
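
For comparison, a rough Python port of the same idea. As far as I know, Python's `re` module wants every lookbehind to be fixed-width and rejects an alternation of different lengths, so this sketch chains one negative lookbehind per word instead. The `first_segment` helper is a hypothetical name used only for this example.

```python
import re

ignored_words = ["ignoreme", "ignoreme2", "ignoreme3"]

# One fixed-width negative lookbehind per word; re.escape plays the
# role of preg_quote in the PHP version.
lookbehinds = "".join(f"(?<!{re.escape(word)})" for word in ignored_words)
pattern = re.compile(rf"^/\b([a-z0-9]+)\b{lookbehinds}", re.IGNORECASE)


def first_segment(path):
    """Return the captured word after the slash, or None if it is ignored."""
    m = pattern.match(path)
    return m.group(1) if m else None


print(first_segment("/hello"))      # hello
print(first_segment("/ignoreme2"))  # None
print(first_segment("/xignoreme"))  # None: the word merely ends in an ignored word
```

The last call shows the limitation of the lookbehind approach: any word that ends with an ignored word is rejected as well.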
JSON
Alix Axel
  • i thought look-behind requires a fixed-width pattern? – simon Sep 16 '13 at 09:14
  • 4
    @AlixAxel It does, but smarter regex libs will allow an alternation with varying lengths for the alternatives (and use the longest), as long as each alternative is of fixed length. – ChrisF Dec 22 '14 at 03:33
  • This is smart, but it fails for me if the ignored word is at the end of any other word, i.e. if you add 'a' as one of the ignored words, then any word that ends in a is ignored – singmotor Apr 10 '17 at 15:19
27

As you want to exclude both words, you need a conjunction:

^/(?!ignoreme$)(?!ignoreme2$)[a-z0-9]+$

Now both conditions must be true (neither ignoreme nor ignoreme2 is allowed) to have a match.
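
A small Python sketch of the difference the `$` inside each lookahead makes, compared with a lookahead that only checks the prefix. The `compare` helper and the test paths are hypothetical, just for illustration.

```python
import re

# Lookahead on the prefix only: rejects anything that starts with an ignored word.
prefix_lookahead = re.compile(r"^/(?!ignoreme|ignoreme2)[a-z0-9]+$")

# Two anchored lookaheads: rejects only the exact words ignoreme and ignoreme2.
anchored_lookaheads = re.compile(r"^/(?!ignoreme$)(?!ignoreme2$)[a-z0-9]+$")


def compare(path):
    """Return (prefix version matches, anchored version matches)."""
    return (
        bool(prefix_lookahead.match(path)),
        bool(anchored_lookaheads.match(path)),
    )


print(compare("/hello"))        # (True, True)
print(compare("/ignoreme"))     # (False, False)
print(compare("/ignoreme2"))    # (False, False)
print(compare("/ignoremenot"))  # (False, True)
```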

Gumbo
  • 1
    This is equivalent to the shorter one above that is a negative lookahead of a set of alternatives. – ChrisF Dec 22 '14 at 03:32
  • 5
    @ChrisF No, not really. Seth’s solution would not match something like `/ignoremenot` as the `/` is followed by `ignoreme`. – Gumbo Dec 22 '14 at 07:16
26

This excludes all rows containing ignoreme from the search results. It works no matter which other characters the row contains.

^((?!ignoreme).)*$
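
A minimal Python sketch of this as a line filter. The sample lines are made up. The pattern consumes one character at a time and checks before each one that ignoreme does not start there, so a line matches only if the word appears nowhere in it.

```python
import re

pattern = re.compile(r"^((?!ignoreme).)*$")

# Hypothetical input rows, for illustration only.
lines = ["/hello", "please ignoreme now", "/ignoreme2", "nothing to see"]

# Keep only the rows that do not contain "ignoreme" anywhere.
kept = [line for line in lines if pattern.match(line)]

print(kept)  # ['/hello', 'nothing to see']
```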
  • When I enable the multiline flag, the caret (`^`) will match beginning of the line instead of being an exclude character. Any way to maintain the exclude ability while multiline is enabled? – Alaa M. Oct 25 '22 at 13:05
4

This worked for me: ^((?!\ignoreme1\b)(?!\ignoreme2\b)(?!\ignoreme3\b).)*$

  • Thank you! This worked for me in HomeAssistant's `entity-filter` card to hide entities that had specific state attributes. – MayeulC Mar 21 '23 at 16:49
  • To add: though I think ``\`` before `ignore` is useless at best. And it's weird that you used `\b` after but not before: this will exclude `abcignoreme1` but not `ignoreme1abc`. – MayeulC Mar 21 '23 at 16:55
1

This worked for me in Python 3.x with scikit-learn's make_column_selector in a machine learning pipeline, for including and excluding certain columns of a DataFrame. To exclude columns, use ^(?!(col2|co4|col6)).*$

from sklearn.compose import make_column_selector

categorical_selector = make_column_selector(pattern="(col2|co4|col6)")
numeric_selector = make_column_selector(pattern="^(?!(col2|co4|col6)).*$")
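
To check the two patterns without a DataFrame, here is a rough sketch using plain `re.search` on a list of column names. This assumes the selector keeps every column whose name contains a match of the pattern, which is how I read the scikit-learn docs. The column names are hypothetical.

```python
import re

# Hypothetical column names, for illustration only.
columns = ["col1", "col2", "col3", "co4", "col5", "col6"]

include_pattern = r"(col2|co4|col6)"
exclude_pattern = r"^(?!(col2|co4|col6)).*$"

# Assumption: a column is selected when its name contains a match.
included = [c for c in columns if re.search(include_pattern, c)]
remaining = [c for c in columns if re.search(exclude_pattern, c)]

print(included)   # ['col2', 'co4', 'col6']
print(remaining)  # ['col1', 'col3', 'col5']
```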
user2557522
0

This works:

(Request(?!\.Cookies|\.Form\b))+

It matches any occurrence of Request that is NOT followed by .Cookies or .Form. The dot before Form is escaped so that it only matches a literal period.

So:

  • Will match Request(
  • Will match Request.Querystring
  • Won't match Request.Form
  • Won't match Request.Cookies
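
The four cases above can be checked with a short Python sketch. It uses the pattern with the dot before Form escaped, so that it only stands for a literal period. The extra `Request.Formula` sample is my own addition, to show what the `\b` does.

```python
import re

pattern = re.compile(r"(Request(?!\.Cookies|\.Form\b))+")

samples = [
    "Request(",
    "Request.Querystring",
    "Request.Form",
    "Request.Cookies",
    "Request.Formula",  # hypothetical extra case: \b keeps this one matching
]

# search() is used because the pattern is not anchored.
matched = [s for s in samples if pattern.search(s)]

print(matched)  # ['Request(', 'Request.Querystring', 'Request.Formula']
```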

For a detailed explanation of this regex, paste it into https://www.regextester.com/

AlexLaforge
-1

Simpler:

import re
re.findall(r'/(?!ignoreme)(\w+)', "/hello /ignoreme and /ignoreme2 /ignoreme2M.")

You will get:

['hello']
pabloverd