Regular expression for a string that does not start with a sequence

Question

I'm processing a bunch of tables using this program, but I need to ignore ones that start with the label "tbd_".

So far I have something like [^tbd_], but that simply does not match those characters.

How does SchemaSpy work? Are you passing it a list of table names or are you passing it a regex and it's doing the matching? — Mark Biek, May 22 '09 at 18:57
I'm passing a regex (it's the -i flag) and it'll import the matches, or so it says in any case =) — echoblaze, May 22 '09 at 19:10
@echoblaze: If you’re processing XML, why don’t you use an XML parser? That would be much easier than using regular expressions. — Gumbo, May 22 '09 at 19:25

score 455 · Accepted Answer · edited Apr 11 '21 at 15:28

You could use a negative look-ahead assertion:

^(?!tbd_).+
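
For what it's worth, here is a minimal sketch of the look-ahead version in Python's `re` module. The table names are made up for illustration and have nothing to do with the real schema; only the pattern comes from above.

```python
import re

# Negative look-ahead: at the start of the string, assert that
# the literal text "tbd_" does NOT come next, then consume the rest.
NOT_TBD = re.compile(r"^(?!tbd_).+")

# Hypothetical table names, for illustration only.
tables = ["users", "tbd_users", "tbd", "orders_tbd_", "t"]

# Keep only the names the pattern accepts.
kept = [name for name in tables if NOT_TBD.match(name)]
print(kept)  # ['users', 'tbd', 'orders_tbd_', 't']
```

Note that "tbd" and "orders_tbd_" are kept: only a leading "tbd_" is rejected. Because of the `.+`, the empty string is rejected as well.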

Or a negative look-behind assertion:

(^.{1,3}$|^.{4}(?<!tbd_).*)
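
A rough way to check the look-behind version, again in Python (an assumption on my part; the engine has to support look-behind, and Python's only allows fixed-width ones, which `tbd_` is). The helper name `accepts` and the sample strings are hypothetical.

```python
import re

# First branch: strings of 1 to 3 characters are too short to start with "tbd_".
# Second branch: consume exactly 4 characters, then look back and require
# that those 4 characters were not "tbd_".
LOOKBEHIND = re.compile(r"(^.{1,3}$|^.{4}(?<!tbd_).*)")


def accepts(name: str) -> bool:
    """Return True if the pattern matches at the start of name."""
    return LOOKBEHIND.match(name) is not None


for sample in ["ab", "tbd", "tbdx", "users", "tbd_", "tbd_users"]:
    print(f"{sample!r}: {accepts(sample)}")
```

With this pattern "tbd_" and "tbd_users" should be rejected and the other samples accepted.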

Or just plain old character sets and alternations:

^([^t]|t($|[^b]|b($|[^d]|d($|[^_])))).*
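
The nesting in that last pattern follows a mechanical rule: at each position, either the character differs from the prefix, or it matches and the string ends there or the rest differs. A sketch of that construction in Python, assuming a hypothetical helper `not_starting_with` (not part of any library), which may help with engines that lack lookaround:

```python
import re


def not_starting_with(prefix: str) -> str:
    """Build a lookaround-free pattern for non-empty strings
    that do not start with the given prefix (hypothetical helper)."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    inner = None
    # Work from the last prefix character outwards.
    for ch in reversed(prefix):
        e = re.escape(ch)
        if inner is None:
            # Last character: anything but that character.
            inner = f"[^{e}]"
        else:
            # Either a different character here, or the same character
            # followed by end of string or a later mismatch.
            inner = f"[^{e}]|{e}($|{inner})"
    return f"^({inner}).*"


pattern = not_starting_with("tbd_")
print(pattern)

# Hypothetical sample names, for illustration only.
samples = ["t", "tb", "tbd", "tbdx", "users", "tbd_", "tbd_users"]
print([s for s in samples if re.match(pattern, s)])
```

For the prefix "tbd_" this reproduces the pattern above character for character, and it rejects only "tbd_" and "tbd_users" from the sample list.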

Answered May 22 '09 at 18:57 by Gumbo; last edited by Nuno André.

Is this restricted to any particular regex engines? – Mark Biek May 22 '09 at 18:59
I only ask because that second one still seems to match tbd_ in my test. The first one is great though. – Mark Biek May 22 '09 at 19:01
Take a look at regular-expressions.info’s flavor comparison: http://www.regular-expressions.info/refflavors.html – Gumbo May 22 '09 at 19:02
@Gumbo - should that not end .* instead of .+? A string that is tbd_ also starts with that... therefore by definition doesn't need to be followed by any other characters? Otherwise, good example. It *does* require a regex engine that supports lookaround though. – BenAlabaster May 22 '09 at 19:04
@balabaster: I don’t think he’s looking for empty strings. But if so, he can easily change that by replacing the `.+` by `.*` – Gumbo May 22 '09 at 19:07
Not looking for empty strings, thanks for the help! For some reason it's not working with my regex checker: http://www.bastian-bergerhoff.com/eclipse/features/web/QuickREx/toc.html but I'll finish the xml script and see how that goes – echoblaze May 22 '09 at 19:16
A little typo: the second one is a negative look-behind assertion. – PhiLho May 22 '09 at 19:32
I am having issues with the first example using egrep, python's re, as well as a few online regex parsers (e.g. [Gskinner](http://gskinner.com/RegExr/)). It looks like these don't like a look-ahead with nothing preceding it. Anyone else seeing this issue? – michaelxor Oct 22 '12 at 19:12
Actually, after playing around a little more it looks like egrep is the only one that really doesn't like example #1. I am able to make this work with python's re or Gskinner as long as the look-ahead is not the ONLY matchable thing in the pattern (i.e. this does not work: '^(?!somestring)', but this does: '^(?!somestring).+'). – michaelxor Oct 22 '12 at 20:26
This works in Visual Studio 2015 Find in Files. This expression uses both a negative look-ahead and a negative look-behind to find all C++ identifiers that start with "do_", but do not start with "do_Mode" and do not end with "Desc" or "Msg": "do_(?!Mode)(_\w+|[\w-[0-9_]]\w*(?<!Desc|Msg))\b" – Scott Hutchinson May 11 '17 at 22:17
"Use a negative look-ahead assertion", I am sure I've been on the receiving end of a few of those. – Davos Aug 14 '19 at 01:19
Be careful with look-ahead/look-behind: it's not supported by all browsers! This bit me in the butt. – Colin Rosati Oct 22 '22 at 13:35
Thank you! The top one was the solution to my issue. I was looking for all lines that don't start with "clamp" in Notepad++, and the regex `^(?!clamp).*$` saved the day. – Eric Hepperle - CodeSlayer2010 May 25 '23 at 14:13
