
I am parsing a feed and need to exclude fields that contain the word "bad" as a whole word, in any combination of upper and lower case.

For example, "bad", "Bad id", or "user has bAd id" would not pass the regular expression test, but "xxx Badlands ddd" or "aaabad" would pass.

user2864740
Brian
  • probably useful: http://stackoverflow.com/questions/406230/regular-expression-to-match-string-not-containing-a-word?rq=1 – Marc B Aug 22 '14 at 18:00
  • What delimits the word "bad"? You'll probably see a lot of `\bbad\b`, but that's not really correct. A regex "word" is not really a language word. –  Aug 22 '14 at 18:01
  • This question is somewhat interesting on two grounds (although I'm not sure it warrants being a non-duplicate of the linked question): 1) It asks for a *negative/inverted* result on the test ("would not pass"), 2) It asks only for finding "bad" as a whole/distinct word (such that "bad" and "badlands" yield different results). – user2864740 Aug 22 '14 at 18:12
  • In what programming language are you using regular expressions? What have you tried already? Please elaborate in your question. – Ruud Helderman Aug 22 '14 at 18:15
  • @Brian Regular expressions are generally better (easier to understand) when written in the "would pass/match" case - is it absolutely vital that the negated logic is *inside* the regular expression, or could `!match(re)` (to invert the result) simply be used in the programming language? – user2864740 Aug 22 '14 at 18:17
  • I am voting to close as a duplicate. I changed the example in the answer to the accepted answer of the linked question to `^((?!\bbad\b).)*$` (where `\bbad\b` is trivially "word-boundary,b,a,d,word-boundary") and it works per the rules in the question. Make sure to use the `/i` flag. – user2864740 Aug 22 '14 at 18:48
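
To make the comments above concrete, here is a minimal JavaScript sketch of the negative-lookahead pattern quoted in the last comment, run against the strings from the question. The helper name `passesFilter` and the two extra samples ("bad_id", "bad-id") are assumptions added for illustration; they show how `\b` treats an underscore as a word character but a hyphen as a delimiter, which is the concern raised about what really delimits a word.

```javascript
// Pattern from the comment above, with the i flag for case-insensitivity.
// It matches only strings in which no position starts the whole word "bad".
const noBadWord = /^((?!\bbad\b).)*$/i;

// Hypothetical helper: true means the field is kept, false means it is excluded.
function passesFilter(field) {
  return noBadWord.test(field);
}

const samples = [
  "bad",
  "Bad id",
  "user has bAd id",
  "xxx Badlands ddd",
  "aaabad",
  "bad_id", // underscore counts as a word character, so \b does not split here
  "bad-id", // hyphen is not a word character, so "bad" is a whole word here
];

for (const s of samples) {
  console.log(JSON.stringify(s), passesFilter(s));
}
```

One caveat with this form: in JavaScript `.` does not match line breaks unless the `s` flag is set, so a field containing a newline would fail the pattern even if it never contains "bad".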

3 Answers


For JavaScript, you can put your word directly in the regex and test for a match. \b stands for a word boundary, meaning no word character (letter, digit, or underscore) is attached to the word at that position:

/\bbad\b/i.test("Badkamer") // i for case-insensitive
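
Since the question wants fields containing the word to be rejected, the result of the test can simply be negated, as suggested in the comments. A minimal sketch, assuming a hypothetical helper named `isAllowed`:

```javascript
// Whole-word, case-insensitive match for "bad".
// No g flag, so the regex keeps no state between calls to test().
const badWord = /\bbad\b/i;

// Hypothetical helper: a field is allowed when the word is NOT present.
function isAllowed(field) {
  return !badWord.test(field);
}

console.log(isAllowed("Bad id"));   // the word is present, so the field is rejected
console.log(isAllowed("Badkamer")); // "Bad" is only a prefix here, so the field is kept
```

Keeping the negation outside the regex leaves the pattern itself short and easy to read.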
Niels

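You may try this regex, which matches the entire string when it contains "bad" as a whole word (use it with the case-insensitive flag):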

^(.*?(\bbad\b)[^$]*)$
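
One detail worth checking before relying on this pattern: inside a character class, `$` is a literal dollar sign, so `[^$]*` means "any characters except `$`" and is not an end-of-string anchor. The JavaScript sketch below compares the pattern with a plain `\bbad\b` test; the sample "bad $5" is a made-up string chosen to show the difference.

```javascript
// The pattern from this answer, with the i flag added.
const anchored = /^(.*?(\bbad\b)[^$]*)$/i;
// A plain whole-word test, for comparison.
const plain = /\bbad\b/i;

const samples = ["user has bAd id", "xxx Badlands ddd", "bad $5"];

for (const s of samples) {
  console.log(JSON.stringify(s), anchored.test(s), plain.test(s));
}
```

The two patterns agree on the first two strings and disagree on "bad $5", where the literal dollar sign after the word stops `[^$]*` from reaching the end of the string.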


Rahul Tripathi

I think the easiest way to do this would be to split the string into words and then check each word for a match. It could be done with a function like this:

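    private bool containsWord(string searchstring, string matchstring)
    {
        bool hasWord = false;

        // Split on spaces only; punctuation stays attached to the neighboring word.
        string[] words = searchstring.Split(new Char[] { ' ' });
        foreach (string word in words)
        {
            // Compare lowercase copies so the check ignores case.
            if (word.ToLower() == matchstring.ToLower())
                hasWord = true;
        }
        return hasWord;
    }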

The code converts everything to lowercase to ignore any case mismatches. I think you can also use RegEx for this:

    // Requires: using System.Text.RegularExpressions;
    static bool ExactMatch(string input, string match)
    {
        return Regex.IsMatch(input.ToLower(), string.Format(@"\b{0}\b", Regex.Escape(match.ToLower())));
    }

\b is a word-boundary anchor: it matches the position between a word character and a non-word character without consuming any input. These examples are in C#, since you didn't specify the language.
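
For comparison with the JavaScript used elsewhere in this thread, here is a rough counterpart of the split-on-spaces approach next to a `\b` regex. This is only a sketch: the function names are hypothetical, and the word "bad" is hard-coded in the regex version rather than escaped from an argument. It also shows one place where the two approaches differ, namely punctuation attached to the word.

```javascript
// Split-based check: compare each space-separated token, ignoring case.
function containsWordBySplit(searchString, matchString) {
  const target = matchString.toLowerCase();
  return searchString.toLowerCase().split(" ").includes(target);
}

// Regex-based check: whole word "bad" using word boundaries, ignoring case.
function containsBadByRegex(searchString) {
  return /\bbad\b/i.test(searchString);
}

const samples = ["user has bAd id", "xxx Badlands ddd", "this is bad, really"];

for (const s of samples) {
  console.log(JSON.stringify(s), containsWordBySplit(s, "bad"), containsBadByRegex(s));
}
```

On the last sample the split version sees the token "bad," (with the comma) and reports no match, while the `\b` version finds the word, so the regex is the more forgiving choice when fields contain punctuation.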

PAUL DUFRESNE