the
cat
sat
on
the
mat

Assuming those are separate entries, what regular expression would exclude a specific character, in this case "a", from anywhere at all in the thing you are searching?

So the hits you would get back are "the", "on", "the".


Or, if it were a word, as in:

I like chocolate
bananas
chocolate cake

I would like only "bananas" to show up as a hit, by excluding the word "chocolate" anywhere.

Patrick Artner
  • sorry for formatting. stackoverflow has gotten rid of where i started each new word / string as paragraph. not my fault. – youknownothingjonsnow Jan 07 '18 at 17:29
  • hey patrick, how did you do that? thanks – youknownothingjonsnow Jan 07 '18 at 17:34
  • It's called "code blocks". Put an empty line before the code and indent it by 4 spaces, OR select it all and click the **{}** button. Adding an empty line between things often helps too. – Patrick Artner Jan 07 '18 at 17:39
  • @youknownothingjonsnow: https://stackoverflow.com/editing-help – Ronan Boiteau Jan 07 '18 at 17:39
  • The first one you could do with `^[^a]+$` and the multiline flag. The latter is trickier and depends on the language used: regex syntax has _flavours_ (js, c#, php, ...). Best go over to one of the online regex testers. I use regexr.com, but there are lots to experiment with; google them. – Patrick Artner Jan 07 '18 at 17:41
  • Patrick, your answer to my first one, am I to understand it in natural language as "all entries that start and end without any requirement but without any multiples of a in the middle"? – youknownothingjonsnow Jan 07 '18 at 17:45
  • What programming language do you use? – Jan Jan 07 '18 at 18:33
  • hi Jan. I don't know. I am on a 1 year conversion degree for computer science and we have been introduced to regex without reference to any specific programming languages. – youknownothingjonsnow Jan 07 '18 at 18:59
  • You should mention this or use a tag like [tag:language-agnostic]. Unfortunately, the answers you have received so far are specific to a group of regex dialects which is far removed from the computer science concept of regular languages. You might want to accept the duplicate I proposed just to buy time to ask a properly-scoped question before more enthusiastic newbies rush in with virtually identical answers. – tripleee Jan 07 '18 at 19:50
  • (... Though [stevendesu's comment](https://stackoverflow.com/questions/406230/regular-expression-to-match-a-line-that-doesnt-contain-a-word#comment9209422_406230) on that question is basically the answer for traditional / computer-science-y regular expressions.) – tripleee Jan 07 '18 at 19:53
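
The character-class approach from the comments can be sketched as follows. Python is an assumption here, since the question does not name a language, and the variable names are only illustrative. One detail worth noting: `[^a]` also matches a newline, so when the entries are joined into one multi-line string the class is written as `[^a\n]` to keep each match on a single line.

```python
import re

entries = ["the", "cat", "sat", "on", "the", "mat"]

# Test each entry on its own: fullmatch requires the whole entry
# to consist of characters other than "a".
no_a = [entry for entry in entries if re.fullmatch(r"[^a]+", entry)]
print(no_a)  # ['the', 'on', 'the']

# Same idea on one multi-line string. "\n" is excluded from the class
# so that a match cannot run across a line break.
text = "\n".join(entries)
per_line = re.findall(r"^[^a\n]+$", text, flags=re.MULTILINE)
print(per_line)  # ['the', 'on', 'the']
```

This form uses no lookahead, so it stays within what traditional regular expressions can express.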

2 Answers

19

What you need is a negative lookahead for the blacklisted word or character.

The following regex does what you are expecting.

Regex: ^(?!.*a).*$

Explanation:

`(?!.*a)` lets you look ahead and discards the match if the blacklisted character is present anywhere in the string.

`.*` simply matches the whole string from beginning to end if the blacklisted character is not present.
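
A minimal sketch of this pattern in use, assuming Python's `re` module (the answer itself does not name a language) and treating each entry from the question as its own string:

```python
import re

entries = ["the", "cat", "sat", "on", "the", "mat"]

# ^(?!.*a) rejects the entry if an "a" occurs anywhere after the start;
# .*$ then consumes the whole entry when the lookahead succeeds.
pattern = re.compile(r"^(?!.*a).*$")

hits = [entry for entry in entries if pattern.match(entry)]
print(hits)  # ['the', 'on', 'the']
```

Because `.*` accepts zero characters, an empty entry would also count as a hit with this pattern.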



To blacklist a word, put that word in the negative lookahead assertion instead.

Regex: ^(?!.*chocolate).*$


This will also discard the match if chocolate is part of a longer string, such as blackchocolate, hotchocolate, etc.
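
As a rough illustration, again assuming Python, the word version can be tried on the lines from the question. The extra `hotchocolate` entry is added here only to show the substring behaviour described above:

```python
import re

lines = ["I like chocolate", "bananas", "chocolate cake", "hotchocolate"]

# The lookahead fails as soon as "chocolate" appears anywhere in the line,
# including inside a longer word.
pattern = re.compile(r"^(?!.*chocolate).*$")

hits = [line for line in lines if pattern.match(line)]
print(hits)  # ['bananas']
```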


For strict matching of the word, add word boundaries.

Regex: ^(?!.*\bchocolate\b).*$

By adding `\b` on both ends, the lookahead looks for chocolate strictly as a whole word and discards the match only if that whole word is present.
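
The effect of the word boundaries can be checked with a small sketch, assuming Python and the same illustrative list of lines (`hotchocolate` is an added example, not from the question):

```python
import re

lines = ["I like chocolate", "bananas", "chocolate cake", "hotchocolate"]

# \b requires a word boundary on both sides, so "chocolate" embedded in
# "hotchocolate" no longer triggers the negative lookahead.
pattern = re.compile(r"^(?!.*\bchocolate\b).*$")

hits = [line for line in lines if pattern.match(line)]
print(hits)  # ['bananas', 'hotchocolate']
```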


Rahul
8

Your question is phrased a bit vaguely, but in the end you have a couple of choices.

The first regex-solution (disallow characters in a word):

\b(?:(?!a)\w)+\b
# word boundary, then a negative lookahead that disallows "a"
# before each word character, repeated for as many word characters as possible,
# and finally another word boundary
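
A short sketch of this first pattern, assuming Python's `re.findall` and the sentence from the question as a single string:

```python
import re

sentence = "the cat sat on the mat"

# Each character must be a word character that is not "a"; the word
# boundaries on both sides force the match to cover a complete word.
words_without_a = re.findall(r"\b(?:(?!a)\w)+\b", sentence)
print(words_without_a)  # ['the', 'on', 'the']
```

Unlike the line-anchored patterns, this one picks individual words out of running text.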

See a demo on regex101.com.


The second regex-solution (disallow a complete word):

^(?!.*chocolate).+
# match the start of the line, then a negative lookahead scans the rest of the line for "chocolate";
# if it is absent, .+ matches the line
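
Applied to one multi-line string, this second pattern might look like the following sketch. Python and the `re.MULTILINE` flag are assumptions; other flavours have an equivalent multiline option.

```python
import re

text = "I like chocolate\nbananas\nchocolate cake"

# With MULTILINE, ^ matches at the start of every line. "." does not
# match a newline, so the lookahead only inspects the current line.
hits = re.findall(r"^(?!.*chocolate).+", text, flags=re.MULTILINE)
print(hits)  # ['bananas']
```

Since `.+` needs at least one character, empty lines are not reported as hits.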

See another demo on regex101.com.


Programmatically:

Assuming Python, transferable to other languages as well:

sentence = "the cat sat on the mat"
words_without_a = [word for word in sentence.split() if "a" not in word]
print(words_without_a)
# ['the', 'on', 'the']
Jan
  • thank you! may I ask what the purpose of the ?: is? in the first example, adding it or removing it will still return the same 3 hits. but it changes the colour of captured groups. – youknownothingjonsnow Jan 07 '18 at 19:08
  • @youknownothingjonsnow: `(?:...)` is a non-capturing group, meaning it has the functionality of a group but does not capture anything separately. – Jan Jan 07 '18 at 19:10
  • @youknownothingjonsnow: If an answer helped you, please consider upvoting/accepting it (green tick on the left). – Jan Jan 07 '18 at 19:49
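
The difference between `(?:...)` and `(...)` discussed in the comments can be seen in a small sketch, assuming Python. Both patterns find the same three words; only the capturing variant records a group.

```python
import re

sentence = "the cat sat on the mat"

non_capturing = re.compile(r"\b(?:(?!a)\w)+\b")
capturing = re.compile(r"\b((?!a)\w)+\b")

# The overall matches are identical for both patterns.
hits_nc = [m.group(0) for m in non_capturing.finditer(sentence)]
hits_c = [m.group(0) for m in capturing.finditer(sentence)]
print(hits_nc == hits_c)  # True

# Only the second pattern defines a capturing group.
print(non_capturing.groups)  # 0
print(capturing.groups)      # 1

# A repeated capturing group keeps its last iteration: for "the" that is "e".
first = capturing.search(sentence)
print(first.group(1))  # e
```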