I have a very long text, and a long list of words I want to find in this text.

Right now, to search those words, I check "regular expressions" and then find "word1|word2|word3|word4..." The problem with this is that if one of the words is "eat," then every word that contains "eat" is also highlighted. How can I prevent that?

WeekSky
  • Possible duplicate of [Regex match entire words only](https://stackoverflow.com/questions/1751301/regex-match-entire-words-only) – Joe Mar 30 '19 at 04:28

1 Answer

You can use word-boundary anchors so the pattern only matches whole words. (Assuming you are using something that supports PCRE.)

/\b(word1|word2|word3...)\b/

The \b matches at a "word boundary". From Perl's regular expression man page (man perlre):

A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W".
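
To see the difference the boundaries make, here is a minimal sketch in Python, whose `re` module also supports `\b`. The word list and sample text are made up for illustration, and the `?:` non-capturing group is just a convenience for `findall`; the question itself is about a text editor, so treat this as a demonstration of the idea rather than the exact steps for your tool.

```python
import re

# Hypothetical word list and sample text, chosen only for illustration.
words = ["eat", "run"]
text = "I eat while cats repeat a great run, running on."

# re.escape guards against words that contain regex metacharacters.
alternation = "|".join(re.escape(w) for w in words)

# Without boundaries: also matches inside "repeat", "great", "running".
plain = re.compile(alternation)

# With \b on both sides: only whole words match.
bounded = re.compile(r"\b(?:" + alternation + r")\b")

plain_matches = plain.findall(text)
bounded_matches = bounded.findall(text)

print(plain_matches)    # ['eat', 'eat', 'eat', 'run', 'run']
print(bounded_matches)  # ['eat', 'run']
```

The comma after "run" counts as a `\W` character, so the boundary still matches there.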

Ashton Wiersdorf
  • I get the error "Bad regex: Error while compiling regular expression" (Never mind, now what happens is that nothing is found, while results should definitely show up.) – WeekSky Mar 30 '19 at 04:34
  • What tool are you using? `grep`? Is this inside a program? What language? What libraries do you have access to? – Ashton Wiersdorf Mar 30 '19 at 04:38
  • You might also want to try escaping the backslashes if your regular expression is inside a string: `"\\b(word1|word2)\\b"`. Again, this depends on what tool/language you are using. – Ashton Wiersdorf Mar 30 '19 at 04:39
  • I am using Notepad++ on Windows and Geany on Linux. It doesn't work on either of those. – WeekSky Mar 30 '19 at 04:42
  • Try taking the forward slashes (`/`) off of the front and the back. Just play with it. (forward slashes are often used to delimit regular expressions in many languages—might not be what your editor is using) – Ashton Wiersdorf Mar 30 '19 at 04:50
  • Also, for your future questions, details like what language/system you are doing something in are extremely important. As you might be able to see, this question has been flagged as duplicate—it's too broad. You could add the caveat that you're doing this in $EDITOR_NAME, and that will help you get better answers quicker. Just a tip. :) – Ashton Wiersdorf Mar 30 '19 at 04:54
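
The two suggestions in the comments above (escaping the backslashes, and dropping the `/` delimiters) can be illustrated with a small Python sketch. This only shows how one language treats these cases; the sample text is invented, and an editor's search box may behave differently.

```python
import re

# Invented sample text, for illustration only.
text = "word1 and sword1 and word2"

# In an ordinary Python string literal, "\b" is a backspace character,
# so the regex engine never sees a word-boundary token.
unescaped = "\b(word1|word2)\b"

# Doubling the backslashes (or using a raw string) passes \b through intact.
doubled = "\\b(word1|word2)\\b"
raw = r"\b(word1|word2)\b"

# Python's re module has no delimiter syntax, so slashes are literal characters.
with_slashes = "/" + raw + "/"

matches_unescaped = re.findall(unescaped, text)
matches_doubled = re.findall(doubled, text)
matches_slashes = re.findall(with_slashes, text)

print(doubled == raw)      # True
print(matches_unescaped)   # []
print(matches_doubled)     # ['word1', 'word2']
print(matches_slashes)     # []
```

Both failure cases return nothing rather than raising an error, which is consistent with the "nothing is found" symptom, although the cause in a given editor may be something else.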