
I have a regex that matches a set of words in a string; however, it also matches parts of longer words.

For example, city is matched because it contains it. However, I want it to be matched only when those are the only characters between whitespace. So it or he would match, but not city or where.

Here is the regex (pretty basic and simple): they|he|she|her|him|them|it

How can I get it to match these only when they appear as whole words?

wordSmith
*"I want only the string it to be matched it if it the only characters between whitespace"* Add whitespace to your expression, then? – Felix Kling Jun 19 '14 at 19:27

2 Answers


Use word boundaries to denote the beginning and ending of a word.

http://www.regular-expressions.info/wordboundaries.html

So your regex would become something on the order of:

\b(they|he|she|her|him|them|it)\b
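To see the difference in practice, here is a minimal sketch in Python (an assumption, since the question does not say which language is in use; the sample sentence is made up for illustration). It runs the original pattern and the word-boundary version against the same text.

```python
import re

# Original pattern from the question: also matches inside longer words.
loose = re.compile(r"they|he|she|her|him|them|it")

# Same alternation wrapped in word boundaries.
strict = re.compile(r"\b(they|he|she|her|him|them|it)\b")

# Hypothetical sample sentence, not taken from the question.
text = "it is in the city where he lives"

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)   # hits inside "the", "city" and "where" as well
print(strict_matches)  # only the standalone words
```

With the boundaries in place, only the standalone it and he are reported; the hits inside the, city and where disappear.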


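Note that \b treats an apostrophe as a non-word character, so this regular expression does not handle words containing apostrophes (e.g. can't, won't) as single words; for instance, the it in it's would still be matched. For a discussion of this, see the following Stack Overflow post: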

How do you use the Java word boundary with apostrophes?
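The apostrophe behaviour can be checked with a short Python sketch (Python and the sample text are assumptions for illustration). The second pattern is one possible workaround using lookarounds that also reject an adjacent apostrophe; it is not taken from the linked post.

```python
import re

words = r"(?:they|he|she|her|him|them|it)"

# Plain word boundaries: the apostrophe counts as a non-word character,
# so a boundary exists between "t" and "'" in "it's".
with_boundaries = re.compile(r"\b" + words + r"\b")

# Assumed workaround: refuse a match when a word character or an
# apostrophe sits directly before or after the candidate word.
apostrophe_aware = re.compile(r"(?<![\w'])" + words + r"(?![\w'])")

text = "it's hers, he'd say"  # hypothetical sample text

boundary_matches = with_boundaries.findall(text)
aware_matches = apostrophe_aware.findall(text)
plain_matches = apostrophe_aware.findall("it was he")

print(boundary_matches)  # "it" and "he" are found inside the contractions
print(aware_matches)     # nothing is matched in the contractions
print(plain_matches)     # standalone words still match
```

Whether it's should count as a match depends on the use case, so treat the second pattern as a starting point rather than a drop-in replacement.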

Willy

Try putting a word boundary before and after each word:

(?:\bthey\b|\bhe\b|\bshe\b|\bher\b|\bhim\b|\bthem\b|\bit\b)

Explanation:

(?:...) # Non-capturing group
\b      # Word boundary (matches between a word character and a non-word character, or at the start/end of the string next to a word character)
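Because \b only looks at word versus non-word characters, punctuation next to a word still counts as a boundary. If the stricter rule from the question is wanted (only whitespace, or the start/end of the string, around the word), whitespace lookarounds are one option. The sketch below is in Python with a made-up sample sentence; both are assumptions for illustration.

```python
import re

words = r"(?:they|he|she|her|him|them|it)"

# Word boundaries: punctuation such as "," counts as a non-word character.
by_boundary = re.compile(r"\b" + words + r"\b")

# Whitespace-delimited: no non-whitespace character may touch the word.
by_whitespace = re.compile(r"(?<!\S)" + words + r"(?!\S)")

text = "he said: it, then it left"  # hypothetical sample sentence

boundary_matches = by_boundary.findall(text)
whitespace_matches = by_whitespace.findall(text)

print(boundary_matches)    # includes the "it" that is followed by a comma
print(whitespace_matches)  # skips the "it" that is followed by a comma
```

The lookarounds (?<!\S) and (?!\S) also succeed at the very start and end of the string, so a leading or trailing word is still found.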


Avinash Raj