How do I build a regex pattern that searches a text T for a search string S?

There are 2 requirements:

  1. S could be made of any character.
  2. S could be anywhere in the string but can't be part of a word.

I know that in order to escape special regex characters I put the search string between \Q and \E as such:

\QMySearch_String\E

How do I prevent partial matches of S inside T?

ben39
  • What do you mean by "can't be part of a word"? Does it mean it must be the whole word? Could you give some positive and negative examples? – walrii Jul 16 '12 at 04:25
  • Have you looked at the solution to http://stackoverflow.com/questions/5091057/how-to-find-a-whole-word-in-a-string-in-java? Does that help you? – vcetinick Jul 16 '12 at 04:30
  • @vcetinick: `\\b` is no good here, since `S` could include any character. – Keppil Jul 16 '12 at 06:09

2 Answers

You can do it like this, if "can't be part of a word" is interpreted as "preceded by start-of-string or whitespace and followed by end-of-string or whitespace":

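import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "3894$75\\/^()";
String text = "fdsfsd3894$75\\/^()dasdasd 22348 3894$75\\/^()";
// Pattern.quote(s) wraps s in \Q...\E and stays correct even if s itself contains \E.
Matcher m = Pattern.compile("(?<=^|\\s)" + Pattern.quote(s) + "(?=\\s|$)").matcher(text);
while (m.find()) {
    System.out.println("Found match! :'" + m.group() + "'");
}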

This prints only one match, the standalone occurrence at the end of the text:

Found match! :'3894$75\/^()'
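
Because S may contain any character, it may also contain a literal `\E`, which would end a hand-written `\Q...\E` quote too early. The following is a minimal sketch of how that case could be handled, assuming the standard `Pattern.quote` is acceptable here. The class name `QuoteDemo` and the helper `countWholeMatches` are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: counts occurrences of s in text that are
    // delimited by whitespace or by the start/end of the text.
    static int countWholeMatches(String s, String text) {
        Pattern p = Pattern.compile("(?<=^|\\s)" + Pattern.quote(s) + "(?=\\s|$)");
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The search string is the six characters a \ E b . c
        String s = "a\\Eb.c";
        // Standalone occurrence, surrounded by spaces.
        System.out.println(countWholeMatches(s, "x a\\Eb.c y"));
        // Occurrence glued to a preceding character.
        System.out.println(countWholeMatches(s, "xa\\Eb.c y"));
    }
}
```

The first call should count one match and the second none, since there the search string is attached to the preceding `x`.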

Keppil

I think what you're trying to find can be easily solved with lookaheads and lookbehinds. Take a look at this for a good explanation.

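The trick is a double negative: use a negative lookbehind and a negative lookahead for a non-whitespace character (\S), so that S must not be touching one on either side. You don't want to require an actual whitespace character instead, because S might be at the very start or end of the string, where there is no character to match. Like so: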

(?<!\S)S(?!\S)
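
In Java, the pattern above could be assembled roughly as follows. This is a sketch under assumptions, not a definitive implementation: the class name `WholeTokenSearch` and the helper `findStandalone` are hypothetical, and `Pattern.quote` stands in for the literal `S` so that special characters in the search string are matched verbatim.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeTokenSearch {
    // Hypothetical helper: returns the start offset of every occurrence of s
    // in text that has no non-whitespace character directly before or after it.
    static List<Integer> findStandalone(String s, String text) {
        Pattern p = Pattern.compile("(?<!\\S)" + Pattern.quote(s) + "(?!\\S)");
        Matcher m = p.matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String s = "3894$75\\/^()";
        String text = "fdsfsd3894$75\\/^()dasdasd 22348 3894$75\\/^()";
        // Only the trailing, space-delimited occurrence should be reported.
        for (int start : findStandalone(s, text)) {
            System.out.println("Match at offset " + start + ": " + text.substring(start, start + s.length()));
        }
    }
}
```

The negative lookarounds succeed at the string boundaries without any special-casing, which is why no `^` or `$` alternative is needed here.
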
VolatileRig