I'm stuck trying to work out what the regex pattern is for phrases that contain variable words. For example, if I'm scanning through a paragraph, I want to be able to extract all phrases that match the pattern "_ colored _". The two blank spots can be anything, so both "red colored truck" and "blue colored bike" would match the regex and be extracted. It would be very much appreciated if someone could help me out, thanks!
2 Answers
3
A pattern like this should work:
\w+\s+colored\s+\w+
This matches any sequence of one or more word characters, followed by one or more whitespace characters, followed by the literal sequence colored, followed by one or more whitespace characters, followed by one or more word characters.
If you want to easily extract the two words on either side, you may want to place them in capture groups, like this:
(\w+)\s+colored\s+(\w+)
If you want to find more phrases than just those which contain the word colored, you can use an alternation, like this:
(\w+)\s+(colored|flavored|scented)\s+(\w+)
This will match strings like "blue colored bike", "cherry flavored vodka", and "bacon scented candle".
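As a rough sketch of how the capture groups could be read back in Java, something like the following might work. The class name DescribedItems and the helper firstMatchParts are made up for illustration, and the backslashes are doubled because the pattern sits in a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class DescribedItems {
    // Three capture groups: word before, the alternation, word after.
    private static final Pattern PHRASE =
            Pattern.compile("(\\w+)\\s+(colored|flavored|scented)\\s+(\\w+)");

    // Returns the three captured parts of the first match, or null if nothing matches.
    static String[] firstMatchParts(String text) {
        Matcher m = PHRASE.matcher(text);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; groups 1..3 are the parenthesized parts.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = firstMatchParts("She lit a bacon scented candle.");
        if (parts != null) {
            System.out.println(String.join(" / ", parts));
        }
    }
}
```

Group numbers follow the order of the opening parentheses, so the middle word is group 2 here.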
Also, because this is Java, don't forget to escape the \ characters in your string literal:
Pattern pattern = Pattern.compile("\\w+\\s+colored\\s+\\w+");
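To pull every matching phrase out of a paragraph, one option is to loop over Matcher.find(). This is only a minimal sketch: the class name ColoredPhrases, the method findPhrases, and the sample sentence are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class ColoredPhrases {
    // Same pattern as above, with each backslash doubled for the string literal.
    private static final Pattern PHRASE = Pattern.compile("\\w+\\s+colored\\s+\\w+");

    // Collects every non-overlapping "<word> colored <word>" phrase in the text.
    static List<String> findPhrases(String text) {
        List<String> phrases = new ArrayList<>();
        Matcher matcher = PHRASE.matcher(text);
        while (matcher.find()) {
            phrases.add(matcher.group()); // the full matched phrase
        }
        return phrases;
    }

    public static void main(String[] args) {
        String paragraph = "I saw a red colored truck parked next to a blue colored bike.";
        for (String phrase : findPhrases(paragraph)) {
            System.out.println(phrase);
        }
    }
}
```

Each call to find() resumes after the previous match, so matches never overlap.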

p.s.w.g
0
This should work for you.
Pattern samplePattern = Pattern.compile("[A-Za-z0-9._%-]+\\s+colored\\s+[A-Za-z0-9._%-]+");

Ashish