114

I'm new to regex. I've been going through a rake of tutorials, but I haven't found one that applies to what I want to do.

I want to search for a string and return everything that follows it, but not the search string itself.

e.g. "Some lame sentence that is awesome"

search for "sentence"

return "that is awesome"

Any help would be much appreciated.

This is my regex so far:

sentence(.*) 

but it returns: sentence that is awesome

Pattern pattern = Pattern.compile("sentence(.*)");

Matcher matcher = pattern.matcher("some lame sentence that is awesome");

boolean found = false;
while (matcher.find())
{
    System.out.println("I found the text: " + matcher.group().toString());
    found = true;
}
if (!found)
{
    System.out.println("I didn't find the text");
}
Manos Nikolaidis
Scott

5 Answers

198

You can do this with "just the regular expression" as you asked for in a comment:

(?<=sentence).*

(?<=sentence) is a positive lookbehind assertion. This matches at a certain position in the string, namely at a position right after the text sentence without making that text itself part of the match. Consequently, (?<=sentence).* will match any text after sentence.

This is quite a nice feature of regex. However, in Java this will only work for subexpressions of bounded length, i.e. (?<=sentence|word|(foo){1,4}) is legal, but (?<=sentence\s*) isn't.
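As a minimal sketch of how the lookbehind could be used from Java: the class name LookbehindDemo and the helper textAfterSentence are made up for illustration and are not part of the question's code. Note that the match keeps the space right after "sentence", because the lookbehind only asserts a position and consumes nothing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {
    // Hypothetical helper: returns everything after the first "sentence",
    // or null when the input does not contain that word.
    static String textAfterSentence(String input) {
        // (?<=sentence) asserts that "sentence" sits immediately before the match start.
        Matcher m = Pattern.compile("(?<=sentence).*").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Brackets make the leading space visible.
        System.out.println("[" + textAfterSentence("Some lame sentence that is awesome") + "]");
        // prints "[ that is awesome]"
    }
}
```

If the leading space is unwanted, calling trim() on the result is probably the simplest fix. Also keep in mind that `.` does not match line terminators by default, so on multi-line input the match stops at the end of the line.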

Tim Pietzcker
  • You state that it should not include the positive lookbehind assertion. So I assume that ".*(?<=sentence)" should return everything up to, but not including "sentence". But it doesn't, it returns "sentence" as well. What am I missing? – JJJones_3860 Jul 03 '18 at 22:23
  • @user2184214: That's because it's a look*behind* assertion. `.*` matches any text, and then `(?<=...)` looks backwards for the word `sentence`, asserting in this case that the match ends with that word. If you want to stop before that word, you need to look *ahead*: `.*(?=sentence)` will match any text that is followed by `sentence`. – Tim Pietzcker Jul 04 '18 at 04:40
  • For anyone looking for a way to match any text after one or another string, regexps like `(?<=sentence1|sentence2).*`, `(?:(?<=sentence1)|(?<=sentence2)).*` or even `(?:sentence1|sentence2)(.*)` might work. – Wiktor Stribiżew Apr 23 '21 at 08:22
  • Great thanks! I was using your answer to find everything after a plus sign. So just for another example: `(?<=\+).*` – arnonuem Jul 16 '21 at 07:24
21

Your regex "sentence(.*)" is right. To retrieve the contents of the group in parentheses, you would call:

Pattern p = Pattern.compile( "sentence(.*)" );
Matcher m = p.matcher( "some lame sentence that is awesome" );
if ( m.find() ) {
   String s = m.group(1); // " that is awesome"
}

Note the use of m.find() in this case (it attempts to find a match anywhere in the string) and not m.matches() (which must match the entire string, so it would fail because of the prefix "some lame"; in that case the regex would need to be ".*sentence(.*)").
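To make the find() versus matches() difference concrete, here is a small sketch. The class name FindVsMatches and its two helpers are invented for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    // find() looks for the pattern anywhere in the input.
    static String viaFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // matches() succeeds only if the pattern covers the whole input.
    static String viaMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "some lame sentence that is awesome";
        System.out.println(viaFind("sentence(.*)", text));      // " that is awesome"
        System.out.println(viaMatches("sentence(.*)", text));   // null: the prefix "some lame " is not covered
        System.out.println(viaMatches(".*sentence(.*)", text)); // " that is awesome"
    }
}
```

In all successful cases group(1) still starts with the space that follows "sentence".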

st.never
  • Thanks, but what if I just want it to return "that is awesome"? – Scott Feb 15 '11 at 17:06
  • Thanks man, this worked great. I was hoping there was a way to do this with just the regular expression; if I can't find a way to do it that way, this will work as well. – Scott Feb 15 '11 at 17:10
  • Adding a "(.*)" at the end of the regexp is likely a bad idea for performance... – eregon Oct 15 '11 at 17:23
10

If the Matcher was created from str, then after a successful find() you can get the part following the match with

str.substring(matcher.end())

Sample Code:

final String str = "Some lame sentence that is awesome";
final Matcher matcher = Pattern.compile("sentence").matcher(str);
if(matcher.find()){
    System.out.println(str.substring(matcher.end()).trim());
}

Output:

that is awesome

Sean Patrick Floyd
2

You just need to use "group(1)" instead of "group()" in the following line, and the output will be what you expected:

System.out.println("I found the text: " + matcher.group(1).toString());
Perception
2

You need to use the group(int) method of your matcher: group(0) is the entire match, and group(1) is the first group you marked with parentheses. In the example you specify, group(1) is what comes after "sentence".
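A short sketch of the two group numbers side by side, using the regex from the question. The class GroupDemo and the method groups are hypothetical names chosen for the example, and the trim() call is an addition to drop the space after "sentence".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: returns { whole match, trimmed first group },
    // or an empty array when "sentence" does not occur in the input.
    static String[] groups(String input) {
        Matcher m = Pattern.compile("sentence(.*)").matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // group(0) is the same as group(): the entire matched text.
        // group(1) is only what the first pair of parentheses captured.
        return new String[] { m.group(0), m.group(1).trim() };
    }

    public static void main(String[] args) {
        String[] g = groups("Some lame sentence that is awesome");
        System.out.println(g[0]); // sentence that is awesome
        System.out.println(g[1]); // that is awesome
    }
}
```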

Rami C