0

I have the following string:

"hello this.is.a.test(MainActivity.java:47)"

and I want to extract MainActivity.java:47 (everything between '(' and ')', and only the first occurrence).

I tried with regex but it seems that I am doing something wrong.

Thanks

Jon Romero
  • Possible duplicates of http://stackoverflow.com/questions/4749549/extract-substring-in-java-using-regex and http://stackoverflow.com/questions/4662215/java-how-to-extract-a-substring-using-regex – Saurabh Gokhale Mar 17 '11 at 14:28
  • Do you specifically _need_ to use a regex, or is any working method sufficient? – Pops Mar 17 '11 at 14:30
  • It looks like you're trying to parse a stack trace. Does it come from a text file, or do you have access to the Exception object being created? – Soronthar Mar 17 '11 at 14:42

6 Answers

4

You can do it yourself:

int pos1 = str.indexOf('(') + 1;
int pos2 = str.indexOf(')', pos1);

String result = str.substring(pos1, pos2);

Alternatively, use Apache Commons Lang, whose StringUtils class provides substringBetween().
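The snippet above assumes both parentheses are present; if ')' is missing, indexOf returns -1 and substring throws. A minimal sketch of a guarded variant follows. The class name ParenExtract and the method firstBetweenParens are made up for illustration and are not part of any library.

```java
import java.util.Optional;

class ParenExtract {
    // Hypothetical helper: returns the text between the first '(' and the
    // next ')' after it, or an empty Optional if no such pair exists.
    static Optional<String> firstBetweenParens(String str) {
        int open = str.indexOf('(');
        if (open < 0) {
            return Optional.empty();          // no opening parenthesis
        }
        int close = str.indexOf(')', open + 1);
        if (close < 0) {
            return Optional.empty();          // no closing parenthesis after it
        }
        return Optional.of(str.substring(open + 1, close));
    }

    public static void main(String[] args) {
        String str = "hello this.is.a.test(MainActivity.java:47)";
        // prints: MainActivity.java:47
        System.out.println(firstBetweenParens(str).orElse("<no match>"));
    }
}
```

Returning an Optional is just one way to signal "nothing found"; returning null or throwing would work equally well, depending on the caller.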

Aaron Digulla
1

I think a regex is a bit of overkill here. I would use something like this:

String input = "hello this.is.a.test(MainActivity.java:47)";
String output = input.substring(input.lastIndexOf("(") + 1, input.lastIndexOf(")"));
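Note that lastIndexOf picks the last pair of parentheses, which only coincides with the first occurrence when the string contains a single pair. The following sketch contrasts the two; the class and method names (FirstVsLast, betweenFirst, betweenLast) are illustrative, and both methods assume the parentheses are present.

```java
class FirstVsLast {
    // Text between the first '(' and the next ')' after it.
    static String betweenFirst(String s) {
        int open = s.indexOf('(');
        return s.substring(open + 1, s.indexOf(')', open + 1));
    }

    // Text between the last '(' and the last ')'.
    static String betweenLast(String s) {
        return s.substring(s.lastIndexOf('(') + 1, s.lastIndexOf(')'));
    }

    public static void main(String[] args) {
        String input = "hello this.is.a.test(MainActivity.java:47) (and some more text)";
        System.out.println(betweenFirst(input)); // prints: MainActivity.java:47
        System.out.println(betweenLast(input));  // prints: and some more text
    }
}
```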
1

This should work:

^[^\\(]*\\(([^\\)]+)\\)

The result is in the first group.
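For completeness, here is a sketch of how that pattern might be applied with java.util.regex. The class name RegexExtract and the method firstGroup are made up for this example; find() is used so the pattern does not need to cover the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexExtract {
    // Backslashes are doubled only because this is a Java string literal;
    // the regex itself is ^[^\(]*\(([^\)]+)\)
    private static final Pattern FIRST_PARENS =
            Pattern.compile("^[^\\(]*\\(([^\\)]+)\\)");

    // Hypothetical helper: returns the content of the first parenthesized
    // group, or null if the pattern is not found.
    static String firstGroup(String input) {
        Matcher m = FIRST_PARENS.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // prints: MainActivity.java:47
        System.out.println(firstGroup("hello this.is.a.test(MainActivity.java:47)"));
    }
}
```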

Mihai Toader
  • 1
    Sure you *could* do it with a regex, but when it looks like that, why bother? – Daniel DiPaolo Mar 17 '11 at 14:30
  • 1
    Actually, the regex can be much simpler :). I just complicated it to make sure it works as intended :). This will work too: `\\((.+?)\\)` – Mihai Toader Mar 17 '11 at 14:33
  • Two reasons to bother with regexes: first, the "lastIndexOf" method will iterate over the chars of the string twice (once for the "(", once for the ")"), while a good regex implementation will iterate over the chars just once (twice in the worst case, depending on the output). Second, you may have unbalanced (). In the special case of the OP they are balanced, so my second point is moot. – Soronthar Mar 17 '11 at 14:37
  • ... and the doubled backslashes are not part of the regexp, only necessary for the Java String literal. – Paŭlo Ebermann Mar 17 '11 at 14:45
1

Another approach to your question:


String str = "hello this.is.a.test(MainActivity.java:47) another.test(MyClass.java:12)";
Pattern p = Pattern.compile("[a-z][\\w]+\\.java:\\d+", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);

if(m.find()) {
    System.out.println(m.group());
}

The regex explained:

[a-z][\w]+\.java:\d+

[a-z] > Check that we start with a letter ...
[\w]+ > ... followed by one or more letters, digits or underscores ...
\.java: > ... followed exactly by the string ".java:" ...
\d+ > ... ending with one or more digits

Stephan
0

Pseudo-code:

int p1 = location of '('
int p2 = location of ')', starting the search from p1
String s = extract string from p1 + 1 to p2 (the + 1 skips the '(' itself)

String.indexOf() and String.substring() are your friends.

Bombe
0

Try this:

String input = "hello this.is.a.test(MainActivity.java:47) (and some more text)";
Pattern p = Pattern.compile("[^\\)]*\\(([^\\)]*)\\).*");
Matcher m = p.matcher( input );
if(m.matches()) {
  System.out.println(m.group( 1 )); //output: MainActivity.java:47
}

This also finds the first occurrence of text between ( and ) if there is more than one.

Note that Matcher.matches() requires the regex to match the entire input string, as if the expression were wrapped in ^ and $. Thus [^\\)]* at the beginning and .* at the end are necessary here. With Matcher.find(), which searches for a match anywhere in the input, they could be dropped.
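The difference between the two Matcher methods can be seen in a small sketch. The class name MatchesVsFind and its helper methods are hypothetical; the pattern is the bare group from the answer without the leading and trailing parts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Only the parenthesized group, with nothing around it.
    private static final Pattern BARE = Pattern.compile("\\(([^\\)]*)\\)");

    // true only if the whole input is a single "(...)" group
    static boolean matchesWhole(String input) {
        return BARE.matcher(input).matches();
    }

    // content of the first "(...)" group anywhere in the input, or null
    static String findFirst(String input) {
        Matcher m = BARE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "hello this.is.a.test(MainActivity.java:47) (and some more text)";
        System.out.println(matchesWhole(input)); // prints: false
        System.out.println(findFirst(input));    // prints: MainActivity.java:47
    }
}
```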

Thomas