So I have a String of text lets say "hello world () (foo bar) (foo bar 2 (this looks cozy)) (foo bar 3..."
Is there a regex pattern I could use that will get the parentheses and include any parentheses inside them to nth depth.
So the matches would be "()"
, "(foo bar)"
, "(foo bar 2 (this looks cozy))"
, ...
?
Asked
Active
Viewed 121 times
-1

NotAFlyingGoose
- 111
- 1
- 10
-
In Java regex engine doesn't support recursion so regex may not be best tool here. – Pshemo Aug 09 '20 at 22:19
-
@Pshemo well is there way to do something similar? – NotAFlyingGoose Aug 09 '20 at 22:25
-
I'd use a loop over the string, not regex – Ecto Aug 09 '20 at 22:29
-
2Thats a classical example for a language that is **not regular** and thus cannot be parsed by regular expressions (see pumping-lemma). Its context-free. – Polygnome Aug 09 '20 at 22:59
-
Summary of comments: you need a DPDA, i.e. a parser with an explicit or implicit stack, not a regular expression. – user207421 Aug 09 '20 at 23:26
1 Answers
1
Regex flavor in Java doesn't support recursion like some other flavors do. Instead you can write your own method which will build strings from characters only if they are:
(
)
- inside parenthesis.
To know if currently handled character is inside parenthesis we can create counter which will check parenthesis balance (you can also think of it as counter for nesting level). In short: if we saw more (
than )
then we are inside unclosed (open) parenthesis section, so we should add current character to resulting string.
Using that idea our code can look like:
String str = "hello world () (foo bar) (foo bar 2 (this looks cozy)) (foo bar 3...)";
List<String> result = new ArrayList<>();
int parenthesisNestingLevel = 0;
StringBuilder sb = new StringBuilder();
for (char ch : str.toCharArray()) {
if (ch == '(') {
parenthesisNestingLevel++;
sb.append(ch);
} else if (ch == ')') {
parenthesisNestingLevel--;
sb.append(ch);
if (parenthesisNestingLevel == 0) {
result.add(sb.toString());
sb.delete(0, sb.length());//reset sb
}
} else if (parenthesisNestingLevel > 0) {//we are inside unclosed parenthesis
sb.append(ch);
}
}
result.forEach(System.out::println);
Output:
()
(foo bar)
(foo bar 2 (this looks cozy))
(foo bar 3...)

Pshemo
- 122,468
- 25
- 185
- 269
-
No 'yet' about it. Regular expressions can't support recursion by definition. Chomsky hierarchy 1956. – user207421 Aug 09 '20 at 23:41
-
@MarquisofLorne Thank you for information. Will need to study it since I am not rally familiar with regex history an theories behind it. But now what confuses me (and is main reason why I added "yet") is existence of `(?R)` in some regex flavors. Based on example https://regex101.com/r/NLElp8/1 from https://stackoverflow.com/a/35271017 it looks like it represents recursion. So are those flavors no longer proper regex? – Pshemo Aug 10 '20 at 00:30
-
1That is correct. They are no longer proper regular expressions, by definition. – user207421 Aug 10 '20 at 09:55
-