Assumptions
Based on the question, I assume that brackets are nested at most 2 levels deep and that they are balanced.
I further assume that escaping of [ and ] is not allowed.
I also assume that when brackets are nested, only the first opening [ and the last closing ] of the inner brackets are preserved. Everything else, i.e. the top-level brackets and the remaining inner brackets, is removed.
For example:
only[single] [level] outside[text more [text] some [text]moreeven[more]text[bracketed]] still outside
After replacement, this becomes:
onlysingle level outsidetext more [text some textmoreevenmoretextbracketed] still outside
I make no assumptions beyond those above.
If you can make assumptions about the spacing before and after brackets, you can use the simpler solution by Denomales. Otherwise, my solution below works without such assumptions.
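If you are unsure whether your input satisfies the balance and nesting-depth assumptions above, a quick pre-check can help. The following is only a sketch: the class BracketDepthCheck and the method maxDepth are hypothetical names introduced here for illustration and are not part of the solution itself.

```java
// Hypothetical helper, not part of the original answer: reports the deepest
// bracket nesting found in the input, or -1 if the brackets are unbalanced.
class BracketDepthCheck {
    static int maxDepth(String input) {
        int depth = 0;
        int max = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                depth++;
                max = Math.max(max, depth);
            } else if (c == ']') {
                depth--;
                if (depth < 0) {
                    return -1; // closing bracket with no matching opener
                }
            }
        }
        return depth == 0 ? max : -1; // leftover openers mean unbalanced input
    }

    public static void main(String[] args) {
        System.out.println(maxDepth("only[single] [level]"));      // 1
        System.out.println(maxDepth("outside[text more [text]]")); // 2
        System.out.println(maxDepth("a[b[c[d]]]"));                // 3
        System.out.println(maxDepth("a[b"));                       // -1
    }
}
```

A result of 1 or 2 means the input fits the assumptions; anything else falls outside what the solution below is designed for.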
Solution
    // Imports needed by the method below
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    private static String replaceBracket(String input) {
        // Search for singly and doubly bracketed text
        Pattern p = Pattern.compile("\\[((?:[^\\[\\]]++|\\[[^\\[\\]]*+\\])*+)\\]");
        Matcher matcher = p.matcher(input);
        StringBuffer output = new StringBuffer(input.length());
        while (matcher.find()) {
            // Take the text inside the outermost bracket
            String innerText = matcher.group(1);
            int startIndex = innerText.indexOf("[");
            int endIndex;
            String replacement;
            if (startIndex != -1) {
                // 2 levels of nesting
                endIndex = innerText.lastIndexOf("]");
                // Remove all [] except for first [ and last ]
                replacement =
                    // Text before and including first [
                    innerText.substring(0, startIndex + 1) +
                    // Text in between, stripped of all the brackets []
                    innerText.substring(startIndex + 1, endIndex).replaceAll("[\\[\\]]", "") +
                    // Text after and including last ]
                    innerText.substring(endIndex);
            } else {
                // No nesting
                replacement = innerText;
            }
            // Quote the replacement so that any $ or \ in the text is kept
            // literally instead of being read as a group reference or escape
            matcher.appendReplacement(output, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(output);
        return output.toString();
    }
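One detail of Matcher.appendReplacement is worth knowing when the bracketed text is arbitrary: in the replacement string, a $ is read as a group reference and a \ as an escape character, so text containing either should be passed through Matcher.quoteReplacement. The sketch below shows this on a deliberately simplified, single-level version of the problem; the class QuoteReplacementDemo and the method stripBrackets are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {
    // Strips one level of brackets. The captured text goes through
    // Matcher.quoteReplacement so that '$' and '\' are kept literally.
    static String stripBrackets(String input) {
        Matcher m = Pattern.compile("\\[([^\\[\\]]*)\\]").matcher(input);
        StringBuffer out = new StringBuffer(input.length());
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below is the text: cost [$5] path [C:\tmp]
        System.out.println(stripBrackets("cost [$5] path [C:\\tmp]"));
    }
}
```

Without the quoting step, the $5 in this input would be treated as a reference to capturing group 5, which does not exist in the pattern.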
Explanation
The only part worth explaining here is the regex. For the rest, see the documentation of the Matcher class.
"\\[((?:[^\\[\\]]++|\\[[^\\[\\]]*+\\])*+)\\]"
In raw form (what you get when you print out the string):
\[((?:[^\[\]]++|\[[^\[\]]*+\])*+)\]
Let us break it up (spaces are insignificant):
    \[                    # Outermost opening bracket
    (                     # Capturing group 1
      (?:
        [^\[\]]++         # Text that doesn't contain []
        |                 # OR
        \[[^\[\]]*+\]     # A nested bracket containing text without []
      )*+
    )                     # End of capturing group 1
    \]                    # Outermost closing bracket
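Java can compile this broken-up layout directly if you pass the Pattern.COMMENTS flag, which makes the engine ignore whitespace and treat # as the start of a comment running to the end of the line. The following is a minimal sketch of that idea; the class CommentedRegexDemo and the method firstGroup are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentedRegexDemo {
    // The same regex in free-spacing form. Each line ends with \n so that
    // a '#' comment stops at the end of its own line.
    static final Pattern BRACKETS = Pattern.compile(
        "\\[                     # Outermost opening bracket\n" +
        "(                       # Capturing group 1\n" +
        "  (?:\n" +
        "    [^\\[\\]]++         # Text that doesn't contain []\n" +
        "    |                   # OR\n" +
        "    \\[[^\\[\\]]*+\\]   # A nested bracket containing text without []\n" +
        "  )*+\n" +
        ")                       # End of capturing group 1\n" +
        "\\]                     # Outermost closing bracket\n",
        Pattern.COMMENTS);

    // Returns the content of capturing group 1 for the first match,
    // or null if the input contains no bracketed text.
    static String firstGroup(String input) {
        Matcher m = BRACKETS.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("outside[text more [text]] still outside"));
    }
}
```

The trade-off is that literal spaces and # characters in such a pattern must be escaped, which is not an issue for this particular regex.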
I used the possessive quantifiers *+ and ++ to prevent backtracking by the regex engine. The version with normal greedy quantifiers, \[((?:[^\[\]]+|\[[^\[\]]*\])*)\], would still work, but it is slightly less efficient and can cause a StackOverflowError on large enough input.
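To see that the two variants agree on what they match, you can run both over the example input from the Assumptions section. This is only a sketch for that one input, not a proof of equivalence or a benchmark; the class QuantifierComparison and the method allMatches are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierComparison {
    // Possessive version used in the solution
    static final Pattern POSSESSIVE =
        Pattern.compile("\\[((?:[^\\[\\]]++|\\[[^\\[\\]]*+\\])*+)\\]");
    // Same regex with ordinary greedy quantifiers
    static final Pattern GREEDY =
        Pattern.compile("\\[((?:[^\\[\\]]+|\\[[^\\[\\]]*\\])*)\\]");

    // Collects the full text of every match of the pattern in the input.
    static List<String> allMatches(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "only[single] [level] outside[text more [text] some "
            + "[text]moreeven[more]text[bracketed]] still outside";
        List<String> possessive = allMatches(POSSESSIVE, input);
        List<String> greedy = allMatches(GREEDY, input);
        System.out.println(possessive);
        System.out.println("Same matches: " + possessive.equals(greedy));
    }
}
```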