3

I have a string, from which I want to keep text inside a pair of brackets and remove everything outside of the brackets:

Hello [123] {45} world (67)
Hello There (8) [9] {0}

Desired output:
[123] {45} (67) (8) [9] {0}

Code tried but fails:

$re = '/[^()]*+(\((?:[^()]++|(?1))*\))[^()]*+/';
$text = preg_replace($re, '$1', $text);
www.friend0.in

3 Answers

5

If every opening bracket in the string is always paired with a closing bracket and there is no nesting, you can match all the bracket pairs you want to keep and skip them, and otherwise match every character that is not a bracket so that it can be removed.

(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}]+

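Explanation

(?:                # Non-capturing group listing the pairs to keep:
  \[[^][]*]        #   [...] with no square brackets inside, or
  |\([^()]*\)      #   (...) with no parentheses inside, or
  |{[^{}]*}        #   {...} with no curly braces inside
)
(*SKIP)(*F)        # Discard the pair just matched and resume scanning right after it
|                  # Or
[^][(){}]+         # Match one or more non-bracket characters (these get removed)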

Example code

$re = '/(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}]+/m';
$str = 'Hello [123] {45} world (67)
Hello There (8) [9] {0}';

$result = preg_replace($re, '', $str);

echo $result;

Output

[123]{45}(67)(8)[9]{0}

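If you also want to remove every other character outside the pairs, including stray unpaired brackets, match any single character in the last alternative instead (note that . does not match a newline unless the s flag is set):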

(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|.
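To make the difference between the two variants concrete, here is a minimal sketch. The input string (with one stray `)` and one stray `[`) and the variable names are made up for illustration, and the `s` flag on the second pattern is my addition so that `.` would also cover newlines.

```php
<?php
// Hypothetical input: a stray ")" and a stray "[" sit outside any complete pair.
$str = 'a) Hello [123] x [ {45}';

// Shared prefix: match a complete, non-nested pair and skip over it.
$pairs = '(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)';

// Variant 1: remove runs of non-bracket characters only.
$keepStray = preg_replace('/' . $pairs . '|[^][(){}]+/', '', $str);

// Variant 2: remove any single character that is not part of a pair.
$dropStray = preg_replace('/' . $pairs . '|./s', '', $str);

echo $keepStray, PHP_EOL; // )[123][{45}
echo $dropStray, PHP_EOL; // [123]{45}
```

The first variant never touches bracket characters, so unpaired ones survive; the second removes them along with everything else outside a pair.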


The fourth bird
  • I will mark this as the answer; I just need one more bit of help. What if $str also has some other characters outside the brackets, like @ and #, which I want to keep? – www.friend0.in Nov 08 '20 at 18:21
  • @www.friend0.in You can add those characters to the character class at the end so that they are not matched (and therefore not removed): `(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}@#]+` See https://regex101.com/r/kGLXjU/1 – The fourth bird Nov 08 '20 at 18:34
  • Hi. Just a follow up question: I need to do the same thing but with the difference that I need the parentheses themselves to go away as well, so in your example I'd need it to return just "1234567890". What do I need to change? – Daniel Malmgren Aug 24 '21 at 09:59
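The character-class tweak suggested in the comments above (adding `@` and `#` to the negated class) can be tried out with a short sketch like this one. The sample string is invented for illustration; only the pattern comes from the comment.

```php
<?php
// Pattern from the comment: "@" and "#" are added to the negated class,
// so they are never matched and therefore never removed.
$re = '/(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}@#]+/';

// Hypothetical input with "@" and "#" outside the brackets.
$str = 'Hello @[123] #{45} world (67)';

$result = preg_replace($re, '', $str);

echo $result, PHP_EOL; // @[123]#{45}(67)
```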
2

Looks like you wanted to target nested brackets as well. There are already questions about how to match balanced parentheses. Adjust one of those patterns to fit your needs, e.g. something like

$pattern = '/\((?:[^)(]*(?R)?)*+\)|\{(?:[^}{]*+(?R)?)*\}|\[(?:[^][]*+(?R)?)*\]/';

You can try this on Regex101. Extract those with preg_match_all and implode the matches.

if(preg_match_all($pattern, $str, $out) > 0)
  echo implode(' ', $out[0]);
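For a self-contained run of the snippet above, here is a small sketch with a nested input. The sample string and the `$result` variable are assumptions for illustration; the pattern is the one given in this answer.

```php
<?php
// Recursive pattern from this answer: (?R) re-enters the whole pattern,
// so a bracket pair may contain further bracket pairs.
$pattern = '/\((?:[^)(]*(?R)?)*+\)|\{(?:[^}{]*+(?R)?)*\}|\[(?:[^][]*+(?R)?)*\]/';

// Hypothetical input with one level of nesting in each pair.
$str = 'Hello [12(3)] {4{5}} world (6(7)) end';

$result = '';
if (preg_match_all($pattern, $str, $out) > 0) {
    // $out[0] holds the full matches, i.e. the outermost bracket groups.
    $result = implode(' ', $out[0]);
}

echo $result, PHP_EOL; // [12(3)] {4{5}} (6(7))
```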

If you need to match the stuff outside the brackets, you can combine this pattern with (*SKIP)(*F) as well, which @Thefourthbird also used in his elaborate answer. For skipping the bracketed parts, see this other demo.
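One possible way to combine the two ideas is sketched below. This is my own adaptation, not necessarily the pattern from the linked demo: the bracket alternatives are wrapped in capture group 1 and recursed with `(?1)` instead of `(?R)`, so that the recursion does not pull in the `(*SKIP)(*F)` part or the final alternative. The sample input is hypothetical.

```php
<?php
// Group 1 matches one balanced (...), {...} or [...] and recurses into itself
// via (?1). Outside the group, (*SKIP)(*F) discards that match and resumes
// after it; the last alternative matches runs of non-bracket characters.
$re = '/(\((?:[^)(]*+(?1)?)*+\)|\{(?:[^}{]*+(?1)?)*+\}|\[(?:[^][]*+(?1)?)*+\])(*SKIP)(*F)|[^][(){}]+/';

// Hypothetical input with nested brackets.
$str = 'Hello [12(3)] {4{5}} world (6(7)) end';

// Remove everything that is outside the (possibly nested) bracket groups.
$result = preg_replace($re, '', $str);

echo $result, PHP_EOL; // [12(3)]{4{5}}(6(7))
```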

bobble bubble
  • Haven't seen you for a while, but I remember you created some cool patterns in the past. This one is also very nice :-) +1 – The fourth bird Nov 08 '20 at 17:35
  • Good to see you again too @Thefourthbird : ) Thank you! – bobble bubble Nov 08 '20 at 17:37
  • Great code, it also kept the space and thus improved readability. I just need one more bit of help: what if $str also has some other characters outside the brackets, like @ and #, which I want to keep? – www.friend0.in Nov 08 '20 at 18:21
  • @www.friend0.in Well, I just used a space for imploding; actually nothing is left outside, since you wrote *remove everything outside of the brackets* : ). If you want to keep the original spaces or other characters, just add another alternative to the pattern, e.g. [`|[@# ]`](https://regex101.com/r/MhrA3Z/1/), and implode on an empty string. – bobble bubble Nov 08 '20 at 19:35
  • @www.friend0.in FYI, you can also use this pattern with `(*SKIP)(*F)` to match the stuff outside the brackets (similar to The fourth bird's answer), [see this demo](https://regex101.com/r/YdHGh0/4), with the difference that it works on nested brackets. – bobble bubble Nov 09 '20 at 09:55
  • @bobblebubble Thnx a ton buddy for your help & the code – www.friend0.in Nov 10 '20 at 04:18
1

If the brackets are not nested, replacing every match of the following pattern with an empty string should suffice:

[^[{(\]})]+(?=[[{(]|$)


Breakdown:

[^[{(\]})]+     # Match one or more characters except for opening/closing bracket chars.
(?=[[{(]|$)     # A positive Lookahead to ensure that the match is either followed by
                # an opening bracket char or is at the end of the string.
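As a quick check of the lookahead approach, here is a minimal sketch that feeds the pattern to preg_replace. The trailing " bye" in the sample input and the nested test string are my additions, included to exercise the `$` branch and to show where the no-nesting assumption matters.

```php
<?php
// Pattern from this answer: a run of non-bracket characters that is followed
// by an opening bracket or by the end of the string.
$re = '/[^[{(\]})]+(?=[[{(]|$)/';

// Sample input from the question, plus a hypothetical trailing " bye".
$str = 'Hello [123] {45} world (67)' . "\n" . 'Hello There (8) [9] {0} bye';

$result = preg_replace($re, '', $str);
echo $result, PHP_EOL; // [123]{45}(67)(8)[9]{0}

// With nesting, text directly before an inner opening bracket is removed too.
$nested = preg_replace($re, '', '(a(b))');
echo $nested, PHP_EOL; // ((b))
```

Text inside a pair is left alone only because it is followed by a closing bracket, which is why the pattern is limited to non-nested input.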