I am trying to clean a string in PHP with the code below, but I am not sure how to remove the text inside square brackets and the text inside parentheses at the same time, along with any whitespace before them.

The code I am using is:

$string = "Deadpool 2 [Region 4](Blu-ray)";
echo preg_replace("/\[[^)]+\]/","",$string); 

The output I'm getting is:

Deadpool 2 (Blu-ray)

However, the desired output is:

Deadpool 2

The solutions from this question and this one do not make it clear how to remove both kinds of matches while also removing the optional whitespace before them.

double-beep
snowflakes74
    @snowflakes With only one sample string we can only offer you a general solution. If you give 10 different strings, some of which should reveal any fringe cases, we can develop our own logic for your task -- instead of blindly accepting your logic. I see that you've received an acceptable answer, but with an improved question, we may be able to offer you a superior answer. – mickmackusa May 10 '19 at 11:00
    This question is being discussed [on meta](https://meta.stackoverflow.com/questions/384914/answerer-adds-to-question-requirements-in-order-to-re-open-it). – yivi May 12 '19 at 10:59

2 Answers


There are four main points here:

  • A substring between parentheses can be matched with \([^()]*\)
  • A substring between square brackets can be matched with \[[^][]*] (or \[[^\]\[]*\] if you prefer to escape the literal [ and ]; in PCRE this is purely stylistic, but some other regex flavors require it)
  • You need alternation to match either pattern, plus \s* to account for any whitespace before it
  • Since removing these substrings may leave leading or trailing spaces, you need to trim the result.

You may use

$string = "Deadpool 2 [Region 4](Blu-ray)";
echo trim(preg_replace("/\s*(?:\[[^][]*]|\([^()]*\))/","", $string)); 

See the regex demo and a PHP demo.

The \[[^][]*] part matches strings between [ and ] having no other [ and ] inside and \([^()]*\) matches strings between ( and ) having no other parentheses inside. trim removes leading/trailing whitespace.
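To see how the alternation and trim work together, here is a minimal sketch that wraps the same pattern in a helper and runs it over a few inputs. The function name stripBracketed and the second and third sample strings are hypothetical, added only for illustration.

```php
<?php
// Hypothetical helper (not part of the answer): wraps the pattern shown above.
function stripBracketed(string $s): string
{
    // Remove optional whitespace followed by either [...] or (...), then trim.
    return trim(preg_replace('/\s*(?:\[[^][]*]|\([^()]*\))/', '', $s));
}

// The first string comes from the question; the other two are made-up samples.
$samples = [
    'Deadpool 2 [Region 4](Blu-ray)',
    '(Blu-ray) Deadpool 2',
    'Deadpool 2',
];

foreach ($samples as $sample) {
    var_export(stripBracketed($sample)); // prints 'Deadpool 2' for each sample
    echo "\n";
}
```

The second sample shows why trim is needed: removing a leading "(Blu-ray)" leaves a space at the start of the string.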

Regex explanation:

  • \s* - zero or more whitespace characters
  • (?: - start of a non-capturing group:
    • \[[^][]*] - a [, then zero or more characters other than [ and ], then a ] (in a PCRE pattern these brackets may stay unescaped inside a character class as long as ] comes right after the opening [ or [^; in JS you would have to escape the ], as in [^\][]*)
    • | - or (the alternation operator)
    • \([^()]*\) - a (, then zero or more characters other than ( and ), then a )
  • ) - end of the non-capturing group.
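The breakdown above can also be written into the pattern itself. The following sketch uses the x (extended) modifier, which lets PCRE ignore whitespace outside character classes and treat # as the start of a comment; the ~ delimiter and the nested sample string are choices made here for illustration, not part of the answer.

```php
<?php
// Same pattern as in the answer, spread over several lines with the x modifier.
$pattern = '~
    \s*              # zero or more whitespace characters
    (?:              # start of a non-capturing group
        \[[^][]*]    # "[", then any characters except "[" and "]", then "]"
      |              # or
        \([^()]*\)   # "(", then any characters except "(" and ")", then ")"
    )                # end of the non-capturing group
~x';

// String from the question.
var_export(trim(preg_replace($pattern, '', 'Deadpool 2 [Region 4](Blu-ray)'))); // 'Deadpool 2'
echo "\n";

// Made-up nested sample: only the innermost pair is removed, because the
// character class forbids further brackets inside a match.
var_export(trim(preg_replace($pattern, '', 'a [b [c] d] e'))); // 'a [b d] e'
```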
Wiktor Stribiżew

Based on just the one sample input, there are some simpler approaches.

$string = "Deadpool 2 [Region 4](Blu-ray)";
var_export(preg_replace("~ [[(].*~", "", $string));

echo "\n";

var_export(strstr($string, ' [', true));

Output:

'Deadpool 2'
'Deadpool 2'

These assume that the unwanted substring starts with a space followed by an opening square bracket (the regex also accepts a space followed by an opening parenthesis).
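As a quick illustration of what the regex variant does with other input, the sketch below runs it on a made-up string (not from the question) that has text after the closing parenthesis. Because of the trailing .*, everything from the first space-plus-bracket to the end of the line is dropped.

```php
<?php
// Hypothetical sample with wanted-looking text after the parenthesised part.
$sample = 'Deadpool 2 (Blu-ray) Extended Cut';

// " [" or " (" followed by the rest of the line is removed in one go.
$result = preg_replace('~ [[(].*~', '', $sample);

var_export($result); // 'Deadpool 2'
```

Whether dropping the trailing text is acceptable depends on the real data, which is why more sample strings would help.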

The strstr() technique requires that the space-bracket sequence exists in the string; if it does not, strstr() returns false instead of the original string.
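A short sketch of that requirement, using the question's string and a made-up string that lacks the " [" marker:

```php
<?php
$withMarker = 'Deadpool 2 [Region 4](Blu-ray)'; // from the question
$withoutMarker = 'Deadpool 2';                  // made-up sample, no " [" in it

// With the third argument set to true, strstr() returns the part before the needle.
var_export(strstr($withMarker, ' [', true));    // 'Deadpool 2'
echo "\n";

// When the needle is missing, the result is false, not the input string.
var_export(strstr($withoutMarker, ' [', true)); // false
```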

If the unwanted substring marker is not consistently included, then you can use:

var_export(explode(' [', $string, 2)[0]);

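This puts the wanted substring in explode's output array at [0] and, when the marker is present, the remainder of the unwanted substring (without the leading space and bracket) at [1]. When the marker is absent, [0] holds the whole string.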
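The following sketch shows the full array explode() produces in both cases; the second sample string is made up for illustration.

```php
<?php
$samples = [
    'Deadpool 2 [Region 4](Blu-ray)', // from the question
    'Deadpool 2',                     // made-up sample without the " [" marker
];

foreach ($samples as $sample) {
    // A limit of 2 splits on the first " [" only.
    $parts = explode(' [', $sample, 2);
    var_export($parts);
    echo "\n";
}
```

For the first sample the array holds 'Deadpool 2' and 'Region 4](Blu-ray)'; for the second it holds only 'Deadpool 2', so index [0] is safe to read either way.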

mickmackusa