I'm trying to explode a string by vertical bars. That's the easy part. However, I DON'T want the split to affect substrings that are surrounded by parentheses. That means I need a string such as:

Hello (sir|maam).|Hi there!

to explode into:

Array
(
    [0] => Hello (sir|maam).
    [1] => Hi there!
)

By using the normal explode function, I don't believe there is a way to tell it to ignore that bar surrounded by the parentheses. However, I have some ideas.

I know that it would be possible to do this by exploding the string normally, then looping through the array and merging everything from a piece that contains ( through the piece that contains the closing ). However, I have a feeling that there should be a more elegant way of achieving this.
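For reference, the explode-and-merge idea described above could look roughly like the following. This is only a minimal sketch: the function name `explode_then_merge` is made up for illustration, and it assumes the parentheses are balanced and not relied on for anything beyond counting.

```php
<?php
// Hypothetical helper: explode on "|", then re-join pieces that were cut inside (...).
function explode_then_merge(string $input): array
{
    $result = [];
    $buffer = null; // holds a piece whose "(" has not been closed yet

    foreach (explode('|', $input) as $piece) {
        // Continue an open group, or start a fresh piece.
        $buffer = ($buffer === null) ? $piece : $buffer . '|' . $piece;

        // Treat the buffer as complete once "(" and ")" counts agree.
        if (substr_count($buffer, '(') === substr_count($buffer, ')')) {
            $result[] = $buffer;
            $buffer = null;
        }
    }

    // Unbalanced input: keep whatever is left rather than dropping it.
    if ($buffer !== null) {
        $result[] = $buffer;
    }

    return $result;
}

print_r(explode_then_merge('Hello (sir|maam).|Hi there!'));
```

It works for the sample string, but it is exactly the loop-and-merge bookkeeping I would like to avoid.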

Am I right? Is there a less code-intensive means of splitting a string into an array given these restrictions?

Nathanael

2 Answers

If you can guarantee that the parentheses will always be balanced and never nested (that is, there will never be an 'Oops(!' or a '(nested stuff (like this)|oops)'), and that you never need a || outside of parentheses to produce an empty string, then this ought to help:

preg_match_all('/(?:[^(|]|\([^)]*\))+/', $your_string, $matches);
$parts = $matches[0];

It matches one or more repetitions of either a single character that is neither | nor (, or a ( and ) enclosing anything that's not a ) (which includes |). Short version: a | between parentheses becomes part of the match, rather than a separator.
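As a quick check, here is a small sketch that runs the pattern on the string from the question. The second input, 'a||b', is made up to illustrate the empty-string caveat mentioned above.

```php
<?php
// Sample input taken from the question.
$your_string = 'Hello (sir|maam).|Hi there!';

// Each match is a run of plain characters and whole (...) groups.
preg_match_all('/(?:[^(|]|\([^)]*\))+/', $your_string, $matches);
$parts = $matches[0];
print_r($parts); // two elements: "Hello (sir|maam)." and "Hi there!"

// Illustrative input: the empty field between the two bars is skipped,
// because every match needs at least one character.
preg_match_all('/(?:[^(|]|\([^)]*\))+/', 'a||b', $m);
$no_empty = $m[0];
print_r($no_empty); // two elements: "a" and "b"
```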

Another possibility, that is slightly less cryptic:

$parts = preg_split('/\|(?![^(]*\))/', $your_string);

This uses a negative lookahead assertion to disqualify any | that's followed by a ) with no ( in between. It's still a bit unforgiving about unbalanced or nested parens, but it will produce empty strings between two consecutive |s.
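A short sketch of the same idea in use. The inputs other than the question's string are made up to show the empty-string behaviour and what happens with a stray closing parenthesis.

```php
<?php
// Split on "|" unless a ")" follows with no "(" before it.
$pattern = '/\|(?![^(]*\))/';

$parts = preg_split($pattern, 'Hello (sir|maam).|Hi there!');
print_r($parts); // two elements: "Hello (sir|maam)." and "Hi there!"

// Consecutive bars yield an empty string between them.
$with_empty = preg_split($pattern, 'a||b');
print_r($with_empty); // three elements: "a", "" and "b"

// A stray ")" makes the preceding bar look "inside" parentheses,
// so the string is not split at all.
$stray = preg_split($pattern, 'a|b)');
print_r($stray); // one element: "a|b)"
```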

cHao

Until someone writes a regex-based solution, which I doubt is possible in a single pass, this should work. It is a straightforward translation of the requirements into code.

<?php
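function my_explode($str)
{
    $ret = array();
    $depth = 0; // current parenthesis nesting level
    $pos = 0;   // start offset of the piece being collected
    $len = strlen($str);

    for ($i = 0; $i < $len; $i++) {
        $c = $str[$i];

        if ($c == '|' && $depth == 0) {
            // Top-level separator: emit the piece that ends here.
            $ret[] = substr($str, $pos, $i - $pos);
            $pos = $i + 1;
        } elseif ($c == '(') {
            $depth++;
        } elseif ($c == ')' && $depth > 0) {
            // Ignore a stray ")" so the depth never goes negative.
            $depth--;
        }
    }

    // Always emit the final piece, so a string without "|" comes back
    // as a one-element array, as explode() would return it.
    $ret[] = substr($str, $pos);

    return $ret;
}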

$str = "My|Hello (sir|maam).|Hi there!";
var_dump(my_explode($str));

Goran Rakic