
I want to split a string on commas, but not on commas inside parentheses, using preg_split. I came up with

preg_split('#,(?![^\(]*[\)])#',$str);

which works perfectly unless there is a comma before a nested parenthesis.

Works for

$str = "first (1,2),second (child (nested), child2), third";

Array
(
    [0] => first (1,2)
    [1] => second (child (nested), child2)
    [2] =>  third
)

but not for

$str = "first (1,2),second (child, (nested), child2), third";

Array
(
    [0] => first (1,2)
    [1] => second (child
    [2] =>  (nested), child2)
    [3] =>  third
)
Raymond Chen
Googlebot

2 Answers


Since commas inside brackets must be ignored, this problem boils down to tracking whether the parentheses are balanced. A , that appears while a parenthesis is still open is skipped; a , seen when every parenthesis has been closed is a delimiter for the split.

To collect the strings between these delimiters, we keep a start pointer $sub_start that holds the start index of the current substring, and move it just past each valid delimiter , we come across.

Snippet:

<?php

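function splitCommaBased($str){
    $open_brac = 0; // current parenthesis nesting depth
    $len = strlen($str);
    $res = [];
    $sub_start = 0; // start index of the piece being collected
    
    for($i = 0; $i < $len; ++$i){
        if($str[ $i ] === ',' && $open_brac === 0){
            // top-level comma: close the current piece
            $res[] = substr($str, $sub_start, $i - $sub_start);
            $sub_start = $i + 1;
        }else if($str[ $i ] === '('){
            $open_brac++;
        }else if($str[ $i ] === ')'){
            $open_brac--;
        }
    }
    
    // Append the final piece after the loop, so it is kept even when
    // the string ends with ')' or ','.
    $res[] = (string) substr($str, $sub_start);
    
    return $res;
}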

print_r(splitCommaBased('first (1,2),second (child, (nested), child2), third'));
nice_dev

You can use a recursive pattern to match balanced parentheses, then use (*SKIP)(*F) to discard those matches, so that the only commas left to match and split on are the ones outside parentheses.

(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,

See a regex demo.

Example

$str = "first (1,2),second (child, (nested), child2), third";
$pattern = "/(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,/";
print_r(preg_split($pattern, $str));

Output

Array
(
    [0] => first (1,2)
    [1] => second (child, (nested), child2)
    [2] =>  third
)
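
To make the pattern easier to read, here is a sketch of the same regex spread over several lines with the x (extended) flag and a comment per part. The function name splitOutsideParens is just a hypothetical wrapper for illustration; the pattern itself is unchanged.

```php
<?php

// Hypothetical helper: same pattern as above, written with the x flag.
function splitOutsideParens(string $str): array
{
    $pattern = '/
        (                # group 1: one balanced parenthesised block
            \(           #   opening parenthesis
            (?:
                [^()]++  #   a run of non-parenthesis characters, possessive
              |
                (?1)     #   or a nested block, by recursing into group 1
            )*
            \)           #   closing parenthesis
        )
        (*SKIP)(*F)      # drop the block and resume scanning after it
      |
        ,                # any comma reached here is outside parentheses
    /x';

    return preg_split($pattern, $str);
}

print_r(splitOutsideParens('first (1,2),second (child, (nested), child2), third'));
```

One thing to keep in mind: as far as I can tell, if a ( is never closed, group 1 cannot match at that position, so commas after it are treated as top-level and the string is split there.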
The fourth bird