I want to split like this:
Before:
TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D
After:
TEST_A
TEST_B
TEST_C (with A, B, C)
TEST_D
How can I split it?
Regex isn’t going to help this time, so you will have to iterate through the characters.
The fact is, regular expressions aren’t very context-aware, which is the same reason you can’t reliably parse HTML with a regular expression. Here a comma only counts as a separator when it sits outside the brackets, so we’re better off iterating through the string ourselves.
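To make the context problem concrete, here is a minimal sketch (not part of the original answer) of what a plain explode() on comma-space returns for the sample string. The variable names are only illustrative.

```php
<?php
// Naive split: every ", " is treated as a delimiter, including those inside the brackets.
$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
$naive = explode(', ', $input);

print_r($naive);
// The bracketed item is torn into three pieces:
// 'TEST_C (with A', 'B' and 'C)', giving six elements instead of four.
```

Tracking the bracket depth while walking the string is one way to avoid this.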
function magic_split($str) {
    $sets = array('');    // Sets of strings
    $set_index = 0;       // Remember what index we’re writing to
    $brackets_depth = 0;  // Keep track of whether we’re in brackets (or not)
    $length = strlen($str);

    // Iterate through the entire string
    for ($i = 0; $i < $length; $i++) {
        // Skip commas if we’re not in brackets
        if ($brackets_depth < 1 && $str[$i] === ',') continue;

        // Add character to the current set
        $sets[$set_index] .= $str[$i];

        // Store brackets depth
        if ($str[$i] === '(') $brackets_depth++;
        if ($str[$i] === ')') $brackets_depth--;

        if (
            $i < $length - 1 &&      // Is a next char available?
            $str[$i + 1] === ',' &&  // Is it a comma?
            $brackets_depth === 0    // Are we not in brackets?
        ) $sets[++$set_index] = '';  // Add new set
    }

    // Strip the space that follows each separating comma
    return array_map('trim', $sets);
}

$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
$split = magic_split($input);
You want to match:
PHP Code:
$ar = preg_split("#([^(,]+(?:\([^(]+\))?),[\s]*#", "$input,", -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
Edit: this does not work if the last item is not followed by a comma outside the parentheses, so you have to append an extra comma to $input, as in the modified code above.
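As an alternative that avoids appending the extra comma, the same idea can be expressed by matching the items instead of splitting on them. This is only a sketch, assuming each item has at most one parenthesised part and that the parentheses are never nested.

```php
<?php
$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';

// An item is a run of characters other than "(" and ",",
// optionally followed by one parenthesised group.
preg_match_all('#[^(,]+(?:\([^()]*\))?#', $input, $matches);

// Every match after the first starts with the space that followed the comma.
$ar = array_map('trim', $matches[0]);
var_export($ar);
```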
The correct solution to this problem will depend on exactly what your specification is for identifying individual elements.
If you expect each one to begin with TEST_, then you could solve it fairly simply with a regular expression:
$input = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
$matches = preg_split('/,\s*(?=TEST_)/', $input);
var_dump($matches);
Output:
array(4) {
  [0]=>
  string(6) "TEST_A"
  [1]=>
  string(6) "TEST_B"
  [2]=>
  string(21) "TEST_C (with A, B, C)"
  [3]=>
  string(6) "TEST_D"
}
This splits the string on commas followed by optional whitespace, using a lookahead assertion to test for the presence of TEST_ at the beginning of the next item.
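If the prefix is not always TEST_, the same lookahead can be built from a variable. The following is only a sketch: split_on_prefix() is a hypothetical helper name, not a PHP function, and it assumes the prefix never appears right after a comma inside the parentheses.

```php
<?php
// Hypothetical helper: split on commas that are followed by a known prefix.
function split_on_prefix(string $input, string $prefix): array
{
    // preg_quote() escapes regex metacharacters in the prefix, including the "/" delimiter.
    $pattern = '/,\s*(?=' . preg_quote($prefix, '/') . ')/';
    return preg_split($pattern, $input);
}

$items = split_on_prefix('TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D', 'TEST_');
var_dump($items);
```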
You merely need to split on comma-space and disregard any comma-spaces that are inside of parentheses. (*SKIP)(*FAIL) will consume all parenthetical expressions and discard them so that the commas inside them are never used as delimiters.
Code:
$text = 'TEST_A, TEST_B, TEST_C (with A, B, C), TEST_D';
var_export(preg_split('~\([^)]*\)(*SKIP)(*FAIL)|, ~', $text));
Output:
array (
  0 => 'TEST_A',
  1 => 'TEST_B',
  2 => 'TEST_C (with A, B, C)',
  3 => 'TEST_D',
)
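Note that [^)]* stops at the first closing parenthesis, so the pattern above assumes the parentheses are never nested. If nesting can occur, one possible variation (a sketch, not from the original answer) is to match balanced parentheses with a recursive subpattern before skipping them. The sample string with nested parentheses is made up for illustration.

```php
<?php
// Group 1 matches a balanced pair of parentheses; (?1) recurses into group 1 for nested pairs.
// Whatever group 1 matches is skipped, so only commas outside parentheses act as delimiters.
$pattern = '~(\((?:[^()]++|(?1))*\))(*SKIP)(*FAIL)|,\s*~';

// Made-up input with one level of nesting.
$text = 'TEST_A, TEST_B, TEST_C (with A, (B, C)), TEST_D';
$parts = preg_split($pattern, $text);
var_export($parts);
```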