Good morning, I need a little help. I need to split a string using a regex, but without splitting on the commas inside parentheses.

preg_match_all('/\((?:[^()]|(?R))+\)|\'[^\']*\'|[^(),]+/', $input_lines, $output_array);

I have this string: Test A, Test B, Test C (data1, data1)

And preg_match_all gives me this:

0   =>  Test A
1   =>   Test B
2   =>   Test C 
3   =>  (data1, data1)

How do I achieve this result?

0   =>  Test A
1   =>  Test B
2   =>  Test C (data1, data1)

I need to ignore the content in parentheses and separate only the rest.

Thank you in advance for any help.

EDIT

This eventually resolved my situation. I used preg_split instead:

preg_split('/,(?![^(]*\)) /', $input_line);
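
A minimal sketch of what this lookahead-based split does, and where it stops working. The variable names and the nested input string are only illustrative assumptions. The lookahead refuses to split on a comma when a `)` can be reached without crossing a `(`, so it handles one level of parentheses but not nested ones:

```php
<?php
// The lookahead-based split from the edit above.
$pattern = '/,(?![^(]*\)) /';

// One level of parentheses: the comma inside (...) is left alone.
$flat = preg_split($pattern, 'Test A, Test B, Test C (data1, data1)');
print_r($flat);
// [0] => Test A, [1] => Test B, [2] => Test C (data1, data1)

// Nested parentheses: after the first inner comma a "(" appears before
// any ")", so the lookahead does not protect that comma and it is split.
$nested = preg_split($pattern, 'Test E (data1, data1(data, data))');
print_r($nested);
// [0] => Test E (data1, [1] => data1(data, data))
```

If the input can contain nested parentheses, a pattern that matches balanced parentheses is needed instead.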

  • For the given string, and without any more specific requirements or restrictions, I would go with just `explode(',', 'Test A, Test B, Test C (data1, data1)', 3)` instead of regex … – misorude Feb 19 '20 at 08:43
  • ```Array ( [0] => Test A [1] => Test B [2] => Test C (data1, data1), Test D (data1, data1), Test E (data1, data1) )``` – Michael Fanta Feb 19 '20 at 08:54
  • So what you are trying to say by that now, is that you need to match more than what the example you gave actually showed? Well then give a better example, resp. a _proper_ explanation of what exactly you need to match to begin with. – misorude Feb 19 '20 at 08:56
  • I have more strings, and I need to split them on the separator (comma) while ignoring commas inside parentheses, because the parenthesized part belongs to the parameter. – Michael Fanta Feb 19 '20 at 09:11
  • I've now tried this: ```preg_split('/,(?![^(]*\)) /', $input_line);``` and so far it works for my needs. – Michael Fanta Feb 19 '20 at 09:13

1 Answer

One option is to match each balanced pair of parentheses with a recursive pattern, where (?1) recurses into the first capturing group, and then discard that match with (*SKIP)(*F). That way the split never happens inside parentheses, including nested ones.

Then split on a comma followed by zero or more horizontal whitespace characters:

(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,\h*
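
To make the pieces easier to follow, here is the same pattern written with the `x` (extended) flag so that each part can carry a comment. This is only an annotated sketch; the sample string and the `$parts` variable are assumptions for illustration:

```php
<?php
// Same pattern as above, spread over several lines with the x flag.
$re = '/
    (                 # group 1: one balanced parenthesised block
        \(            #   opening parenthesis
        (?:
            [^()]++   #   run of non-parenthesis characters, possessive
          |
            (?1)      #   or a nested block: recurse into group 1
        )*
        \)            #   matching closing parenthesis
    )
    (*SKIP)(*F)       # throw the block away and do not retry inside it
  |
    ,\h*              # the real delimiter: comma plus horizontal whitespace
/x';

$parts = preg_split($re, 'Test A, Test B (x, y(z, w)), Test C');
print_r($parts);
// [0] => Test A, [1] => Test B (x, y(z, w)), [2] => Test C
```

Because the parenthesised block is consumed and then forced to fail, the only alternative that can ever produce a match is the comma outside parentheses.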

$re = '/(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,\h*/';
$strings = [
    "Test A, Test B, Test C (data1, data1)",
    "Test A, Test B, Test C (data1, data1), Test D (data1, data1), Test E (data1, data1(data, data))",
    "Test A, Test B, Test C (data1, data1), Test D (data1, data1), Test E ((data1, data1))"
];

foreach($strings as $s) {
    print_r(preg_split($re, $s));
}

Output

Array
(
    [0] => Test A
    [1] => Test B
    [2] => Test C (data1, data1)
)
Array
(
    [0] => Test A
    [1] => Test B
    [2] => Test C (data1, data1)
    [3] => Test D (data1, data1)
    [4] => Test E (data1, data1(data, data))
)
Array
(
    [0] => Test A
    [1] => Test B
    [2] => Test C (data1, data1)
    [3] => Test D (data1, data1)
    [4] => Test E ((data1, data1))
)
The fourth bird