
I am looking for a regular expression to split a string on commas. That sounds simple, but there is an additional restriction: the parameters in the string may contain commas inside parentheses, and those commas must not split the string.

Example:

1, 2, 3, add(4, 5, 6), 7, 8
 ^  ^  ^      !  !   ^  ^

The string should only be split at the commas marked with ^, not at those marked with !.

I found a solution for it here: A regex to match a comma that isn't surrounded by quotes

Regex:

,(?=([^\(]*\([^\)]*\))*[^\)]*$)

But my string could be more complex:

1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11
 ^  ^  ^      !  !      !  !   !   ^   ^

For this string the result is wrong, and I have no clue how to fix it, or whether it is even possible with regular expressions.
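To make the failure reproducible, here is a minimal sketch that feeds both example strings to the lookahead regex above via String.split. The class name LookaheadSplitDemo and the helper split are made up for this illustration; only the pattern itself comes from the question.

```java
import java.util.Arrays;

class LookaheadSplitDemo {
    // The lookahead regex from the question, escaped for a Java string literal.
    static final String REGEX = ",(?=([^\\(]*\\([^\\)]*\\))*[^\\)]*$)";

    // Hypothetical helper: split the input on every comma the regex matches.
    static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String flat = "1, 2, 3, add(4, 5, 6), 7, 8";
        String nested = "1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11";
        // One level of parentheses: the commas inside add(...) are skipped.
        System.out.println(Arrays.toString(split(flat)));
        // Nested parentheses: [^\)]* stops at the first ')', so the lookahead
        // fails for the top-level commas that precede add(...).
        System.out.println(Arrays.toString(split(nested)));
    }
}
```

The flat string comes back as six pieces, while the nested one comes back as only three: everything up to and including the outer add(...) stays glued together, and only 10 and 11 are split off.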

Does anyone have an idea how to solve this problem?

Thanks for your help!

Benjamin Schüller

2 Answers


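OK, I think a regular expression is not very useful for this. A small block of Java might be easier: walk through the string once, keep a counter of open parentheses, and split only on commas seen while that counter is zero.

So this is my Java code for solving the problem: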

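import java.util.ArrayList;
import java.util.List;

public class SplitExample {

  public static void splitWithJava() {
    String EXAMPLE = "1, 2, 3, add(4, 5, add(7, 8), 6), 7, 8";
    List<String> list = new ArrayList<>();
    int start = 0;   // index where the current item begins
    int pCount = 0;  // current nesting depth of parentheses
    for (int i = 0; i < EXAMPLE.length(); i++) {
      char c = EXAMPLE.charAt(i);
      switch (c) {
      case ',': {
        // Only a comma outside all parentheses ends the current item.
        if (0 == pCount) {
          list.add(EXAMPLE.substring(start, i).trim());
          start = i + 1;
        }
        break;
      }
      case '(': {
        pCount++;
        break;
      }
      case ')': {
        pCount--;
        break;
      }
      }
    }
    // Whatever follows the last top-level comma is the final item.
    list.add(EXAMPLE.substring(start).trim());
    for (String str : list) {
      System.out.println(str);
    }
  }

  public static void main(String[] args) {
    splitWithJava();
  }
}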
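The same depth-counting idea can be packaged as a reusable method that takes the input as a parameter and returns the items instead of printing them. This is only a sketch: the names TopLevelSplitter and splitTopLevel are invented here, and it assumes the parentheses in the input are balanced.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {
    // Hypothetical helper: split on commas that sit at parenthesis depth 0.
    // Assumes balanced parentheses; unbalanced input is not detected.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        int start = 0;  // index where the current item begins
        int depth = 0;  // number of currently open parentheses
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(input.substring(start, i).trim());
                start = i + 1;
            }
        }
        // The text after the last top-level comma is the final item.
        parts.add(input.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        String nested = "1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11";
        for (String part : splitTopLevel(nested)) {
            System.out.println(part);
        }
    }
}
```

Because the counter tracks depth rather than matching a fixed pattern, it copes with any level of nesting, which is exactly what the lookahead regex from the question cannot do.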
Benjamin Schüller

You can also achieve this using this regex: ([^,(]+(?=,|$)|[\w]+\(.*\)(?=,|$))

regex online demo

Given the text 1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11, it matches each item separately, treating only the commas that are not enclosed in () as separators.

So, the output would be:

Match 1
Group 1.    0-1    `1`

Match 2
Group 1.    2-4    ` 2`

Match 3
Group 1.    5-7    ` 3`

Match 4
Group 1.    9-35    `add(4, 5, add(6, 7, 8), 9)`

Match 5
Group 1.    36-39    ` 10`

Match 6
Group 1.    40-43    ` 11`
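For anyone who wants to try the pattern from Java rather than in the online tester, here is a small sketch using java.util.regex. The class name RegexMatchDemo and the helper items are made up for the example, and the matches are trimmed for readability, which the raw output above does not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMatchDemo {
    // The regex from this answer, escaped for a Java string literal.
    static final Pattern ITEM =
        Pattern.compile("([^,(]+(?=,|$)|[\\w]+\\(.*\\)(?=,|$))");

    // Hypothetical helper: collect group 1 of every match, trimmed.
    static List<String> items(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ITEM.matcher(input);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Same input as above: six items, nested call kept whole.
        System.out.println(items("1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11"));
        // Two separate calls on one line: the greedy .* runs from the first
        // '(' to the last ')', so everything ends up in a single match.
        System.out.println(items("add(1, 2), 3, add(4, 5)"));
    }
}
```

One caveat worth knowing: since .* is greedy, a string that contains more than one top-level function call, such as add(1, 2), 3, add(4, 5), is returned as a single item. The pattern works for the example here because it has only one such call.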
Caio Oliveira
  • I asked for a regex solution, and your answer looks pretty good in my tests, so I am accepting it as the correct answer. However, my tests also show that the Java method I posted as an answer is a little faster than this. Therefore I will use the Java method to parse the string. – Benjamin Schüller Nov 18 '16 at 07:46
  • black magic.. I was trying to achieve this with a recursive pattern when I threw in the towel and found this. Bravo – Brad Kent Dec 20 '17 at 18:44