
I would like to split a string in Java on commas (,), but a comma that sits between parentheses should not be treated as a separator.

For example, the string

"life, would, (last , if ), all"

should yield:

-life
-would
-(last , if )
-all

When I use:

String text = "life, would, (last , if ), all";
String[] parts = text.split(",");

the whole text gets split, including the (last , if ) part. I can see that split takes a regex, but I can't work out a pattern that does the job.

Barett
Bwire
  • Is it possible for input to have nested brackets, `foo, (bar, (baz))` or unbalanced ones `foo (bar, baz) bam)`? – Pshemo Aug 13 '15 at 16:04
  • @Pshemo Sure, it can have nested brackets. In the application it is an arbitrary SQL function that can be used to create any required report, so I am sure it will come up. – Bwire Aug 13 '15 at 16:14
  • Then regex (at least Java's flavor, since it doesn't support recursion) is the wrong tool. You should write your own parser that tracks the nesting level and splits only when it finds a `,` at nesting level 0. – Pshemo Aug 13 '15 at 16:17
  • Here is an example of such a parser: http://stackoverflow.com/a/16108347/1393766 – Pshemo Aug 13 '15 at 16:19
  • @Pshemo thanks for the eagle eye. That would surely have created a bug in the near future. – Bwire Aug 13 '15 at 16:32
  • You are welcome. This is a common mistake, but it can be avoided by describing your data in more detail. If you mention how you are going to use this code, someone could even suggest a proper SQL parser. – Pshemo Aug 13 '15 at 16:35
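
The nesting-level approach described in the comments above can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked answer: the class name `TopLevelSplitter` and the method `splitTopLevel` are made up for this example, it trims each piece, it silently ignores a stray `)`, and it does not treat quoted SQL string literals specially.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits on commas that are outside all parentheses.
class TopLevelSplitter {

    static List<String> splitTopLevel(String text) {
        List<String> parts = new ArrayList<>();
        int depth = 0;  // current parenthesis nesting level
        int start = 0;  // index where the current piece begins

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                // Assumption: an unmatched ')' is ignored rather than reported.
                if (depth > 0) {
                    depth--;
                }
            } else if (c == ',' && depth == 0) {
                // A comma at nesting level 0 ends the current piece.
                parts.add(text.substring(start, i).trim());
                start = i + 1;
            }
        }
        // Whatever follows the last top-level comma is the final piece.
        parts.add(text.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        for (String part : splitTopLevel("life, would, (last , if ), all")) {
            System.out.println("-" + part);
        }
        for (String part : splitTopLevel("a, f(b, g(c, d), e), h")) {
            System.out.println("-" + part);
        }
    }
}
```

Because the loop only counts depth, it handles any level of nesting. For real SQL, commas and parentheses inside string literals would need extra handling, which is where a proper SQL parser becomes the better choice.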

1 Answer


You could use this pattern (not for nested parentheses):

,(?![^()]*\))


,               # ","
(?!             # Negative Look-Ahead
  [^()]         # Character not in [()] Character Class
  *             # (zero or more)(greedy)
  \)            # ")"
)               # End of Negative Look-Ahead
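
In Java source the backslash has to be doubled inside the string literal. A minimal sketch of how the pattern might be used with `String.split` follows; the class name `RegexSplitDemo` and the helper `splitOutsideParens` are made up for this example, and the trimming of each piece is an added convenience, not part of the pattern.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class for the look-ahead pattern above.
class RegexSplitDemo {

    // A comma is a separator unless a ")" follows it before any other parenthesis.
    private static final String SEPARATOR = ",(?![^()]*\\))";

    static List<String> splitOutsideParens(String text) {
        return Arrays.stream(text.split(SEPARATOR))
                .map(String::trim)  // drop the spaces left around each piece
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String part : splitOutsideParens("life, would, (last , if ), all")) {
            System.out.println("-" + part);
        }
    }
}
```

As the disclaimer says, this only works for one level of parentheses: in an input such as `f(b, g(c), d)` the comma after `b` is followed by `(` before any `)`, so the look-ahead does not protect it and the string is split there.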
alpha bravo
  • I have tested nested parentheses and the regex seems to fail. Is there a way to overcome this? – Bwire Aug 13 '15 at 16:27
  • @Bwire, hence my disclaimer above about nested parentheses. To answer your question: not with Java. You need a PCRE engine with recursion ([example](https://regex101.com/r/yW4aZ3/277)). – alpha bravo Aug 13 '15 at 20:51