The input is a comma-separated list of fields.

Here is an example.

tna,performance,ma[performance,3],price

The issue is that some of the "fields" have parameters specified in square brackets and those parameters also have commas.

What regex could I use to split a string like that on commas, but only when they are outside of brackets? I want the end result to be:

tna
performance
ma[performance,3]
price
Anirudha
user1044169
  • Just a thought: Square brackets are a special character in a regex. Your regex will be easier to read and maintain if you convert the square brackets to angle brackets before the processing, then convert them back to square brackets after processing (if needed). There'd be a performance hit, however. – JDB Sep 06 '12 at 17:51
  • possible duplicate of [How to split string by ',' unless ',' is within brackets using Regex?](http://stackoverflow.com/questions/732029/how-to-split-string-by-unless-is-within-brackets-using-regex) – tripleee Sep 06 '12 at 18:53

2 Answers

This is what you need (it relies on a variable-length lookbehind, which .NET supports):

(?<!\[[\w,]*?),

If brackets can be nested within brackets, use this instead, because the pattern above would fail in that scenario:

(?<!\[[\w,]*?),(?![\w,]*?\])

works here
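
For readers outside .NET, here is a minimal Python sketch of the same idea. It is not the answer's pattern: Python's `re` module does not accept the variable-length lookbehind, so the sketch assumes a lookahead-only variant instead, plus a plain depth-counting loop as a fallback for nested brackets. The names `split_fields` and `split_fields_nested` are made up for illustration.

```python
import re

# Split on a comma unless a "]" follows before any other bracket,
# i.e. unless the comma appears to sit inside an open bracket pair.
OUTSIDE_BRACKETS = re.compile(r",(?![^\[\]]*\])")


def split_fields(text):
    """Regex split; assumes brackets are balanced and not nested."""
    return OUTSIDE_BRACKETS.split(text)


def split_fields_nested(text):
    """Depth-counting split that also copes with nested brackets."""
    fields, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


if __name__ == "__main__":
    sample = "tna,performance,ma[performance,3],price"
    for field in split_fields(sample):
        print(field)
```

The regex version is enough for input like the sample in the question. The loop is the safer choice if a field such as `f[a,g[1]]` can occur, since the lookahead alone would split that one in the wrong place.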

nbanic
Anirudha
  • Few languages support `*` and `+` inside look-behinds though. – Bart Kiers Sep 06 '12 at 18:07
  • Yes, .NET supports it. Perl also, I believe, but only in more recent versions of Perl, if I'm not mistaken. Ah, wait, I just saw the OP mentioned using .NET. – Bart Kiers Sep 06 '12 at 18:13
  • @BartKiers .NET's regex is based on Perl's, so maybe there is hardly any difference between the two. – Anirudha Sep 06 '12 at 18:18
  • Many regex implementations mention they're PCRE (Perl Compatible Regular Expressions), but pretty much all of them, if not all (!), differ slightly in syntax and/or functionality from Perl's. – Bart Kiers Sep 06 '12 at 18:20
  • I may be a bit biased, but .NET seems to have one of the more powerful and feature-rich regex implementations. Balanced grouping is one of my favorites. – JDB Sep 06 '12 at 19:04
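
To illustrate the portability point raised in these comments, here is a small Python check, offered as a sketch rather than a survey of engines. It only shows how Python's standard `re` module treats the pattern from the answer; the helper name `compiles` is hypothetical.

```python
import re

# Pattern from the answer; the lookbehind contains `*?`, so its width varies.
DOTNET_STYLE = r"(?<!\[[\w,]*?),"

# A lookahead-only alternative (an assumption of this sketch, not the answer's pattern).
LOOKAHEAD_ONLY = r",(?![^\[\]]*\])"


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    print("variable-length lookbehind accepted:", compiles(DOTNET_STYLE))
    print("lookahead-only variant accepted:", compiles(LOOKAHEAD_ONLY))
```

Python's `re` rejects the first pattern because it requires look-behinds to have a fixed width, so the answer's regex cannot be copied as-is into that engine.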
Try this:

"[a-z0-9]*(\\[[a-z0-9\\[\\],]+\\])*"
sandyiscool
  • Close. The zero-or-more match gives you a lot of empty results in your Match collection, and the character class can match anywhere (including inside of the brackets). Try this: `(?<=^|,)[a-z]+(\[[^\[\]]+\])*` – JDB Sep 06 '12 at 17:47
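
The matching approach (collect the fields rather than split on the separators) can be sketched in Python as follows. This is an adaptation, not the commenter's exact pattern: as far as I know Python's `re` will not take `(?<=^|,)` because the alternatives differ in width, so the anchor is rewritten as `(?:^|(?<=,))`, and digits are allowed in field names as in the answer. `match_fields` and `loose_matches` are illustrative names.

```python
import re

# Field = name, optionally followed by bracketed parameter groups,
# starting at the beginning of the string or right after a comma.
FIELD = re.compile(r"(?:^|(?<=,))[a-z0-9]+(?:\[[^\[\]]+\])*")

# The answer's pattern with the string-literal escaping removed.
LOOSE = re.compile(r"[a-z0-9]*(\[[a-z0-9\[\],]+\])*")


def match_fields(text):
    """Return each field matched by the anchored pattern."""
    return [m.group(0) for m in FIELD.finditer(text)]


def loose_matches(text):
    """Return every match of the answer's pattern, empty ones included."""
    return [m.group(0) for m in LOOSE.finditer(text)]


if __name__ == "__main__":
    sample = "tna,performance,ma[performance,3],price"
    print(match_fields(sample))
    print(loose_matches(sample))
```

Running both on the sample shows the difference the comment describes: the zero-or-more version also reports empty matches at the commas, while the anchored one-or-more version returns only the four fields.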