C# Regex Split - commas outside quotes

var result = Regex.Split(samplestring, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

I have trouble understanding how it works.

Specifically, I don't know what the marked * applies to here:

",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)")
                         ^

Does it mean zero or more repetitions of the group (?:[^\"]*\"[^\"]*\")?

Update: here is a sample input row:

2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,"Corvallis, OR",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34

Use the following code to test:

string samplestring = "2,1016,7/31/2008 14:22,Geoff Dalgas,6/5/2011 22:21,http://stackoverflow.com,\"Corvallis, OR\",7679,351,81,b437f461b3fd27387c5d8ab47a293d35,34";
q0987
  • Your sample data has double-quotes and your regex matches single-quotes. Is that correct? – Greg Jul 01 '11 at 04:53

2 Answers

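It means that the group (?:[^']*'[^']*') is matched zero or more times. Each repetition of the group consumes one pair of quotes, so the lookahead succeeds only when an even number of quotes remains between the comma and the end of the string, i.e. when the comma is outside a quoted field.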

,       // match one comma
(?=     // Start a positive lookAHEAD assertion
(?:     // Start a non-capturing group
[^']*   // Match everything but a single-quote zero or more times
'       // Match one single-quote
[^']*   // Match everything but a single-quote zero or more times
'       // Match one single-quote
)*      // End group and match it zero or more times
[^']*   // Match everything but a single-quote zero or more times
$       // Match the end of the string
)       // End lookAHEAD
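
To see the annotated pattern in action, here is a minimal sketch. It assumes the double-quote variant of the regex (so that it fits the sample row from the question) and a shortened sample string; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteAwareSplit
{
    // Hypothetical helper: splits on every comma that is followed by an
    // even number of double quotes up to the end of the string.
    public static string[] SplitOutsideQuotes(string line)
    {
        // Verbatim string: "" stands for one literal double quote.
        return Regex.Split(line, @",(?=(?:[^""]*""[^""]*"")*[^""]*$)");
    }

    public static void Main()
    {
        // Shortened version of the sample row (an assumption for brevity).
        string sample = "2,1016,\"Corvallis, OR\",7679";

        foreach (string field in SplitOutsideQuotes(sample))
        {
            Console.WriteLine(field);
        }
    }
}
```

The comma inside "Corvallis, OR" is followed by a single quote (an odd count), so the lookahead fails there and the field stays in one piece. Note that the surrounding quotes are kept in the field; the regex only decides where to split.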
Greg
  • @Greg, based on your comments, why doesn't the following work: ",(?=([^']*'[^']*')*[^']*$)"? Why do we have to use a passive group? – q0987 Jul 01 '11 at 04:30
  • @q0987 - What is your input? I ran some quick tests and it seemed to work the same as both a passive group and a regular group. – Greg Jul 01 '11 at 04:45
  • @Greg, I have added a sample data row. It only works for me if we use a passive group – q0987 Jul 01 '11 at 04:47
  • @q0987 - I used http://gskinner.com/RegExr/ to test against your sample data. Both capturing and passive groups had the same result. Maybe .NET treats them differently than flash. Or maybe I'm missing something. – Greg Jul 01 '11 at 04:56
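
A possible explanation for the difference discussed in the comments: .NET's Regex.Split includes any text captured by capturing groups in the returned array, so a capturing group inside the lookahead adds extra elements, while a passive (non-capturing) group does not. The sketch below is only an illustration; it assumes the double-quote variant of the pattern and a shortened sample string, and the class name is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupKindDemo
{
    // Same pattern twice; only the inner group differs: (?: ... ) versus ( ... ).
    public const string NonCapturing = @",(?=(?:[^""]*""[^""]*"")*[^""]*$)";
    public const string Capturing = @",(?=([^""]*""[^""]*"")*[^""]*$)";

    public static void Main()
    {
        // Shortened version of the sample row (an assumption for brevity).
        string sample = "2,1016,\"Corvallis, OR\",7679";

        string[] passive = Regex.Split(sample, NonCapturing);
        string[] capturing = Regex.Split(sample, Capturing);

        Console.WriteLine("passive group:   " + passive.Length + " elements");
        Console.WriteLine("capturing group: " + capturing.Length + " elements");

        // The extra elements are the captured text, not real fields.
        foreach (string part in capturing)
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

If the capturing form is preferred for readability, passing RegexOptions.ExplicitCapture to Regex.Split should have the same effect as writing (?: ... ), since unnamed groups then no longer capture.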

You can check your regex and run your tests on this website:

http://www.annuaire-info.com/outil-referencement/expression-reguliere/

;) have fun

Thiib