
I have this regular expression:

((?<=\")[^\"]*(?=\"(,|$)+)|(?<=,|^)[^,\"]*(?=,|$))

and this is the string I am trying to parse:

"Private",N,,"Gas,Meter."

This works properly: I get Private, N, an empty field, and Gas,Meter.

but

"Private",N,,",Gas,Meter."

will give me

Private, ",N,," and then ",Gas,Meter."

The regular expression works when a comma appears in the middle of a quoted field, but not when the comma comes right after the opening quote.

Any ideas?

ctwheels
Jason
  • Seems like you're trying to parse a CSV file/string. What programming language are you using? Most languages have a CSV parser in their standard library. Is there a specific reason you're trying to do this with regex? – 3limin4t0r Nov 27 '19 at 16:25
  • It's unclear how you are hoping for this wreck of a regex to work. *Probably* you would be better off simply with `(?:"([^"]*)"|([^",]*))` but we can't guess which regex dialect you are using or what you are hoping to achieve. See also the [Stack Overflow `regex` tag info page](/tags/regex/info) for guidance on how to ask a proper question and how to tag it correctly. – tripleee Nov 27 '19 at 16:29
  • Could you use a CSV parser instead Jason? I wonder if that would be more robust. – halfer Nov 27 '19 at 16:30
  • If I find one that does not use the Visual Basic DLL, I will definitely use it. – Jason Nov 27 '19 at 16:32
  • What language/environment are you using? – halfer Nov 27 '19 at 16:34
  • C# .NET. I don't think there is one built in; the one I know of uses the Visual Basic DLL. On the other hand, there are some on GitHub, so I'm just wondering which one you guys recommend. – Jason Nov 27 '19 at 16:37
  • Possible duplicate of [C#, regular expressions : how to parse comma-separated values, where some values might be quoted strings themselves containing commas](https://stackoverflow.com/questions/1189416/c-regular-expressions-how-to-parse-comma-separated-values-where-some-values) – ctwheels Nov 27 '19 at 16:41

1 Answer


The general issue with the regular expression is that the first alternative only requires a quote immediately before the match, then greedily consumes non-quote characters until it reaches `",` or a quote at the end of the string. The lookbehind cannot tell an opening quote from a closing one, so the closing quote after Private also satisfies it, and the match runs through `,N,,` until it hits the `",` in front of Gas. I think switching the two alternatives would instead cause issues with commas inside quotes. You should probably just replace this regular expression with the one in this answer:

https://stackoverflow.com/a/42535295/2710988
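
If a regex turns out to be too fragile, another option is to track explicitly whether the current position is inside a quoted field, which is the one thing a lookbehind cannot remember. Here is a minimal sketch of that idea in Python (the question is about C#, so treat it as something to port rather than drop in). `split_fields` is a hypothetical helper name, and the sketch assumes fields never contain escaped quotes (`""`) or embedded newlines. It is cross-checked against Python's standard `csv` module on the two strings from the question.

```python
import csv


def split_fields(line):
    """Split one CSV line on commas that are outside double quotes.

    Assumption: no escaped quotes ("") and no newlines inside fields.
    """
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == '"':
            # A quote only toggles the state; it is not part of the field.
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            # An unquoted comma ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    # The last field has no trailing comma, so flush it here.
    fields.append("".join(current))
    return fields


if __name__ == "__main__":
    for line in ('"Private",N,,"Gas,Meter."', '"Private",N,,",Gas,Meter."'):
        print(split_fields(line))
        # A real CSV parser should agree on these inputs.
        print(next(csv.reader([line])))
```

For real data, a proper CSV parser is probably still the safer choice, since it also handles escaped quotes and multi-line fields.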

Brandon Barkley