I have a comma-separated log entry of mixed fields from which I would like to extract the 9th field ("-"), ideally without the surrounding double quotes (so only - remains):

Home_TE,-2.8,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,MIL_TT

Does anyone have a pure regex solution for this?

Cœur
MimmoFu
  • Which regex engine? What have you tried so far, and what trouble did you run into? Do we have to take into account the possibility that (the first nine) quoted field values themselves may contain commas and/or escaped quote characters? – Ruud Helderman Mar 23 '16 at 14:19
  • Pure RegEx. The 9th field will most probably be a number, and the previous ones will each be either a string or a number, but I can't be sure whether, for instance, a string is expected in the 4th or 5th field. Since this is a log entry, I don't believe a comma would be part of an expected value. – MimmoFu Mar 23 '16 at 14:28
  • This is what I tried: ([^,]*,){9}, but with it I get "-", (quotes and trailing comma included). – MimmoFu Mar 23 '16 at 14:41
  • "Pure regex" is a non-statement. There are [dialects](http://stackoverflow.com/questions/2298007/why-are-there-so-many-different-regular-expression-dialects), and there are differences in how the 'environment' (a programming language? A text editor? The rewrite module of a webserver? The filter module of an e-mail server?) handles capturing subpatterns. Please supply details about the situation you are in, to avoid falling into an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – Ruud Helderman Mar 23 '16 at 15:42

1 Answer

In its simplest form:

^(?:(?:[^,]*,){8})"?([^,"]*)

The capturing subpattern, ([^,"]*), captures the ninth field, stripped of its double quotes.
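
As a rough illustration, here is how the pattern could be applied to the sample line from the question. Python is used only as an assumption, since the question does not name a regex engine; the variable names are made up for this sketch.

```python
import re

# Pattern from the answer: skip eight comma-terminated fields, consume an
# optional opening quote, then capture everything up to the next comma or quote.
NINTH_FIELD = re.compile(r'^(?:(?:[^,]*,){8})"?([^,"]*)')

# Sample log line taken from the question.
line = 'Home_TE,-2.8,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,MIL_TT'

match = NINTH_FIELD.match(line)
# Group 1 holds the ninth field without its quotes; None if the line is too short.
ninth = match.group(1) if match else None
print(ninth)  # prints: -
```

The closing quote is simply left unmatched: the capture stops in front of it, which is enough when only group 1 is needed.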

To match multiple lines in one go, add the modifiers m (multiline, so that ^ anchors at the start of every line) and g (global, so that every match is returned rather than only the first).
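
A sketch of the multi-line case, again assuming Python: it has no g modifier, so re.findall plays that role here, while re.MULTILINE corresponds to m. The second log line is invented for illustration. Note that [^,] can also match a newline, so this sketch assumes every line really has at least nine fields.

```python
import re

# Same pattern as above; re.MULTILINE makes ^ match at the start of each line.
NINTH_FIELD_M = re.compile(r'^(?:(?:[^,]*,){8})"?([^,"]*)', re.MULTILINE)

# First line is from the question; the second line is a made-up example.
log = (
    'Home_TE,-2.8,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,MIL_TT\n'
    'Away_TE,3.1,0,"x",-,-,-,2,"42",-,-,-,"-",0,-,"-","-",-,-,ROM_TT\n'
)

# findall returns group 1 of every match, i.e. the ninth field of each line.
ninth_fields = NINTH_FIELD_M.findall(log)
print(ninth_fields)  # prints: ['-', '42']
```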

Note that this will fail when commas are embedded within any of the first nine fields.
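
To make the caveat concrete, the following sketch runs the pattern on a made-up line whose second field contains a quoted comma. For contrast it also parses the same line with Python's csv module, which is a different technique from the regex in this answer and is shown only to indicate what the correct ninth field would be.

```python
import csv
import re

NINTH_FIELD = re.compile(r'^(?:(?:[^,]*,){8})"?([^,"]*)')

# Hypothetical line: the second field, "a,b", contains an embedded comma.
tricky = 'Home_TE,"a,b",1,"-",-,-,-,1,"9th",-'

# The regex counts the embedded comma as a separator and lands one field early.
regex_result = NINTH_FIELD.match(tricky).group(1)

# A quote-aware CSV parser keeps "a,b" together as a single field.
csv_result = next(csv.reader([tricky]))[8]

print(regex_result)  # prints: 1
print(csv_result)    # prints: 9th
```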

Demo: https://regex101.com/r/gM8mO5/1

Ruud Helderman