
I am trying to replace all commas that fall between quotes in a string, using Notepad++.

I almost got there with the post "regex to remove comma between double quotes notepad++", but not quite: as it stands, it only replaces the first comma in each quoted part.

To summarize, that post uses Find What: ("[^",]+),([^"]+") and Replace: \1\2 (which I changed to \1~\2).
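The one-comma-per-pass behaviour can be reproduced outside Notepad++. What follows is a minimal sketch in Python, assuming Python's re module treats this simple pattern the same way Notepad++'s engine does; the helper names replace_once and replace_until_stable are hypothetical. Each match swallows a whole quoted field but contains only one literal comma, so a single pass fixes at most one comma per field. Repeating the replacement until nothing changes is one possible workaround.

```python
import re

# Pattern from the linked post. Python's re is a stand-in for
# Notepad++'s regex engine here (an assumption for illustration).
PATTERN = re.compile(r'("[^",]+),([^"]+")')


def replace_once(line: str) -> str:
    """Apply the find/replace a single time, like one 'Replace All'."""
    return PATTERN.sub(r'\1~\2', line)


def replace_until_stable(line: str) -> str:
    """Repeat the replacement until no quoted comma is left."""
    while True:
        new_line = replace_once(line)
        if new_line == line:
            return new_line
        line = new_line


# Shortened version of one of the representative rows.
sample = '3,13247,"PG 0,251 to 0,500 inch",P U7'
print(replace_once(sample))          # only the first quoted comma changes
print(replace_until_stable(sample))  # both quoted commas change
```

In Notepad++ itself, the equivalent of replace_until_stable would be pressing Replace All repeatedly until it reports zero replacements.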

Is there a way in regex to catch all instances of the comma between the quotes?

Edit: here are a few representative strings:

1,G 32,170696,01/06/2015,Jun-17,"M12 X 1,50 - 4H GO SRG",P U7,,,,SRG ,"G 32_170696_06-2017_M12 X 1,50 - 4H GO SRG_P U7.pdf"
3,13247,163090,01/11/2015,Nov-17,"PG 0,251 to 0,500 inch",P U7,,,,,"13247_163090_11-2017_PPG 0,251 to 0,500 inch_P U7.pdf"
9,PI 1496,182411,01/04/2015,Apr-17,"6,000 - 6,018mm GO-NOGO  PPG",,,,,PPG,"PI 1496_182411_04-2017_6,000 - 6,018mm GO-NOGO  PPG.pdf"
CoderWolf

1 Answer


You can do it in one pass using this pattern:

(?:\G(?!^)|([^"]*(?:"[^,"]*"[^"]*)*"))[^",]*\K,([^",]*+(?:"(?1)|$))?

with this replacement: ~\2

demo

details:

(?:
    \G(?!^) # contiguous to a previous match
  | # OR
    ([^"]*(?:"[^,"]*"[^"]*)*") # capture group 1: first match
                               # reach the first quoted part with commas
)
[^",]* \K ,  #"#
( # capture group 2: succeeds after the last comma
    [^",]*+  #"#
    (?:
        " (?1) #"# reach the next quoted part with commas
               # (using the capture group 1 subpattern)
      | # OR
        $ # end of the string (this sets a default behavior for when
          # the closing quote is missing)
    )
)?
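The pattern above relies on PCRE-style features (\G, \K and the (?1) subpattern call) that Python's standard re module does not provide, so it cannot be tested there as written. For comparison only, here is a sketch of a different technique that aims at the same result outside Notepad++: match each quoted field as a whole and replace the commas inside it with a callback. The function name replace_quoted_commas and the "~" default are hypothetical, and the sketch assumes that quotes are balanced on every line.

```python
import re

# A quoted field: an opening quote, anything but quotes, a closing quote.
# Assumes quotes are balanced on each line (illustrative assumption).
QUOTED_FIELD = re.compile(r'"[^"]*"')


def replace_quoted_commas(line: str, replacement: str = "~") -> str:
    """Replace every comma inside double-quoted fields; leave the rest."""
    return QUOTED_FIELD.sub(
        lambda m: m.group(0).replace(",", replacement),
        line,
    )


rows = [
    '3,13247,"PG 0,251 to 0,500 inch",P U7,,,,,"a_0,251 to 0,500.pdf"',
    '246|"Test, Test, Test",456,789,One',
]
for row in rows:
    print(replace_quoted_commas(row))
```

Because only the text between a pair of quotes is ever handed to the callback, commas that follow the last closing quote are left alone.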
Casimir et Hippolyte
  • Wow, nice bit of regex. It seems to strip out the commas after the last " though, at least for me. I struggled to follow it a little, so I haven't managed to fix it myself. I tested on this data: 123|"Test, Test, Test",456,789,"One, Two" 246|"Test, Test, Test",456,789,One – Futile32 Jan 21 '17 at 01:14