Note:
In PowerShell (Core) 7+ the Export-Csv
and ConvertTo-Csv
cmdlets now have -UseQuotes
and -QuoteFields
parameters, which would allow you to create your CSV data without double-quoted fields to begin with; e.g.:
... | Export-Csv out.csv -UseQuotes Never
In Windows PowerShell these cmdlets invariably output double-quoted fields, necessitating manual post-processing, as shown below.
Try the following:
... | % { $_ -replace '(?<=^|",)"|"(?=,"|$)' -replace '""', '"' }
Note:
The assumption is that fields with embedded "
values do not also contain ,
, because the removal of the enclosing double quotes would then break the partitioning of the line into fields.
The extra -replace
operation, -replace '""', '"'
unescapes the field-embedded "
chars.
Note that a well-formed CSV requires "
chars. inside double-quoted fields to be escaped as ""
.
Export-Csv
/ ConvertTo-Csv
do perform this escaping.
However, Import-Csv
/ ConvertFrom-Csv
also tolerate input without this kind of escaping.
Either way, the above command works: if the embedded "
chars. aren't escaped as ""
, -replace '""', '"'
is simply a no-op.
A simple, self-contained example:
@'
"foo","bar"
"baz","qu""ux"
'@ -split '\r?\n' |
% { $_ -replace '(?<=^|",)"|"(?=,"|$)' -replace '""', '"' }
Output (only the field internal "
was retained):
foo,bar
baz,qu"ux
For an explanation of the regex and the ability to experiment with it, see this regex101.com page - note that, because the linked page uses C# string-literal syntax, the "
chars. in the regex are doubled.
Note that you can speed up processing considerably if you load your entire CSV file as a whole into memory and modify your regex with the Multiline
regex option ((?m)
), which then causes ^
and $
to match the start and end of each line:
(Get-Content -Raw your.csv) -replace '(?m)(?<=^|",)"|"(?=,"|$)' -replace '""', '"'