I have CSV files with newlines inside fields. I would like to remove those newlines without removing the newline at the end of each row.
Each row ends with a closing double quote, like so:
...;"25.33"\n
To remove the newlines within the fields, I am trying to remove every newline that is not preceded by a double quote. The regular expression for that would be: [^"]\n
And in sed:
sed -i -E "s/[^"]\n/ /g" *.csv
# a newline not following a double quote
I get a complaint from the shell:
➜ sed -i -E "s/[^"]\n/ /g" *.csv
dquote>
Obviously I have to escape the quote within the brackets:
sed -i -E "s/[^\"]\n/ /g" *.csv
But that won't work either:
➜ csv_working_copy1 sed -i -E "s/[^\"]\n/ /g" *.csv
sed: RE error: illegal byte sequence
What am I missing?
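For context, the `illegal byte sequence` message usually appears when sed decodes a file in the current locale and hits bytes that are not valid in it (for example Latin-1 text under a UTF-8 locale). A minimal sketch, assuming that is the cause here: `LC_ALL=C` makes sed treat the input as raw bytes, and single quotes around the script avoid having to escape the inner double quote. The scratch directory and file name are hypothetical.

```shell
# Hypothetical scratch directory and file name, used only for this sketch.
workdir=$(mktemp -d)

# One row containing the byte 0xE9 (Latin-1 "é"), which is not valid UTF-8 on its own.
printf '"caf\351";"25.33"\n' > "$workdir/latin1.csv"

# Single quotes around the script: the inner double quote needs no escaping.
# LC_ALL=C: sed treats the input as raw bytes instead of decoding it.
# No -i here, so the file itself is left untouched.
LC_ALL=C sed -E 's/[^"]\n/ /g' "$workdir/latin1.csv" > "$workdir/latin1.out"

# Compare result and input; cmp -s exits with 0 when they are byte-identical.
cmp -s "$workdir/latin1.csv" "$workdir/latin1.out" && echo "output unchanged"
```

The command runs, but the output is identical to the input: sed applies the expression to one line at a time with the trailing newline already stripped, so `\n` has nothing to match.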
Example
This is an example row:
"2019-03-17";"Comment \n
with newline within it";"23.88"\n
I would like to have this output:
"2019-03-17";"Comment with newline within it";"23.88"\n