sed: Replace double quote won't work in Terminal

Question

I have CSV files with newlines within fields. I would like to remove those newlines without removing the newline at the end of each row.

Each row ends with a closing double quote, like so:

...;"25.33"\n

So in order to remove the newlines within the fields I try to remove every newline that is not preceded by a double quote. The regular expression for that would be: [^"]\n

And in sed:

sed -i -E "s/[^"]\n/ /g" *.csv # a newline not following a double quote

I get a complaint in bash:

➜ sed -i -E "s/[^"]\n/ /g" *.csv
dquote>

Obviously I have to escape the quote within the brackets:

sed -i -E "s/[^\"]\n/ /g" *.csv

But that won't work either:

➜  csv_working_copy1 sed -i -E "s/[^\"]\n/ /g" *.csv
sed: RE error: illegal byte sequence

What am I missing?

Example

This is an example row

"2019-03-17";"Comment \n
with newline within it";"23.88"\n

I would like to have this output

"2019-03-17";"Comment with newline within it";"23.88"\n

Please add sample input and your desired output for that sample input to your question. — Cyrus, Mar 10 '19 at 08:56
@Ugur, could you please add more sample lines? Can more than 2 lines need to be joined into a single line? Please confirm. — RavinderSingh13, Mar 10 '19 at 09:40

monok · Accepted Answer · 2019-03-10T17:20:15.930

Use single quotes instead of the outer double quotes, so the shell leaves the inner double quote alone:

sed -i -E 's/[^"]\n/ /g' *.csv

edited Mar 10 '19 at 17:20

answered Mar 10 '19 at 09:00

monok


Tried that as well, but I get the same error: `➜ sed -i -E 's/[^\"]\n/ /g' *.csv sed: RE error: illegal byte sequence` Maybe I need to add that I use a Mac + iTerm? – Ugur Mar 10 '19 at 09:07
Don't escape the " with \", please see my example. – monok Mar 10 '19 at 09:12
I am sorry. You are right. But still the same error :( `sed -i -E 's/[^"]\n/ /g' *.csv sed: RE error: illegal byte sequence` – Ugur Mar 10 '19 at 09:15
Obviously the error resulted from the content of the file. It's a Mac specific error. I should have googled "sed: RE error: illegal byte sequence". This would have led me to https://stackoverflow.com/questions/19242275/re-error-illegal-byte-sequence-on-mac-os-x – Ugur Mar 10 '19 at 11:00
No need for -i here; the command does nothing. sed reads its input line by line, so the pattern never sees \n! – ctac_ Mar 10 '19 at 11:48
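
To make the locale point from the comments concrete: a common workaround for "RE error: illegal byte sequence" is to run sed under the C locale, so that it treats the input as plain bytes rather than decoding it as UTF-8. The sketch below is only an illustration under assumptions: the sample file and its stray byte are made up, and the substitution is a placeholder. It does not join any lines. Note also that BSD sed on macOS expects a backup suffix right after -i, so `-i -E` would take `-E` as that suffix; the sketch writes to stdout instead.

```shell
# Hypothetical sample: one line containing a byte (octal 351) that is not valid UTF-8.
sample=$(mktemp)
printf '"caf\351";"1.50"\n' > "$sample"

# LC_ALL=C makes sed treat the input as plain bytes instead of decoding it as UTF-8.
# No -i here: the result goes to stdout, so the original file stays untouched.
line_count=$(LC_ALL=C sed -E 's/;/,/g' "$sample" | wc -l)
echo "lines processed: $((line_count))"
rm -f "$sample"
```

If the files are known to be in a specific encoding, converting them to UTF-8 first may be the cleaner route; the C locale simply sidesteps the decoding step.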

James Brown · Answer 2 · 2019-03-10T11:19:40.633

Here is an awk that should handle it:

$ awk -v RS="^$" '{            # read the whole file in as a single record
    for(i=1;i<=length;i++) {   # iterate over the file one char at a time
        c=substr($0,i,1)       # read char
        if(c=="\"")            # if it is a quote
            f=!f               # ... flag up, or down if already up
        if(c=="\n" && f)       # if it is a newline and the flag is up, i.e. within quotes
            c=""               # replace newline with an empty string
        printf "%s",c          # print char
    }
}' file

Output with the sample:

"2019-03-17";"Comment \nwith newline within it";"23.88"\n

More records:

$ awk ... file file file
"2019-03-17";"Comment \nwith newline within it";"23.88"\n
"2019-03-17";"Comment \nwith newline within it";"23.88"\n
"2019-03-17";"Comment \nwith newline within it";"23.88"\n

It won't tolerate any quote problems, naturally.

Update: Another shorter solution:

$ awk '{if((c+=gsub(/"/,"&"))%2==0)print;else printf "%s",$0}' file

Explained:

$ awk '{
    if((c+=gsub(/"/,"&"))%2==0)  # keep count of quotes, if count is even:
        print                    # print with newline
    else                         # else
        printf "%s",$0           # omit newline
}'

James, thank you very much. That is a very good solution. But I was wondering how I could do it with `sed` — Ugur, Mar 10 '19 at 11:01
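
For the sed question in the comment above, one possible sketch is a loop built on the `N` command, which appends the next input line to the pattern space so that `\n` actually has something to match. It relies on the same assumption as the question (a complete row always ends in a double quote), so a field whose embedded newline directly follows a quote would be split wrongly. The sample file is hypothetical and mirrors the row from the question.

```shell
# Hypothetical sample file mirroring the row from the question.
sample=$(mktemp)
printf '%s\n' '"2019-03-17";"Comment ' 'with newline within it";"23.88"' > "$sample"

# :a      label for the loop
# /"$/!{  only for lines that do not end in a double quote:
# $!N       append the next input line (skipped on the last line)
# s/\n//    delete the embedded newline
# $!ba      go back to :a unless the last line has been read
joined=$(sed -e ':a' -e '/"$/!{' -e '$!N' -e 's/\n//' -e '$!ba' -e '}' "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

The newline is deleted rather than replaced because the sample already has a space before it; use `s/\n/ /` if a separating space is wanted. Splitting the script across several `-e` options is meant to keep it acceptable to both GNU and BSD sed.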

score 0 · Answer 3 · answered Mar 10 '19 at 11:35

Another awk:

awk '!($0~"\"$"){a=a$0;next}{$0=a $0;a=""}1' infile
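
Since the one-liner comes without an explanation, here is the same logic spread over several lines with comments. This is a sketch for reading purposes: the variable `a` is renamed to `buf`, and the sample file is hypothetical. Like the original, it assumes that a complete row always ends in a double quote.

```shell
# Hypothetical sample: one row split across two lines, followed by a normal row.
sample=$(mktemp)
printf '%s\n' '"2019-03-17";"Comment ' 'with newline within it";"23.88"' '"2019-03-18";"plain";"1.00"' > "$sample"

merged=$(awk '
    !($0 ~ "\"$") { buf = buf $0; next }  # no closing quote at end of line: buffer it and read on
    { $0 = buf $0; buf = "" }             # closing quote found: prepend the buffer and reset it
    1                                     # print the assembled row
' "$sample")
printf '%s\n' "$merged"
rm -f "$sample"
```

Lines are joined without a separator, so a space appears only if the broken line already ends in one, as in the question's sample.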

answered Mar 10 '19 at 11:35

ctac_

