Find and Replace with Spaces using Sed Mac Terminal

Question

I have a .CSV file with over 500,000 lines that I need to:

find all 'space double quote space' sequences and replace with nothing
find all 'space double quote' sequences and replace with nothing
find all double quotes and replace with nothing

Example of .CSV line:

"DISH Hartford & New Haven  (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

** Required output**

DISH Hartford & New Haven  (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

I need to remove all double quotes (") and spaces in front of and behind the commas (,).

I've tried

$ cd /Users/Leonna/Downloads/
$ cat bs-B2Bformat.csv | sed s/ " //g

This gives me the 'command incomplete' greater than prompt, so I then tried:

$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$

There are too many lines for me to edit in Excel (Excel won't load all the lines) or even a text editor. How can I fix this?

General comment — enclose `sed` scripts in single quotes. For example: `sed 's/ " //g'`. The only time you don't do that is … perhaps … when the command itself needs to contain single quotes, though then you're usually better off writing each literal single quote as `'\''`, so you might write `sed 's/ '\'' //g'`. — Jonathan Leffler, Sep 17 '13 at 14:53

score 13 · Answer 1 · edited May 23 '17 at 12:09

Quoted from here:

For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension.

Based on that, I would suggest the following, which, as Jonathan Leffler pointed out, is portable across GNU and BSD implementations.

sed -E 's/[[:space:]]?"[[:space:]]?//g' <path/to/file>

The -E flag enables extended regular expressions on BSD implementations. On GNU sed it is undocumented, but as discussed here, it enables compatibility with the BSD standard.

Quoted from the manual for BSD sed:

-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's).

Applying the above command on a file containing the following single line

"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

it yields

DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

That relies on GNU `sed`. The question mentions Mac. With Mac `sed`, you must specify a suffix for the backup (`-i.bak` for example). Also, Mac `sed` does not interpret EREs by default; you have to enable them with `-E`. The backslashes in front of the question marks aren't necessary, either. So, on Mac, this works: `sed -E 's/[[:space:]]?"[[:space:]]?//g` (with both BSD and GNU `sed` commands) — obviously without overwriting a file but you can add the `-i.bak` if you want it. — Jonathan Leffler, Sep 17 '13 at 14:49
This is absolutely the right answer. \s will not work on OSX. — HashHazard, Mar 03 '15 at 05:15

iamauser · Answer 2 · 2013-09-17T14:48:00.390

1

This works for me. Is this what you want ?

 sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g' file.csv

 echo '"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"' | sed -e 's|", "|,|g' -e 's|^"||g' -e 's|"$||g'

 DISH Hartford & New Haven (Hartford),206,FBNHD, 06028, East Windsor Hill,CT,Hartford County

edited Sep 17 '13 at 14:48

answered Sep 17 '13 at 02:25

iamauser

11,119
5
34
52

score 0 · Answer 3 · answered Sep 17 '13 at 02:17

0

This should do it:

sed -i 's/\(\s\|\)"\(\|\s\)//g' bs-B2Bformat.csv

answered Sep 17 '13 at 02:17

Shylo Hana

1,792
13
10

We are on the Mac Track, so there is no `\s`. Minus one. – Timo Jul 07 '22 at 10:18

score 0 · Answer 4 · answered Sep 17 '13 at 21:07

One way is to use python and its csv module:

import csv 
import sys 

## Open file provided as argument.
with open(sys.argv[1], 'r') as f:

    ## Create the csv reader and writer. Avoid to quote fields in output.
    reader = csv.reader(f, skipinitialspace=True)
    writer = csv.writer(sys.stdout, quoting=csv.QUOTE_NONE, escapechar='\\')

    ## Read file line by line, remove leading and trailing white spaces and
    ## print.
    for row in reader:
        row = [field.strip() for field in row]
        writer.writerow(row)

Run it like:

python3 script.py csvfile

That yields:

DISH Hartford & New Haven  (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

score 0 · Answer 5 · answered Jun 01 '15 at 05:59

What all of the current answers seemed to miss:

$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$

The problem in the above is missing single quotes. It should have been:

$ cat bs-B2Bformat.csv | sed 's/ " //g'
                             ^        ^

Without the single quotes, bash splits at the spaces and sends three separate arguments (well at least for the case of \"). sed was seeing its first argument as just s/.

Edit: FYI, single quotes are not required, they just make this case easier. If you want to use double quotes, just escape the one you want to keep for matching:

$ cat bs-B2Bformat.csv | sed "s/ \" //g"

I do not understand your answer, the others use single quotes. — Timo, Jul 07 '22 at 10:17

Find and Replace with Spaces using Sed Mac Terminal

5 Answers5