sed - how to replace a string only on the matching part of the lines in a csv file?

Question

I have a csv file that contains below lines:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,"-30,448,276","-30,448,456","-30,448,239"

I'd like to remove the double quotes and commas from all the quoted columns so that the result will be something like below:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,-30448276,-30448456,-30448239

I have managed to be able to locate the parts on which I want to remove the comma using below command, but I couldn't figure out how to do s/,//g and s/"//g on \1.

sed 's/\("[-,0-9]*"\)/#\1#/g' 1.txt

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,#"-30,448,276"#,#"-30,448,456"#,#"-30,448,239"#

Really appreciate if anyone can help here.

Jack

Gilles Quénot · Answer 1 · 2013-02-13T09:06:56.763

1

For this specific task, the shell is limited. An advanced text manipulation language like Perl is more suitable with a CSV parser, see :

my $file = "/path/to/file.csv";

use strict; use warnings;

use feature qw/say/;
use Text::CSV;

my $csv = Text::CSV->new()
    or die "Cannot use CSV: ".Text::CSV->error_diag();

open my $fh, "<:encoding(utf8)", $file
    or die "$file: $!";

while (my $row = $csv->getline($fh)) {
    map { tr/,// } @$row;
    say join ",", @$row;
}

$csv->eof or $csv->error_diag();
close $fh;

If you need to remove commas on particular columns, replace

map { tr/,// } @$row;

by

map { tr/,// } @$row[3..5]; # array slice (columns N-1)

edited Feb 13 '13 at 09:06

answered Feb 12 '13 at 21:49

Gilles Quénot

173,512
41
224
223

yes, perl is definitely an ideal tool for this type of task. However, my environment is solaris 10 and the perl that comes with it doesn't have the Text::CSV or Text::CSV_XS modules. also I don't have control of the server (i.e. manually intalling those modules is not an option). BTW, I also tested the perl code you provided, it works great except that I had to change the "tr/,//" to "s/,//g" before I was able to get it working. – Jack Feb 13 '13 at 15:49
Are you sure ? You can install modules without root access. See http://stackoverflow.com/questions/2980297/how-can-i-use-cpan-as-a-non-root-user – Gilles Quénot Feb 13 '13 at 15:50
Really? I'll try installing the modules later. Thanks a lot for your input, Sputnick! Stackoverlow is such a great place where I can always find a solution to questions that have been troubling me for hours! – Jack Feb 13 '13 at 17:11

score 1 · Accepted Answer · answered Feb 13 '13 at 00:51

1

sed is not appropiate for your job. You could use Perl and the Text::CSV module, but if you have GNU awk you can use the FPAT variable:

awk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS="," } { for (i=1; i<=NF; i++) gsub(/[\",]/,"", $i) }1'

Results:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,-30448276,-30448456,-30448239

answered Feb 13 '13 at 00:51

Steve

51,466
13
89
103

GNU awk is much more powerful than awk/nawk that comes with solaris. although it's not an option for me, still appreciate your answer and I will surely put this into my toolbox for future use. Thanks Steve! – Jack Feb 13 '13 at 17:02

sed - how to replace a string only on the matching part of the lines in a csv file?

2 Answers2