2

I have a csv file that contains below lines:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,"-30,448,276","-30,448,456","-30,448,239"

I'd like to remove the double quotes and commas from all the quoted columns so that the result will be something like below:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,-30448276,-30448456,-30448239

I have managed to be able to locate the parts on which I want to remove the comma using below command, but I couldn't figure out how to do s/,//g and s/"//g on \1.

sed 's/\("[-,0-9]*"\)/#\1#/g' 1.txt

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,#"-30,448,276"#,#"-30,448,456"#,#"-30,448,239"#

Really appreciate if anyone can help here.

Jack

Gilles Quénot
  • 173,512
  • 41
  • 224
  • 223
Jack
  • 45
  • 5

2 Answers2

1

For this specific task, the shell is limited. An advanced text manipulation language like Perl is more suitable with a CSV parser, see :

my $file = "/path/to/file.csv";

use strict; use warnings;

use feature qw/say/;
use Text::CSV;

my $csv = Text::CSV->new()
    or die "Cannot use CSV: ".Text::CSV->error_diag();

open my $fh, "<:encoding(utf8)", $file
    or die "$file: $!";

while (my $row = $csv->getline($fh)) {
    map { tr/,// } @$row;
    say join ",", @$row;
}

$csv->eof or $csv->error_diag();
close $fh;

If you need to remove commas on particular columns, replace

map { tr/,// } @$row;

by

map { tr/,// } @$row[3..5]; # array slice (columns N-1)
Gilles Quénot
  • 173,512
  • 41
  • 224
  • 223
  • yes, perl is definitely an ideal tool for this type of task. However, my environment is solaris 10 and the perl that comes with it doesn't have the Text::CSV or Text::CSV_XS modules. also I don't have control of the server (i.e. manually intalling those modules is not an option). BTW, I also tested the perl code you provided, it works great except that I had to change the "tr/,//" to "s/,//g" before I was able to get it working. – Jack Feb 13 '13 at 15:49
  • Are you sure ? You can install modules without root access. See http://stackoverflow.com/questions/2980297/how-can-i-use-cpan-as-a-non-root-user – Gilles Quénot Feb 13 '13 at 15:50
  • Really? I'll try installing the modules later. Thanks a lot for your input, Sputnick! Stackoverlow is such a great place where I can always find a solution to questions that have been troubling me for hours! – Jack Feb 13 '13 at 17:11
1

sed is not appropiate for your job. You could use Perl and the Text::CSV module, but if you have GNU awk you can use the FPAT variable:

awk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS="," } { for (i=1; i<=NF; i++) gsub(/[\",]/,"", $i) }1'

Results:

23000747,,2015582,-375080.2254,-375080,-375080
23000749,,SA1555,-30448276,-30448456,-30448239
Steve
  • 51,466
  • 13
  • 89
  • 103
  • GNU awk is much more powerful than awk/nawk that comes with solaris. although it's not an option for me, still appreciate your answer and I will surely put this into my toolbox for future use. Thanks Steve! – Jack Feb 13 '13 at 17:02