0

I have some csv files which have been incorrectly exported. Some fields containing numbers have been incorrectly exported as Strings with the decimal place as a comma instead of a dot.

Here's a couple of examples:

"GREEN,"15,4",55,"15,5",64,"17,0","18,9",107,145,21,38, "BLUE","15,1",46,"15,2",51,"11,4","13,5",89,96,25,7,

So, I want to switch the commas to dots for numbers in quotation marks and then remove the quotation marks from all numbers.

Can anyone help me out here please?

vks
  • 67,027
  • 10
  • 91
  • 124
SamS
  • 144
  • 1
  • 1
  • 5

2 Answers2

4

Use a CSV parser, and this becomes easy:

#!/usr/bin/perl
use warnings;
use strict;

use Text::CSV_XS;

my $csv = Text::CSV_XS->new( { binary => 1, auto_diag => 1 } );

while ( my $row = $csv->getline( \*DATA ) ) {
   s/,/./ for @$row;
   $csv->say( \*STDOUT, $row );
}


__DATA__
"GREEN","15,4",55,"15,5",64,"17,0","18,9",107,145,21,38,
"BLUE","15,1",46,"15,2",51,"11,4","13,5",89,96,25,7,

Example above uses inlined data to make it runnable standalone - you'll probably want to use an input and output file instead of DATA and STDOUT.

If you're particularly wanting to keep the quotes around the 'strings' but not the 'numbers', then you can make use of keep_meta_info or just re-add them when they're needed.

Sobrique
  • 52,974
  • 7
  • 60
  • 101
2
,(?!(?:[^"]*"[^"]*")*[^"]*$)

You can use this remove , between ".See demo.

https://regex101.com/r/t5Euq1/1

vks
  • 67,027
  • 10
  • 91
  • 124