1

I have the below script which is converting xlsx to csv but if the cell value has comma (,) in between, it is moving to next column in csv which is wrong. Colud you please correct it? Also, how can convert multiple xlsx files to mulptiple csv files in one go?

#!/usr/bin/perl

use strict;
use warnings;
use Spreadsheet::XLSX;
use Text::CSV qw(csv);

my $excel = Spreadsheet::XLSX -> new ('/path/file.xlsx');
my $csv = '/path/File.csv';
open(my $FH ,'>',"$csv") or die "failed to open";

my $line;
foreach my $sheet (@{$excel -> {Worksheet}}) {
    printf("Sheet: %s\n", $sheet->{Name});
    $sheet -> {MaxRow} ||= $sheet -> {MinRow};
    foreach my $row ($sheet -> {MinRow} .. $sheet -> {MaxRow}) {
        $sheet -> {MaxCol} ||= $sheet -> {MinCol};
        foreach my $col ($sheet -> {MinCol} ..  $sheet -> {MaxCol}) {
            my $cell = $sheet -> {Cells} [$row] [$col];
            #if ($cell) {
            #    $line .= "\"".$cell -> {Val}."\",";
                        #       $line .= $cell -> {Val};
                        #       if ($col != $sheet -> {MaxCol}) #appends the comma only if the column being processed is not the last
                        #       {
                        #               $line .= ",";
                        #       }
            #}
                        if (defined $cell && defined $cell->Value) {
               if ($col != $sheet -> {MaxCol})
               {
               print $FH $cell->Value.",";
              }
            else
             {
            print $FH $cell->Value ;
             }
          } else {
            if ($col != $sheet -> {MaxCol})
               { print $FH ",";
               }
             }

        }
$FH =~ s/,$//; # replace comma at the end of the string with empt
       print $FH "\n";
      }
  • still it is going to next column. Is there any other way to restrict the field with comma(,) in between to one field after conversion from xlsx to csv? – john roberto Jun 19 '20 at 09:40
  • check for the cell value contain ',' if ',' present add double quotes. write method and pass $cell->value to check string contain ',' `sub check_cell_string { my ($string) = @_; my $substr = ','; if (index($string, $substr) != -1) { $string = '"'.$string.'"';} return $string; }` and before print $FH statement call this method `my $str = check_cell_string($cell->value) ; print $FH $str.","; OR print $FH $str;` – amit bhosale Jun 19 '20 at 09:47
  • for example in csv file entry look like this `1, 1928,44,Emil Jannings,"The Last Command, The Way of All Flesh"` – amit bhosale Jun 19 '20 at 09:53
  • @briandfoy I have explained here .. I have deleted the script – amit bhosale Jun 19 '20 at 11:51
  • 1
    @amit Your comment as it is could be an answer. If you have a lot of code in a comment, you're probably doing it wrong. – brian d foy Jun 19 '20 at 12:25
  • Hi Brian, how can we master in perl..? Any tips, please? – john roberto Jun 20 '20 at 02:36
  • @briandfoy as per your suggestion, I have added the answer. Thanks. – amit bhosale Jun 23 '20 at 07:37
  • @johnroberto - I have many books that will help you master Perl. One's even called [Mastering Perl](https://www.masteringperl.org). – brian d foy Jun 23 '20 at 14:54
  • Thank you, Brian...Looks like you are a Big Guy..!! :-) – john roberto Jun 24 '20 at 05:39

2 Answers2

3

Check for the cell value contain ',' char. if ',' char present in sting add double quotes to string. Write a method and pass $cell->value to check string contain char ','.

for example

sub check_cell_string {     
    my ($string) = @_;     
    my $substr = ',';     
    if (index($string, $substr) != -1) { 
        $string = '"'.$string.'"';
    }     
    return $string; 
} 

and then call file write statement.

my $str = check_cell_string($cell->value);
print $FH $str;

for example, in the csv file entry look like below

1, 1928,44,Emil Jannings,"The Last Command, The Way of All Flesh"
amit bhosale
  • 482
  • 4
  • 9
  • Amit, another question for converting xlsx data to csv. In xlsx file, the data is like : Bob & Mathew but after conversion , the data in csv file is like: Bob &amp: Mathew. How to handle this? any idea? – john roberto Jun 29 '20 at 18:01
  • If you don't decode UTF-8 text/octets, Perl will assume they are encoded with N8CS (often ISO-8859-1/Latin-1). This means that the individual octets of a multi-byte UTF-8 character are seen as separate characters (not good). If these separate characters are later encoded to UTF-8 for output, a "double encoding" results. This is similar to HTML double encoding — e.g., > instead of >. – amit bhosale Jun 30 '20 at 08:19
  • I am not aware of your current code but check this url https://stackoverflow.com/questions/21292229/opening-utf8-files-on-perl-and-double-encoding `utf8::decode($str); # Fix double-encoding` before writing to file decode string and url https://stackoverflow.com/questions/21096900/how-do-i-avoid-double-utf-8-encoding-in-xmllibxml – amit bhosale Jun 30 '20 at 09:48
1

Regarding the multiple files question, you should be able to do something like this:

my @csv = ('/path/File.csv', 'secondfile', 'thirdfile');

foreach (@csv)
{
    my $excel = Spreadsheet::XLSX -> new ($_.".xslx");
    ...
}
Snorik
  • 201
  • 3
  • 15