21

Is there a simple way to translate an XLS to a CSV formatted file without starting the Excel windowed application?

I need to process some Excel XLS workbooks with scripts. For this i need to convert the xls file into a csv file. This can be done with a save-as from the Excel application. But, i would like to automate this (so, not open the Excel application window).

It will suffice if the first sheet from the workbook gets translated to the CSV format. I need to just process data in that sheet.

I have Cygwin and Excel installed on my system -- if that helps.

Edit: Ok, i have a working solution with Perl. Updating for future use by others.

I installed the Spreadsheet::ParseExcel module. and then used read-excel.pl sample.

My code is a slight variation of this sample code, as below.

#!/usr/bin/perl -w
# For each tab (worksheet) in a file (workbook),
# spit out columns separated by ",",
# and rows separated by c/r.

use Spreadsheet::ParseExcel;
use strict;

my $filename = shift || "Book1.xls";
my $e = new Spreadsheet::ParseExcel;
my $eBook = $e->Parse($filename);
my $sheets = $eBook->{SheetCount};
my ($eSheet, $sheetName);

foreach my $sheet (0 .. $sheets - 1) {
    $eSheet = $eBook->{Worksheet}[$sheet];
    $sheetName = $eSheet->{Name};
    print "#Worksheet $sheet: $sheetName\n";
    next unless (exists ($eSheet->{MaxRow}) and (exists ($eSheet->{MaxCol})));
    foreach my $row ($eSheet->{MinRow} .. $eSheet->{MaxRow}) {
        foreach my $column ($eSheet->{MinCol} .. $eSheet->{MaxCol}) {
            if (defined $eSheet->{Cells}[$row][$column])
            {
                print $eSheet->{Cells}[$row][$column]->Value . ",";
            } else {
                print ",";
            }
        }
        print "\n";
    }
}

Update: Here is a Powershell script that might also be easy to work with; as-is from this MSDN blog and, SO Reference.

$excel = New-Object -comobject Excel.Application
$workbooks = $excel.Workbooks.Open("C:\test.xlsx")
$worksheets = $workbooks.Worksheets
$worksheet = $worksheets.Item(1)
$range = $worksheet.UsedRange
foreach($row in $range.Rows)
{
    foreach($col in $row.Columns)
    {
        echo $col.Text
    }
}

Update: I recently came across a Windows tool CSVed at this Superuser answer which might be useful to some people.

Community
  • 1
  • 1
nik
  • 13,254
  • 3
  • 41
  • 57
  • 2
    `New-Object -comobject Excel.Application` looks like it will start up the Excel app -- you said you didn't want to do that. – John Machin Jun 26 '09 at 22:35
  • @John, thanks for pointing that out. I liked and used the Perl solution. – nik Jun 27 '09 at 01:43
  • This is a great question. I managed to run the code and it returns the Excel entries separated by commas, as expected. Sorry for the stupid question, but how to actually save the final information in a csv file? – Stefano Lombardi May 28 '14 at 21:52

13 Answers13

6

You can use xls2csv from the catdoc package if you're on Debian/Ubuntu

Amandasaurus
  • 58,203
  • 71
  • 188
  • 248
4

From Gnumeric docs:

Gnumeric can convert files automatically without needing user intervention. This allows a large number of files to be converted using a script. Gnumeric is distributed along with a program called ssconvert which is the program used to convert files automatically. All of the file formats supported by Gnumeric can be used except for the Postscript and PDF file formats which operate through the printing system.

This application is used, from the command line by specifying, any desired options, an input file and an output file. For example,

ssconvert myfile.xls myfile.gnumeric

would convert an Excel format file to a Gnumeric format file.

The available import and export file formats which ssconvert can read can be listed using

ssconvert --list-importers

or

ssconvert --list-exporters

respectively.

Like other GNU command line applications, ssconvert includes a manual page. This page can be accessed by typing:

man ssconvert

which will open the manual page. This page can be navigated by typing the space bar or using the Page Up and Page Down buttons. The man program can be dismissed by typing the q key.

I'm using it and works well.

Marcello Nuccio
  • 3,901
  • 2
  • 28
  • 28
4

In Java world you can use apache poi. You could start from the following Groovy snippet.

FileInputStream fis = new FileInputStream(filename);
Workbook wb = new HSSFWorkbook(fis); 
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
  for (Cell cell : row) {
    doSomething(cell.toString())
  }

}
jens
  • 1,763
  • 1
  • 15
  • 25
3

Use a perl script. Using the Spreadsheet::ParseExcel perl module from CPAN to parse the xls file followed by output as csv should work fine.

http://search.cpan.org/dist/Spreadsheet-ParseExcel

You could also try using VBScript.

2

Use one of portable [Python] libraries:

pyxlreader.sourceforge.net/

sourceforge.net/projects/pyexcelerator

and make extra script layer on top of it.

Denis Barmenkov
  • 2,276
  • 1
  • 15
  • 20
  • This is probably a good way to go too. I got the Perl solution working for me fast so i stopped on this. – nik Jun 03 '09 at 06:51
2

Excel can be used as datasource and there are drivers available to access EXCEL as database.

1.) Create and Open a connection to EXCEL file, which you want to convert into CSV.

2.) Fire a query like "SELECT * From Sheet1", which will load all the data of Sheet1 into recordset or datatable.

3.) Since I'm using .net, I can hold those records on datatable and convert into CSV using following extension method.

        public static string ToCSV(this DataTable _dataTable)
        {
            StringBuilder csv = new StringBuilder();
            StringWriter sw = new StringWriter(csv);
            int icolcount = _dataTable.Columns.Count;
            for (int i = 0; i < icolcount; i++)
            {
                sw.Write(_dataTable.Columns[i]);
                if (i < icolcount - 1)
                {
                    sw.Write(",");
                }
            }
            sw.Write(sw.NewLine);
            foreach (DataRow drow in _dataTable.Rows)
            {
                for (int i = 0; i < icolcount; i++)
                {
                    if (!Convert.IsDBNull(drow[i]))
                    {
                        sw.Write(drow[i].ToString());
                    }
                    if (i < icolcount - 1)
                    {
                        sw.Write(",");
                    }
                }
                sw.Write(sw.NewLine);
            }
            sw.Close();
            return csv.ToString();
        }

You can apply this approach on the platform you're working on.

Thanks.

this. __curious_geek
  • 42,787
  • 22
  • 113
  • 137
1

VBS script and it works awesome http://www.go4expert.com/forums/showthread.php?t=18188

Set objArgs = WScript.Arguments For I = 0 to objArgs.Count - 1

FullName = objArgs(I)
FileName = Left(objArgs(I), InstrRev(objArgs(I), ".") )

Set objExcel = CreateObject("Excel.application")
set objExcelBook = objExcel.Workbooks.Open(FullName)

objExcel.application.visible=false
objExcel.application.displayalerts=false

objExcelBook.SaveAs FileName & "csv", 23

objExcel.Application.Quit
objExcel.Quit   

Set objExcel = Nothing
set objExcelBook = Nothing

Next

Rasterman
  • 73
  • 1
  • 8
  • This is explored a little further in a duplicate question: http://stackoverflow.com/questions/1858195/convert-xls-to-csv-on-command-line/11252731#11252731 –  Jun 28 '12 at 20:57
0

iam pretty new to these forums, and it would be nice if they put the year on the time stamps(pet peeve) so i know how old the posts are. iam going to assume they are from 2009.

but a good solution in python is to use xlrd to read in your xls files. here is a pretty simple intro tutorial: http://scienceoss.com/read-excel-files-from-python/ it isn't mine.

the only problem i had was excel dates. here is a quick fix for them:

date = xlrd.xldate_as_tuple( int( sheet.cell( rowNum,colNum ).value ),workBookName.datemode )

than create a csv file with the inbuilt csv module,as iam a new user i can only post one hyper link. but google the csv module api.

hope that was helpfull

  • Python xlrd was pointed out by praavDa on the day I posted this question and some days later the author John himself also added a post. – nik Jun 27 '09 at 02:00
0

my solution:

use Spreadsheet::BasicRead;

my $xls = 'file.xls';   
my $csv = 'file.csv';

   my $ss = new Spreadsheet::BasicRead($xls) or die "Could not open '$xls': $!";
   my $name = '';
   my $row = 0;

   open(FILE, ">$csv") or die "Could not open : $!\n";
      flock(FILE, 2) or die "Could not lock file\n"; 

        while (my $data = $ss->getNextRow()){
            $row++;
            $name = join(';',@$data);         
            print FILE $name."\n" if ($name ne "");
        }

      flock(FILE, 8); 
   close FILE; 
Mr.P
  • 1,182
  • 3
  • 20
  • 44
0

You can do it with Alacon - command-line utility for Alasql database.

It works with Node.js, so you need to install Node.js and then Alasql package:

> npm install alasql

To convert Excel file to CVS (ot TSV) you can enter:

> node alacon "SELECT * INTO CSV('mydata.csv', {headers:true}) FROM XLS('mydata.xls', {headers:true})"

Alacon supports other type of conversions (CSV, TSV, TXT, XLSX, XLS) and SQL language constructions (see User Manual for examples).

agershun
  • 4,077
  • 38
  • 41
0

Recommend Convert XLS: http://www.softinterface.com/Convert-XLS%5CConvert-XLS.htm

‘ Convert XLS ’ is a simple to use, yet sophisticated Excel converter utility specifically designed for Excel, text and CSV (character/comma delimited) files. If you need to convert/manipulate one or thousands of files, located in one or many folders this is the tool! Conversions can be done 10-15 times faster without MS Excel.

Automate your Excel Conversion tasks by using:

  • Built in scheduler
  • The command line
  • interface COM interface
William Leara
  • 10,595
  • 4
  • 36
  • 58
0

If You have Cygwin, most probablly You will have Python. If not - install python and use this script. It is much more than You need, but will convert easilly and fast.

praavDa
  • 419
  • 2
  • 9
  • 18
0

@ John Machin: I cant add coment as I newbee for this forum :)

I didn't use old package pyXLreader but make my post with xlrd in mind ;)

I saw it month ago but not used in projects.

WBR

Denis Barmenkov
  • 2,276
  • 1
  • 15
  • 20