I have an .xlsx file like this:

sample.xlsx:

Heading     C1      C2,01,02    C3    C4
R1          1       4           7     10
R2          2       5           8     11,1
R3          3       6           9,0   12

I want to convert sample.xlsx into a pipe-separated Output.csv file.

Please note that I don't want any double quotes around values that contain commas, such as C2,01,02.

Output.csv:

Heading|C1|C2,01,02|C3|C4
R1|1|4|7|10
R2|2|5|8|11,1
R3|3|6|9,0|12

I know how to produce Output.csv using manual steps like this:

Go to Control Panel -> Region and Language -> Additional Settings -> set the "List separator" field to a pipe "|".

Open sample.xlsx -> Save As -> in the "Save as type" drop-down, select CSV (Comma delimited) (*.csv).

But I don't want to do this manually. I want to produce the same output from the command line. For this, I used the script from this post: Convert XLS to CSV on command line

The script works perfectly; the only problem is that it produces a comma-separated CSV instead of a pipe-separated one. The code is:

if WScript.Arguments.Count < 2 Then
    WScript.Echo "Please specify the source and the destination files. Usage: ExcelToCsv <xls/xlsx source file> <csv destination file>"
    Wscript.Quit
End If
csv_format = 6
Set objFSO = CreateObject("Scripting.FileSystemObject")
src_file = objFSO.GetAbsolutePathName(Wscript.Arguments.Item(0))
dest_file = objFSO.GetAbsolutePathName(WScript.Arguments.Item(1))
Dim oExcel
Set oExcel = CreateObject("Excel.Application")
Dim oBook
Set oBook = oExcel.Workbooks.Open(src_file)
oBook.SaveAs dest_file, csv_format
oBook.Close False
oExcel.Quit

To run the above code:

XlsToCsv.vbs [sourcexlsFile].xls [Output].csv

I tried replacing csv_format = 6 with many other values (1, 2, 3, and so on), but none of them produces a pipe-separated CSV.

Please help.

Thanks in advance.

Jatin
  • You are aware that *CSV* is the extension associated with ***C**omma **S**eparated **V**alues*? It's difficult to have a pipe-delimited comma separated values file. – Ken White Sep 04 '16 at 06:46
  • Seems to be difficult to do unless you hack: https://www.experts-exchange.com/questions/23712758/Export-semicolon-delimited-csv-file.html. A python solution to post-process the file would take 3 or 4 lines, though. – Jean-François Fabre Sep 04 '16 at 06:48
  • @Ken White: Okay, so can I get a corresponding Output.txt file? – Jatin Sep 04 '16 at 06:49
  • @Jean-François Fabre: Can you please post your hack solution? I am okay with python script also. All I want is that the task of converting an xlsx to pipe separated csv should be automated. – Jatin Sep 04 '16 at 06:55
  • I'm on it. Just the time to write some python code... – Jean-François Fabre Sep 04 '16 at 07:05

2 Answers

Python solution. It uses Python 3.4 and only standard modules, apart from openpyxl.

Install openpyxl:

cd /D C:\python34
scripts\pip install openpyxl

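Only the active sheet is exported, so the xlsx file should contain a single sheet. The main limitation is that formulas are not evaluated (openpyxl returns the formula text unless the workbook is loaded with data_only=True, in which case it returns the values Excel last cached).

Completely empty rows are filtered out too.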

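import csv
import sys

import openpyxl

if len(sys.argv) < 3:
    print("Usage: xlsx2csv.py file.xlsx file.csv")
    sys.exit(1)

src_path = sys.argv[1]
dest_path = sys.argv[2]

wb = openpyxl.load_workbook(src_path)
sheet = wb.active  # only the active sheet is exported

# newline='' lets the csv module control the line endings itself
with open(dest_path, "w", newline='') as f:
    cw = csv.writer(f, delimiter='|', quotechar='"')
    for r in sheet.rows:
        row = [c.value for c in r]
        # skip rows in which every cell is empty
        if any(v is not None for v in row):
            cw.writerow(row)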

Usage: xlsx2csv.py file.xlsx file.csv
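
To see why this meets the "no double quotes" requirement: with delimiter='|' and the default (minimal) quoting, csv.writer only quotes a field that contains the delimiter, the quote character, or a line break, so embedded commas are written as-is. Below is a minimal sketch using the sample data from the question. The helper name rows_to_pipe_text is made up for illustration, and lineterminator="\n" is set only to keep the printed output easy to read.

```python
import csv
import io


def rows_to_pipe_text(rows):
    """Render rows as pipe-separated text with csv.writer's default (minimal) quoting."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="|", quotechar='"', lineterminator="\n")
    for row in rows:
        # Skip rows in which every cell is empty (blank cells are represented as None).
        if any(cell is not None for cell in row):
            writer.writerow(row)
    return buf.getvalue()


# Sample data from the question, plus one fully empty row.
sample = [
    ["Heading", "C1", "C2,01,02", "C3", "C4"],
    ["R1", 1, 4, 7, 10],
    ["R2", 2, 5, 8, "11,1"],
    [None, None, None, None, None],
    ["R3", 3, 6, "9,0", 12],
]
print(rows_to_pipe_text(sample), end="")
```

The one case where quotes can still appear is a cell that itself contains a "|" or a double quote; such a field is quoted so that the file stays parseable.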

Jean-François Fabre

If you are running a script anyway, you could extend it like this:

if WScript.Arguments.Count < 2 Then
    WScript.Echo "Please specify the source and the destination files. Usage: ExcelToCsv <xls/xlsx source file> <csv destination file>"
    Wscript.Quit
End If
Set objFSO = CreateObject("Scripting.FileSystemObject")
src_file = objFSO.GetAbsolutePathName(Wscript.Arguments.Item(0))
dest_file = objFSO.GetAbsolutePathName(WScript.Arguments.Item(1))
Dim oExcel
Set oExcel = CreateObject("Excel.Application")
Dim oBook
Set oBook = oExcel.Workbooks.Open(src_file)
oBook.SaveAs dest_file, 3
oBook.Close False
oExcel.Quit
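' Read the tab-separated export back and replace every tab with a pipe
Set objFile = objFSO.OpenTextFile(dest_file, 1) ' 1 = ForReading
strText = objFile.ReadAll
objFile.Close
strNewText = Replace(strText, vbTab, "|")
Set objFile = objFSO.OpenTextFile(dest_file, 2) ' 2 = ForWriting
' Write (not WriteLine) so that no extra blank line is appended
objFile.Write strNewText
objFile.Close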
Dirk Reichel
  • The problem with this kind of solution arises if there are actual commas in the cells. Otherwise it's OK. – Jean-François Fabre Sep 04 '16 at 07:36
  • @Jean-FrançoisFabre changed it to tab-separated... this will avoid that problem ;) – Dirk Reichel Sep 04 '16 at 07:39
  • @Dirk Reichel: your script behaves differently. First, I want the output pipe-separated [your script's output is tab-separated]. Second, I don't want any double quotes [your script produces double quotes]. – Jatin Sep 04 '16 at 07:55
  • I'll check that again... worked perfectly with my test files... just a sec. (Just make sure that in `Replace(strText, " ", "|")` the `" "` contains a tab character, not a space.) – Dirk Reichel Sep 04 '16 at 08:01
  • @Dirk Reichel: I can see pipes after replacing the space with a tab in the search string of the Replace call. Now only one problem remains, i.e., the double quotes. – Jatin Sep 04 '16 at 08:41
  • @Jatin it looks like Excel will quote text depending on [CSV-rules](https://tools.ietf.org/html/rfc4180)... and it seems, you can't avoid it. You could insert a `strNewText = Replace(strText, """", "")` to replace all `"` but if there are some by default, they also would be deleted (and I assume that this is not wanted) :/ – Dirk Reichel Sep 04 '16 at 08:50
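
Since a Python script was also acceptable to the asker, one way around the quoting issue might be to post-process Excel's tab-separated export with Python's csv module instead of a plain string replace. The reader parses the quoted fields, and the writer emits them pipe-separated, so the quotes Excel added around a value such as C2,01,02 are dropped. This is only a sketch: the function name tab_to_pipe and the cp1252 default encoding are assumptions (the encoding of Excel's text export depends on the system locale).

```python
import csv


def tab_to_pipe(src_path, dest_path, encoding="cp1252"):
    """Rewrite a tab-separated text file as a pipe-separated one."""
    # newline="" lets the csv module handle line endings on both sides.
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dest_path, "w", newline="", encoding=encoding) as dest:
        reader = csv.reader(src, delimiter="\t", quotechar='"')
        writer = csv.writer(dest, delimiter="|", quotechar='"')
        for row in reader:
            # Fields are re-quoted only if they contain "|", a double quote,
            # or a line break; embedded commas are written unquoted.
            writer.writerow(row)
```

It would be called with two different paths, for example tab_to_pipe("export.txt", "Output.csv"), where export.txt stands for whatever file the SaveAs step produced.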