0

I have been struggling for some time with the excel problem.I have Russian characters in my excel file (2003 version) and I want to save it to a csv.But the problem is whenever I do it instead of Russian characters I am getting ??????. After going through various forums I found out that the problem is with MS excel.

So Now I was wondering,if I could write a VB script which would read the strings from the required cells, covert them to UTF-8 encoding and store it in a text file. I have 2 problems here:

  1. Is my solution feasible or not? I have no idea if it can be done or not? Any pointers would be help full. I have trawled through net and couldn't find anything.

  2. As I am completely zero with VB scripting, can some body help me with a sample code please?

Most importantly, if someone knows a better way of doing it please let me know.

Thanks

Salman A
  • 262,204
  • 82
  • 430
  • 521
spanky
  • 133
  • 1
  • 3
  • 11
  • Related: http://stackoverflow.com/questions/4221176/excel-to-csv-with-utf8-encoding – Tomalak May 14 '12 at 06:38
  • Do you really want a vb-script solution? What if excel save-as command could do the trick? – Salman A May 14 '12 at 06:45
  • @Salman I'm pretty sure he means VBA. Also, as you can see from the other question, "Save As" does not do the trick. – Tomalak May 14 '12 at 06:58
  • @salman: yeah I need to have a VB script as there are lot of strings. Tomalak is right, save-as does not do the trick. – spanky May 14 '12 at 08:21

2 Answers2

3

Here is a little vbscript that uses ADO to (i) read an excel file (ii) write to CSV file:

option explicit

' Reading Excel Files:
' http://support.microsoft.com/kb/257819

dim CONNECTION1
set CONNECTION1 = WScript.CreateObject("ADODB.CONNECTION")
CONNECTION1.Open "Provider=MICROSOFT.JET.OLEDB.4.0;Data Source=C:\Users\Salman\Desktop\input.xls;Extended Properties=""Excel 8.0;HDR=Yes;"""

dim RECORDSET1
set RECORDSET1 = WScript.CreateObject("ADODB.RECORDSET")
RECORDSET1.open "SELECT * FROM [Sheet1$]", CONNECTION1, 3, 1

' Notes:
' CharacterSet 65001 is UTF-8
' add/remove columns and change datatype to match you excel file

dim CONNECTION2
set CONNECTION2 = WScript.CreateObject("ADODB.CONNECTION")
CONNECTION2.Open "Provider=MICROSOFT.JET.OLEDB.4.0;Data Source=C:\Users\Salman\Desktop\;Extended Properties=""text;HDR=Yes;FMT=Delimited;CharacterSet=65001;"""
CONNECTION2.Execute "CREATE TABLE [output.csv] ([English] VARCHAR(200), [Swedish] VARCHAR(200), [Russian] VARCHAR(200), [Chinese Simplified] VARCHAR(200))"

dim RECORDSET2
set RECORDSET2 = WScript.CreateObject("ADODB.RECORDSET")
RECORDSET2.Open "SELECT * FROM [output.csv]", CONNECTION2, 2, 2

do until RECORDSET1.EOF
    RECORDSET2.AddNew
    dim i
    for i = 0 to RECORDSET1.Fields.Count - 1
        WScript.Echo RECORDSET1.Fields(i).Value
        RECORDSET2.Fields(i).Value = RECORDSET1.Fields(i).Value
    next
    RECORDSET2.Update
    RECORDSET1.MoveNext
loop

Example Excel File Content (Excel 2003):

English           Swedish            Russian              Chinese Simplified
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作

Example CSV File Content (UTF-8 without BOM):

"English","Swedish","Russian","Chinese Simplified"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"

Run this script from command line using cscript:

C:\>cscript export.vbs

Note: you might encounter the following ADO error on 64-bit Windows:

ADODB.Connection: Provider cannot be found. It may not be properly installed. (Error code 0x800A0E7A)

The workaround for this problem is to run the script in 32-bit mode. You can do this by launching the 32-bit command prompt by entering this command in the Run dialog:

%WINDIR%\SysWOW64\cmd.exe

Execute the cscript command from this command prompt.

Salman A
  • 262,204
  • 82
  • 430
  • 521
  • Thanks a lot buddy!!! I will try this script out and see if this works or not. As i told u am complete zero in VBA, will take some time to work it out. – spanky May 15 '12 at 06:05
  • @spanky: if you have problems running the code, please post them as comment (not as answer). – Salman A May 16 '12 at 17:10
  • I am still facing problems while running this code. while running the following line : RECORDSET2.Open CONNECTION2, adOpenKeyset, adLockOptimistic it is giving an run time error"Cannot start your application.The workgroup file is missing or opened exclusively by another user". I am pretty much stumped by this. Any ideas. to me it looks like probs with data source. Note – spanky May 17 '12 at 03:57
  • I have no idea. I tried searching on Google but all results were related to MS-ACCESS (which I suppose you're not using). The script runs fine on my system unless I try to write output files in folders that need administrator privileges. – Salman A May 17 '12 at 07:17
-1

Open Office can do this! Try to load your Excel-file into Open Office and "save as" csv file. Then you will be asked to enter the encoding of the exported file. Select "Unicode (UTF-8)".

  • 1
    You may have noticed this is a *programming* website. The user is asking about doing this in a program he is creating; not with another program. – Andrew Barber Sep 26 '12 at 08:18