In my Android application I create a TSV file and send it via e-mail. When I open it in Excel, it is detected as ANSI, so some of my characters aren't displayed correctly. Another application, not written by me, does the same thing, but Excel detects its file as UTF-8. I compared the hex dumps of both files and found that the other file uses C2 A0 for some of its spaces (not tabs), while mine uses 20.

This is the code I use to create the file:

        FileOutputStream writeFileHandler = context.openFileOutput(LOG_FILENAME_CSV, Context.MODE_WORLD_READABLE);
        OutputStreamWriter oputStreamWriter = new OutputStreamWriter(writeFileHandler, "UTF-8");

        for(int i=log_rows.length-1; i>=0; i--){
            String[] entries = getDailyLogEntries(log_rows[i]); // array of your values
            for(int j=0; j<entries.length; j++){
                if(j==0)  oputStreamWriter.write(entries[j]);
                else{
                    if(entries[j] != null) oputStreamWriter.write("\t"+entries[j]);
                }
            }
            oputStreamWriter.write("\n");
        }
        oputStreamWriter.flush();
        oputStreamWriter.close();
        writeFileHandler.close();

I also tried pulling the file with DDMS to rule out the e-mail step, and the result is the same: it's detected as ANSI. How can I make Excel detect it as UTF-8?

Thanks!

PX Developer
  • "ANSI" may be related? – njzk2 Mar 12 '13 at 13:06
  • Sorry, the right code is using UTF-8, I just used ANSI to check if that way it worked, but Android doesn't have ANSI charset name so I get an exception. I edited the post. – PX Developer Mar 12 '13 at 13:07
  • @user1455909 This may help: http://stackoverflow.com/questions/6002256/is-it-possible-to-force-excel-recognize-utf-8-csv-files-automatically and http://stackoverflow.com/questions/155097/microsoft-excel-mangles-diacritics-in-csv-files – assylias Mar 12 '13 at 13:39

1 Answer

Have you tried adding the UTF-8 BOM at the start of your TSV file? This may persuade Excel that it's a UTF-8 file. The byte sequence is 0xEF,0xBB,0xBF.

See http://en.wikipedia.org/wiki/Byte_order_mark
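
A minimal sketch of one way to emit those three bytes, assuming the file is written through an `OutputStreamWriter` as in the question. The class `BomTsvWriter` and its methods are hypothetical names used only for illustration, and writing a null entry as an empty cell is an assumption rather than the question's exact behaviour. The sketch only shows how to get the BOM into the file; whether a given Excel version honours it is a separate matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical helper, not part of the original code.
class BomTsvWriter {

    /** Writes rows as tab-separated UTF-8 text, preceded by the UTF-8 BOM. */
    static void writeTsv(OutputStream out, String[][] rows) throws IOException {
        // Write the BOM as three raw bytes on the stream itself,
        // before any character encoding is involved.
        out.write(new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});

        Writer writer = new OutputStreamWriter(out, "UTF-8");
        for (String[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    writer.write('\t');
                }
                // Assumption: a null entry becomes an empty cell.
                if (row[j] != null) {
                    writer.write(row[j]);
                }
            }
            writer.write('\n');
        }
        writer.flush();
    }

    /** Convenience wrapper for inspecting the bytes in memory. */
    static byte[] toBytes(String[][] rows) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeTsv(buffer, rows);
        } catch (IOException e) {
            // An in-memory stream is not expected to fail.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }
}
```

In the question's code, `writeFileHandler` would take the place of `out`. An alternative that should produce the same three bytes is to write the single character `'\uFEFF'` through the UTF-8 writer as the very first character.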

Jonathan Caryl
  • I tried it and it's still the same, it's detected as ANSI. I did it using two ways: 1- oputStreamWriter.write((char)0xEF+(char)0xBB+(char)0xBF); AND 2- oputStreamWriter.write((char)0xEF); oputStreamWriter.write((char)0xBB); oputStreamWriter.write((char)0xBF); Using the first way I see those chars like this in Wordpad: É©. Using the second one I see them like this in Wordpad: . If I set UTF-8 in Excel with the second way I see the chars , like it's said in your link. None of them work, Excel still says it's ANSI :( – PX Developer Mar 13 '13 at 08:28
  • In Java a char is a 16-bit value (see http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html), so you'll be writing the wrong data: six bytes rather than three. – Jonathan Caryl Mar 13 '13 at 10:53
  • I just tried with byte type and Excel still ignores it. In the links assylias provided in his/her comment it says Excel ignores BOM and they recommend creating an Excel file. Excel sucks -.- Thank you anyway! :) – PX Developer Mar 13 '13 at 11:04
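
To make the point about `char` in the comments above concrete, here is a small sketch that captures what each attempt puts on disk when it goes through a UTF-8 `OutputStreamWriter`. The class `BomEncodingDemo` and its method names are hypothetical, and an in-memory stream stands in for the real file. It says nothing about how Excel reacts; it only compares byte sequences.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical demo class, for illustration only.
class BomEncodingDemo {

    /** Encodes each value with Writer.write(int) through a UTF-8 writer. */
    static byte[] encode(int... chars) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            Writer writer = new OutputStreamWriter(buffer, "UTF-8");
            for (int c : chars) {
                writer.write(c); // writes one char (the low 16 bits)
            }
            writer.flush();
        } catch (IOException e) {
            // An in-memory stream is not expected to fail.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    /** Attempt 1: the three chars are added as ints, giving one char, U+0269. */
    static byte[] sumOfChars() {
        return encode((char) 0xEF + (char) 0xBB + (char) 0xBF);
    }

    /** Attempt 2: U+00EF, U+00BB and U+00BF are each encoded as two bytes. */
    static byte[] separateChars() {
        return encode((char) 0xEF, (char) 0xBB, (char) 0xBF);
    }

    /** The BOM code point U+FEFF, encoded by the writer itself. */
    static byte[] bomChar() {
        return encode('\uFEFF');
    }

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(sumOfChars()));    // C9 A9
        System.out.println(hex(separateChars())); // C3 AF C2 BB C2 BF
        System.out.println(hex(bomChar()));       // EF BB BF
    }
}
```

The first result matches the `É©` seen in Wordpad, since C9 A9 read as ANSI gives those two characters. Only the last variant, or writing the three raw bytes directly to the `FileOutputStream`, yields the actual BOM.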