
I'd like to create a file-import wizard on Android that reads CSV files.

I use au.com.bytecode.opencsv, but I've run into an encoding problem.

Excel has several save formats, and one of them is "Unicode Document". It seemed like a good idea, because we can't control the encoding of the other Excel formats.

But when I use the data, I know how to handle UTF-8 without a BOM but not UTF-16: a simple strvar.equals("name") doesn't work.

I'd like to handle UTF-8 without a BOM, UTF-8 with a BOM, UTF-16, etc. How can I handle this encoding nightmare? I think I have to detect the format and then convert it, but I need my code to be robust.

Regards

P. Sohm
    Your "standard" string functions may not work because in UTF16 every character is encoded as at least *two* bytes (where UTF8 is always at least *one* byte). For plain ASCII text, such as your `"name"` string, you get interleaved zero and ASCII, where the order is determined by the file endianness. Step 1: determine file endianness (that needs a BOM), step 2: process 2 bytes at a time. If you are confident handling UTF8, convert it to that. – Jongware Aug 26 '14 at 21:39
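To illustrate the comment above, here is a minimal sketch of what the bytes of `"name"` look like in UTF-16 and why the comparison fails when they are decoded with a single-byte charset. The class and method names (`Utf16Demo`, `decodedWrong`, ...) are made up for this example, and it assumes `java.nio.charset.StandardCharsets` is available (on older Android versions `Charset.forName("UTF-16LE")` would be needed instead).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of opencsv or of the original question.
final class Utf16Demo {

    /** Returns the UTF-16LE encoding of the text, without a BOM. */
    static byte[] utf16leBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE);
    }

    /** Decodes the UTF-16LE bytes with the wrong (single-byte) charset. */
    static String decodedWrong(String text) {
        // Every second byte is 0x00, so the result contains NUL characters
        // interleaved with the ASCII letters.
        return new String(utf16leBytes(text), StandardCharsets.ISO_8859_1);
    }

    /** Decodes the UTF-16LE bytes with the matching charset. */
    static String decodedRight(String text) {
        return new String(utf16leBytes(text), StandardCharsets.UTF_16LE);
    }
}
```

With this sketch, `Utf16Demo.utf16leBytes("name")` is 8 bytes long (`6E 00 61 00 6D 00 65 00`), `Utf16Demo.decodedWrong("name").equals("name")` is false, and `Utf16Demo.decodedRight("name").equals("name")` is true.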

1 Answer


I found two possible solutions: https://stackoverflow.com/a/1888284/584448 and https://stackoverflow.com/a/1835529/584448

I didn't check how they differ, but the first one works in my case.
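For readers who want to see the general idea, here is a minimal sketch of BOM-based detection. It is not the code from either linked answer. `BomSniffer`, `openReader` and `decode` are hypothetical names, and falling back to UTF-8 when no BOM is present is an assumption of this sketch. UTF-32 BOMs are not handled (a UTF-32LE file would be mistaken for UTF-16LE).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, written for illustration only.
final class BomSniffer {

    /** Wraps the stream in a Reader whose charset is chosen from the BOM, if any. */
    static Reader openReader(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = readUpTo(pin, head);

        Charset charset = StandardCharsets.UTF_8; // assumed default when no BOM
        int bomLength = 0;
        if (n >= 3 && u(head[0]) == 0xEF && u(head[1]) == 0xBB && u(head[2]) == 0xBF) {
            bomLength = 3;                        // UTF-8 with BOM
        } else if (n >= 2 && u(head[0]) == 0xFE && u(head[1]) == 0xFF) {
            charset = StandardCharsets.UTF_16BE;  // big-endian BOM
            bomLength = 2;
        } else if (n >= 2 && u(head[0]) == 0xFF && u(head[1]) == 0xFE) {
            charset = StandardCharsets.UTF_16LE;  // little-endian BOM
            bomLength = 2;
        }
        // Push back everything that was read past the BOM so no data is lost.
        if (n > bomLength) {
            pin.unread(head, bomLength, n - bomLength);
        }
        return new InputStreamReader(pin, charset);
    }

    /** Convenience wrapper: decodes a whole byte array to a String. */
    static String decode(byte[] data) {
        try (Reader reader = openReader(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int count;
            while ((count = reader.read(buf)) != -1) {
                sb.append(buf, 0, count);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads until the buffer is full or the stream ends; returns the byte count. */
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    /** Converts a signed byte to its unsigned value (0..255). */
    private static int u(byte b) {
        return b & 0xFF;
    }
}
```

The resulting `Reader` can be passed to opencsv, for example `new CSVReader(BomSniffer.openReader(stream))`, and once the text is decoded correctly a comparison such as strvar.equals("name") behaves as expected. If the file turns out to be tab-delimited rather than comma-delimited, the `CSVReader(Reader, char)` constructor accepts `'\t'` as the separator.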

P. Sohm