I am converting a CSV file from the Tatoeba project that contains Japanese characters, and I am inserting the data into an SQLite database. The insertion itself works without a problem, but the characters are not displayed properly. If I insert the value directly:

            String str = content_parts[2];
            sentence.setValue(str);

I get values like this:

ãã¿ã«ã¡ãã£ã¨ãããã®ããã£ã¦ãããã

I have also tried converting from JIS to UTF-8, with the following results for each charset name:

            String str = content_parts[2];
            byte[] utf8EncodedBytes = str.getBytes("JIS");
            String s = new String(utf8EncodedBytes, "UTF-8");
            sentence.setValue(s);

JIS:

$B!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!r!)!)!/!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)!r!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)(B

Shift-JIS:

????\??????�N?�}??????????????????��?????�N?�N???��??????

Shift_JIS:

????\????????????????????????��?�N??????????????????��??????

The CSV file (when opened in Excel 2010) shows:

n きみにちょっとしたものをもってきたよ。

What am I doing wrong? How can I solve this problem?

Filburt
Joe Rakhimov

1 Answer

If you are still searching for a solution, refer to the links below:

setting-a-utf-8-in-java-and-csv-file and handle Japanese characters

csv-reports-not-displaying-japanese-characters

In brief, write the UTF-8 BOM (byte order mark) bytes to your FileOutputStream before passing it to the OutputStreamWriter:

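            // Needs java.io.* and java.nio.charset.StandardCharsets.
            String content = "some string to write to the file (in any language)";
            // The backslash in a Windows path must be escaped inside a Java string literal.
            FileOutputStream fos = new FileOutputStream("D:\\csvFile.csv");
            // Write the UTF-8 BOM (bytes EF BB BF) before any text.
            fos.write(239);
            fos.write(187);
            fos.write(191);
            Writer w = new BufferedWriter(new OutputStreamWriter(fos, StandardCharsets.UTF_8));
            w.write(content);
            // Closing the writer flushes it and closes the underlying stream.
            w.close();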
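
The snippet above covers the writing side. For the reading side, the garbled value in the question looks like UTF-8 bytes that were decoded as ISO-8859-1 (each Japanese character turns into "ã" plus two more characters), so it may be enough to read the CSV with an explicit UTF-8 charset instead of the platform default. Below is a minimal sketch under that assumption. The class name `Utf8CsvSketch`, its helper methods, and the temporary file are hypothetical and only for illustration. It also strips a leading BOM, because Java's UTF-8 decoder leaves it in the first line as U+FEFF.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original code.
class Utf8CsvSketch {

    // Removes a leading BOM (U+FEFF) if there is one.
    public static String stripBom(String line) {
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    // Reads all lines, decoding the file as UTF-8 regardless of the platform default.
    public static List<String> readLines(Path csv) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                lines.add(first ? stripBom(line) : line);
                first = false;
            }
        }
        return lines;
    }

    // Reproduces the assumed cause of the garbling: UTF-8 bytes decoded as ISO-8859-1.
    public static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses garble(); only works if no bytes were lost along the way.
    public static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hiragana "kimi", the first two characters of the sentence in the question.
        String sample = "\u304D\u307F";

        // Build a small UTF-8 file that starts with a BOM, like the one written above.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        bytes.write(new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        bytes.write((sample + "\n").getBytes(StandardCharsets.UTF_8));
        Path csv = Files.createTempFile("sketch", ".csv");
        Files.write(csv, bytes.toByteArray());

        List<String> lines = readLines(csv);
        System.out.println(lines.get(0).equals(sample));           // decoded correctly, BOM removed
        System.out.println(repair(garble(sample)).equals(sample)); // round trip is lossless
        Files.delete(csv);
    }
}
```

If the strings are correct after reading them this way, no `getBytes`/`new String` conversion should be needed before inserting them into the database. Re-encoding with "JIS" or "Shift_JIS" only helps if the file really is in one of those encodings.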

Hope this helps.

Community