
I'm reading a Unicode-encoded file using Java.

Here is my file: [image]

Its encoding is reported as Unicode: [image]

But whenever I try to read it as UTF-8 or UTF-16 in Java, the English words are read correctly while the Gujarati words are not; I get only ????? for them. I asked about the same problem here, but no one answered, so I changed my approach: I read the data from an MS SQL database column of type nvarchar (holding the Gujarati text), stored it in a file, and am now trying to read that file from Java. It still does not work.

I tried changing the file's encoding to UTF-8 and to Unicode big endian, and I've tried all the Unicode charsets supported in Java 8, but I'm still not getting the desired result.

This is my Java code:

File fileDir = new File("C:\\Users\\admin\\AppData\\Local\\Programs\\Python\\Python35\\data.txt");
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(fileDir), "UTF-16"));
String str;
while ((str = in.readLine()) != null) {
    System.out.println(str);
}
in.close();

I send this data to my Android application over a socket connection and try to show it in a TextView, but that also displays "???". Setting Gujarati text on the TextView directly with textview.setText("તારુ નામ શુ છે ?") displays it correctly.

When I try to send a hard-coded string (the same one as above) from Java over a socket connection, compilation fails with this error:

fileread.java:23: error: unmappable character for encoding Cp1252
ds.writeBytes("α¬ñα¬╛α¬░α½? નα¬╛ᬫ α¬╢α½? છα½ç ?");

f-CJ
Abhishek Panjabi
  • How do you check it? Maybe the output (console, logs) doesn't support non-ASCII characters. – default locale Oct 19 '16 at 03:07
  • I'm using cmd. @defaultlocale – Abhishek Panjabi Oct 19 '16 at 03:08
  • The Windows command line might need configuration to support Unicode output. Check out this question: http://stackoverflow.com/questions/388490/unicode-characters-in-windows-command-line-how – default locale Oct 19 '16 at 03:13
  • You really need to isolate your problem. Jon Skeet [already told you](http://stackoverflow.com/questions/40061310/reading-non-english-from-ms-sql-using-jtds-1-3-0-driver-from-java) how to check the characters. Now you might want to check your console and your socket connection. What happens when you send a hardcoded string there? – default locale Oct 19 '16 at 03:39
  • I've updated the question. @defaultlocale – Abhishek Panjabi Oct 19 '16 at 03:43
  • Have you tried to google this error message? I suggest you create an [MCVE](http://stackoverflow.com/help/mcve) that demonstrates your particular problem with console output and remove irrelevant information from your question (files, sockets, android UI, etc.) – default locale Oct 19 '16 at 03:55
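
One way to do the isolation the comments ask for is to print every character as an ASCII escape, so the console's encoding cannot hide what was actually read. The following is only a minimal sketch; the class name CodePointDump and the method escape are illustrative and do not come from the question.

```java
public class CodePointDump {
    // Renders each UTF-16 code unit of s as a backslash-u escape with four hex digits.
    // The result is pure ASCII, so any console can display it unchanged.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Gujarati characters written as escapes, so the source file itself stays ASCII
        // and compiles the same way under any javac source encoding.
        String sample = "\u0AA4\u0ABE";
        System.out.println(escape(sample));
    }
}
```

Calling escape(str) on each line inside the read loop would show whether the data survived. Values in the Gujarati block (0A80 to 0AFF) mean the file was decoded correctly and only the console output is at fault; a run of 003F values means the text had already been replaced by question marks before Java read it.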

1 Answer


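Your problem is that you are using System.out.println. System.out encodes text with the platform's default encoding, which on your machine appears to be Cp1252 (judging by the compiler error) and cannot represent Gujarati characters, so they come out as ?.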

Try wrapping System.out in a PrintStream with an explicit encoding; that should work. You can configure the PrintStream with:

PrintStream ps = new PrintStream(System.out, true, "UTF-8");

(I also think UTF-8 should be enough for you.)
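
To put the suggestion in context, here is a hedged, self-contained sketch that writes a short Gujarati sample to a temporary UTF-16 file, reads it back with an explicit charset, and prints it through a UTF-8 PrintStream. The class name Utf8ConsoleDemo, the helper readFirstLine, and the temporary file are assumptions made for illustration; the question's real file path is not used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8ConsoleDemo {
    // Reads the first line of a UTF-16 file; the decoder uses the byte order mark if present.
    static String readFirstLine(Path file) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                Files.newInputStream(file), StandardCharsets.UTF_16))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Gujarati sample text written as escapes, so the source file stays ASCII.
        String text = "\u0AA4\u0ABE\u0AB0\u0AC1 \u0AA8\u0ABE\u0AAE";

        // Stand-in for data.txt: a temporary file encoded as UTF-16 with a byte order mark.
        Path file = Files.createTempFile("gujarati", ".txt");
        Files.write(file, text.getBytes(StandardCharsets.UTF_16));
        String line = readFirstLine(file);
        Files.delete(file);

        // Encode the output as UTF-8 instead of the platform default encoding.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(line);
    }
}
```

The console still has to interpret those bytes as UTF-8 and use a font with Gujarati glyphs; on Windows cmd that may mean switching the code page first (for example with chcp 65001), as the question linked in the comments discusses. The compile error in the question looks like a separate issue: javac read the source file as Cp1252, and compiling with javac -encoding UTF-8 is the usual remedy when the file is saved as UTF-8. Note also that DataOutputStream.writeBytes keeps only the low eight bits of each char, so sending explicitly encoded bytes (for instance text.getBytes(StandardCharsets.UTF_8)) is probably what the socket code needs.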

Aditya K