1

I have a UTF-8 file which I want to read and display in my Java program.

In the Eclipse console (stdout) and in Swing I'm getting question marks instead of the correct characters.

BufferedReader fr = new BufferedReader(
                      new InputStreamReader(
                      new FileInputStream(f),"UTF-8"));
System.out.println(fr.readLine());

inputStreamReader.getEncoding() // prints UTF-8

I generally don't have problems displaying accented letters on the Linux console, in Firefox, etc.

Why is that? This is driving me crazy :/

Thank you for your help.

Can Berk Güder
feiroox

3 Answers

2

I'm not a Java expert, but it seems like you're creating a UTF-8 InputStreamReader with a file that's not necessarily UTF-8.
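One way to test that suspicion is to decode the file's bytes strictly and see whether the decoder complains. The following is only a minimal sketch: the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration, and the file path is assumed to arrive as the first command-line argument.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original question's code.
final class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path of the file to check is passed as the first argument.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes) ? "valid UTF-8" : "not valid UTF-8");
    }
}
```

Note that a pure ASCII file also passes this check, and that a positive result only shows the bytes are well-formed UTF-8, not that the author intended UTF-8.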

See also: Java : How to determine the correct charset encoding of a stream

Community
Can Berk Güder
  • When I change the encoding in Firefox or vim to UTF-8, I can see the accented letters. I also tried saving the file as UTF-8 in vim/kwrite. – feiroox Mar 30 '09 at 01:59
  • If the file was initially saved in another encoding, saving it as UTF-8 using a text editor usually doesn't convert it to UTF-8. You need to use something like iconv. Can't comment on characters displaying correctly using Firefox/vim though. – Can Berk Güder Mar 30 '09 at 02:03
0

It sounds like the Eclipse console is not processing UTF-8 characters, and/or the font configured for that console does not support the Unicode characters you are trying to display.

You might be able to get this to work if you configure Eclipse to expect UTF-8 characters, and also make sure that the font in use can display those Unicode characters that are encoded in your file.

From the Eclipse 3.1 New and Noteworthy page:

You can configure the console to display output using a character encoding different from the default using the Console Encoding settings on the Common tab of a launch configuration.
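On the program side, the matching step would be to make sure stdout actually emits UTF-8 bytes rather than the platform default. A rough sketch, assuming the console has been set to UTF-8 as described above; `Utf8Out` and `utf8` are illustrative names, not anything from the question:

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration only.
final class Utf8Out {

    /** Wraps a stream in an autoflushing PrintStream that encodes text as UTF-8. */
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        // Accented Latin letters followed by the Arabic letters from the asker's comment.
        out.println("\u00e9\u00e8 \u0645\u0631\u0648\u0629");
    }
}
```

If the console is still decoding with a different charset, this will show garbled byte sequences instead of question marks, which at least tells you which side of the pipe is misconfigured.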

As for Swing, I think you're going to need to select the right font.

Jared Oberhaus
0

There are several factors at work when the system has to display Unicode characters -

  • The first is the encoding of the input stream or buffer, which you've already figured out.
  • The next is the Unicode capability of the application - Eclipse does support displaying Unicode characters in the console output, with a workaround :).
  • The last is the font used in your console output - not all fonts come with glyphs for every Unicode character.

Update

The non-display of Unicode characters is most likely due to the fact that Cp1252 is used for encoding characters in the console output. This can be modified by visiting the Run configuration of the application - it appears in the Common tab of the run-time configuration.
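To see why a Cp1252 console would produce question marks, it can help to push a string through that charset and back. This is just an illustrative sketch: `ConsoleEncodingDemo` and `roundTrip` are invented names, and it assumes the JRE provides the charset under the name `windows-1252` (the standard name for what Java historically calls Cp1252).

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not taken from the question.
final class ConsoleEncodingDemo {

    /** Encodes text with the charset and decodes it again, roughly what a console pipeline does. */
    static String roundTrip(String text, Charset charset) {
        // getBytes(Charset) replaces characters the charset cannot encode
        // with the charset's default replacement byte.
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Assumption: this charset name is available in the running JRE.
        Charset cp1252 = Charset.forName("windows-1252");

        // An accented Latin letter exists in Cp1252 and survives the round trip.
        System.out.println(roundTrip("\u00e9", cp1252).equals("\u00e9"));

        // Arabic letters do not exist in Cp1252; each one comes back as '?'.
        System.out.println(roundTrip("\u0645\u0631", cp1252));
    }
}
```

Characters outside the Cp1252 repertoire are lost at encoding time, so no font change can bring them back once the output has passed through that charset.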

Vineet Reynolds
  • I probably don't need any workaround, because this works: System.out.println("\u202D\u0645\u0631\u0648\u0629"). – feiroox Mar 30 '09 at 02:05
  • You could find some more tips at this link: http://paranoid-engineering.blogspot.com/2008/05/getting-unicode-output-in-eclipse.html – Vineet Reynolds Mar 30 '09 at 02:12