This is my first post here. I'm building a simple console messaging app (Windows cmd and Linux terminal) as a learning exercise, but I've run into a problem reading and writing text with a specific charset.

Here is my initial code for sending messages; Main.CHARSET is set to UTF-8:

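Scanner teclado = new Scanner(System.in, Main.CHARSET);
BufferedWriter saida = new BufferedWriter(new OutputStreamWriter(new BufferedOutputStream(cliente.getOutputStream()), Main.CHARSET));
saida.write(nick + " conectado!");
saida.newLine(); // terminate the line so the receiver's readLine() returns
saida.flush();
while (teclado.hasNextLine()) {
    String s = teclado.nextLine(); // read the line the user typed
    saida.write(nick + ": " + s);
    saida.newLine();
    saida.flush();
}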

And the receiving code:

try (BufferedReader br = new BufferedReader(new InputStreamReader(servidor,Main.CHARSET))){
    String s;
    while ((s = br.readLine()) != null) {
        System.out.println(s);
    }
}

When I send "olá" or anything like "ÁàçÇõÉ" (Brazilian Portuguese), I get only blank spaces in the Windows cmd console (not tested on Linux).

So I tested the following code:

Scanner s = new Scanner(System.in,Main.CHARSET);
System.out.println(s.nextLine());

For the input "olá", it printed "ol ".

The question is: how can I read from the console so that the input is decoded correctly, can be transmitted to another user, and is displayed correctly for them?

ehbarbian
  • Are you sure the Windows console can actually display UTF-8 (or defaults to UTF-8)? You can check by simply writing the output to a file instead of to the screen and then looking at the file with a UTF-8-capable editor. – pvg Jan 06 '16 at 00:41
  • I wrote the output to a .txt file using an OutputStreamWriter with the UTF-8 charset, and the output was: ol� – ehbarbian Jan 06 '16 at 01:10

1 Answer

If you just want to write Portuguese text to a text file, that is easy.

The only thing you have to take care of is that the file is displayed using the UTF-8 encoding.

A really simple way is:

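    // needs: import java.io.*;
    String text = "olá";
    // FileWriter(String) uses the platform default charset, so name UTF-8 explicitly
    try (Writer fw = new OutputStreamWriter(new FileOutputStream("hello.txt"), "UTF-8")) {
        fw.write(text);
    }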

Then open hello.txt in Notepad or any text tool that supports UTF-8,

or change your tool's default encoding to UTF-8.


If you want to show it on the console, I think pvg has already answered that.


OK, it seems you are still confused about this.

Here is some simple code you can try.

        Scanner userInput = new Scanner(System.in); // type olá when prompted
        String text = userInput.next();
        System.out.println((int) text.charAt(2)); // prints 63 here instead of 225

        char word = 'á'; // as an int, this character is 225
        int a = 225;
        System.out.println((int) word); // prints 225
        System.out.println((char) a);   // prints á

So, what is the conclusion?

If you type Portuguese text into the console and then read it, you get a completely different character, not just a garbled rendering of the right one.
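
One way to see where the different numbers may come from is to decode the same raw bytes with different charsets. The sketch below assumes the console delivers "olá" with á as the single byte 0xA0, which is its code in code page 850 (the Windows 7 console encoding mentioned in the comments). The class and method names are made up for illustration, and ISO-8859-1 stands in for a single-byte Western default such as windows-1252, which maps 0xA0 to the same character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeMismatchDemo {
    // Decodes the raw bytes with the given charset and returns the
    // numeric code of the third character of the resulting string.
    public static int thirdCharCode(byte[] raw, Charset charset) {
        return new String(raw, charset).charAt(2);
    }

    public static void main(String[] args) {
        // Assumption: these are the bytes a cp850 console sends for "olá".
        byte[] raw = { 'o', 'l', (byte) 0xA0 };

        // 0xA0 alone is not valid UTF-8, so it becomes U+FFFD.
        System.out.println(thirdCharCode(raw, StandardCharsets.UTF_8));      // 65533
        // In ISO-8859-1, 0xA0 is the no-break space, which looks blank.
        System.out.println(thirdCharCode(raw, StandardCharsets.ISO_8859_1)); // 160

        // IBM850 is an optional charset, so check before using it.
        if (Charset.isSupported("IBM850")) {
            System.out.println(thirdCharCode(raw, Charset.forName("IBM850"))); // 225
        }
    }
}
```

If that assumption holds, the input bytes are fine and only the charset used to decode them is wrong, which would also explain why "olá" shows up as "ol " with a blank in the last position.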

TomN
  • It works fine for a hard-coded string, but if I read the text from System.in, it doesn't work. The output is just "ol?" or something like "ol�" – ehbarbian Jan 06 '16 at 02:10
  • OK, check the answer again and make sure you understand why it always goes wrong. – TomN Jan 06 '16 at 04:08
  • By the way, ASCII code 63 is the question mark, so it shows up as '?'. – TomN Jan 06 '16 at 04:11
  • OK, I got it. I had the same error here, but instead of 63 I got 160 on the Windows 7 console and 65533 on the NetBeans console; it must be because of their different encodings. So is there a way to read it correctly? Can I read as UTF-8 from standard input? Or is that impossible because the console uses another encoding (cp850 on Windows 7)? – ehbarbian Jan 06 '16 at 13:59
  • @ehbarbian OK, there are several topics that may help you. For the NetBeans console: http://stackoverflow.com/questions/23709515/netbeans-java-console-encoding-utf-8-and-umlauts or http://stackoverflow.com/questions/23726899/change-console-input-encoding-in-netbeans-8-0 – TomN Jan 07 '16 at 00:38
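
As a rough sketch of the idea raised in the last comments: decode standard input with whatever charset the console really uses, and encode as UTF-8 only on the outgoing stream. The class `ConsoleRelay`, the method `relay`, and the convention of passing the console charset as the first program argument (for example IBM850) are assumptions made for illustration; they are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleRelay {
    // Reads lines from `in`, decoded with the console's charset, and
    // writes each one to `out` as UTF-8 followed by '\n'.
    public static void relay(InputStream in, Charset consoleCharset, OutputStream out)
            throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, consoleCharset));
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(line);
            writer.write('\n');
            writer.flush(); // send each line immediately
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the console code page is given as the first argument;
        // otherwise fall back to the JVM default charset.
        Charset consoleCharset = args.length > 0
                ? Charset.forName(args[0])
                : Charset.defaultCharset();
        // In the chat client, `out` would be the socket's output stream;
        // here it is System.out, so redirect to a file to inspect the bytes.
        relay(System.in, consoleCharset, System.out);
    }
}
```

The receiving side can keep reading the socket as UTF-8, but it would need the mirror-image step when printing, so that the text is encoded for that user's console rather than written out as raw UTF-8.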