1

When I compile the following code, I get no compile-time error, but the output is displayed as "?????".

I compiled the code with javac hinditest.java

Is there any way to get the output in the language I entered (Hindi)?

public class hinditest {
    public static void main(String[] args) {
        String tst = "पाततद";
        System.out.print(tst);
    }
}

Thanks in advance.

GVR

5 Answers

2

A ? means the character could not be represented. This happens when the charset used for output doesn't support the character. Check that your terminal's encoding is UTF-8. You can open the terminal with screen -U and run your code there.
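To illustrate where the question marks can come from, here is a minimal sketch (the class and method names are made up for this example). It encodes the string into US-ASCII, a charset that has no Devanagari characters, so each of the five characters is replaced by `?`. The literal is written with Unicode escapes so the result does not depend on how the source file itself is encoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the question's code.
class UnmappableDemo {
    // Encode to US-ASCII and decode again; characters the charset
    // cannot represent are replaced with '?' during encoding.
    static String roundTripAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String tst = "\u092A\u093E\u0924\u0924\u0926"; // same text as in the question
        System.out.println(roundTripAscii(tst)); // prints ?????
    }
}
```

The same substitution can happen when System.out encodes text for a console whose charset lacks the characters.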

Keerthivasan
2

You can try this:

public static void main(String[] args) {
    System.setProperty("file.encoding", "UTF-8");
    String tst = "पाततद";
    System.out.print(tst);
}

If you are using Eclipse, you can set the encoding under Run Configuration -> Common -> Encoding -> select UTF-8.
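Setting file.encoding from inside main may come too late on some JVMs, because System.out already exists by then. A possible alternative, sketched here with made-up class and helper names, is to wrap the output stream in a PrintStream with an explicit charset. The terminal still has to be able to display UTF-8 for the text to show up correctly.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for this sketch.
class Utf8Out {
    // Wrap any stream in a PrintStream that always encodes text as UTF-8.
    static PrintStream utf8(OutputStream out) throws UnsupportedEncodingException {
        return new PrintStream(out, true, "UTF-8");
    }

    // Capture the bytes the wrapped stream would emit for a string.
    static byte[] encodeViaStream(String s) throws UnsupportedEncodingException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = utf8(buf);
        ps.print(s);
        ps.flush();
        return buf.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        PrintStream out = utf8(System.out);
        out.println("\u092A\u093E\u0924\u0924\u0926"); // the string from the question
    }
}
```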

Shekhar Khairnar
  • @Shekhar +1, but I would recommend using `UTF-16`; it will include all the characters that are possible. – Vishrant Mar 28 '14 at 12:08
1

Please see the Stack Overflow question below on encoding your output: Java: How to detect (and change?) encoding of System.console?

Community
mpop
0
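// Needs: import java.nio.charset.StandardCharsets;
String tst = "पाततद";
// Encode to UTF-8 bytes, then decode them with the same explicit charset.
byte[] array = tst.getBytes(StandardCharsets.UTF_8);
String s = new String(array, StandardCharsets.UTF_8);
System.out.println(s);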

Without an explicit charset, the String(byte[]) constructor cannot tell which encoding the bytes are in and decodes them with the platform default charset, which is often something like ASCII or ISO-8859-1. This is why plain A-Za-z looks correct while everything else fails.

A byte ranges from -128 to 127, so in UTF-8 a character outside the ASCII range is encoded as a sequence of several bytes. The String constructor cannot infer this from the byte array alone, so with a single-byte default charset it decodes each byte individually (which is why basic alphanumerics always work, as they fit in a single byte).
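As a rough illustration of the byte-level picture (class and method names are invented for this sketch): each Devanagari character in the question's string takes three bytes in UTF-8. Decoding those bytes with a single-byte charset such as ISO-8859-1 produces one wrong character per byte, while decoding with UTF-8 simply gives back the original string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for this sketch.
class DecodeDemo {
    // Encode as UTF-8, then decode with the given charset.
    static String encodeUtf8ThenDecode(String s, Charset decodeWith) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String tst = "\u092A\u093E\u0924\u0924\u0926"; // 5 chars
        byte[] utf8 = tst.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 15: three bytes per character

        // Same charset on both sides: the original string comes back.
        String same = encodeUtf8ThenDecode(tst, StandardCharsets.UTF_8);
        System.out.println(same.equals(tst)); // true

        // Single-byte charset: every byte becomes its own (wrong) character.
        String garbled = encodeUtf8ThenDecode(tst, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 15
    }
}
```

Since the matching round trip returns an equal string, what appears on the console still depends on how the output stream encodes it.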

darijan
  • This is ridiculous. A String is a String, it has no encoding---or if you want, it has always the same encoding, defined by the JLS. – Marko Topolnik Mar 28 '14 at 12:08
  • A book is a book and a plane is a plane. What's your point? And no, actually a String is not a String but rather an array of chars. If you still don't believe me, think about refreshing your Java knowledge. Also, try putting a different encoding into the parentheses (e.g. ISO-8859-1) and you will see the output on the console MAGICALLY change! – darijan Mar 28 '14 at 12:11
  • Of course, because you have destroyed the original string. – Marko Topolnik Mar 28 '14 at 12:16
  • BTW "a String is an array of chars"---and what is a char? How do you change the encoding of a char? Answer: *you can't* because that's not how Java works. – Marko Topolnik Mar 28 '14 at 12:16
  • Please, please study this method `java.lang.StringCoding.decode(Charset cs, byte[] ba, int off, int len);`. Cheers. – darijan Mar 28 '14 at 12:18
  • May you live to be 100 :) but your answer still doesn't make sense. You can't "fix" a string by dumping it into a byte array and then reading that array back into another string. At best you end up with the same string, otherwise you get mojibake. – Marko Topolnik Mar 28 '14 at 12:20
0

You can specify the encoding property on the command line as follows:

java -Dfile.encoding=UTF-8 hinditest
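To check whether the flag was picked up, a small sketch like the following (class and method names are invented here) prints the JVM's default charset together with the file.encoding property. Running it with and without -Dfile.encoding=UTF-8 shows what the JVM is actually using on your system.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class for this sketch.
class ShowCharset {
    // Report the charset the JVM uses by default and the raw property value.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```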
r3ap3r