
I have to display text in Hindi (or any regional language) on the browser screens. I will be getting this text from the database.

For this I started at a very basic level with the following:

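// StringEscapeUtils comes from Apache Commons Lang.
// The doubled backslashes keep the escape sequences as literal text;
// unescapeJava() then turns them into the real Devanagari characters.
String escapedStr = "\\u0905\\u092d\\u0940 \\u0938\\u092e\\u092f \\u0939\\u0948 \\u091c\\u0928\\u0924\\u093e";
String hindiText = StringEscapeUtils.unescapeJava(escapedStr);
System.out.println(hindiText);
return hindiText;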

The Hindi text is stored perfectly fine in the variable hindiText. But when I print it to the Eclipse console or to the browser screen, I get only ???? ?? ??

I set the default character encoding of both my browser and my Eclipse console to Unicode (UTF-8), but still no success.

Can anyone help me solve this? What setting am I missing?

Just FYI: I am able to open Hindi websites in my browser, so the language settings are not the issue.

EDIT

As I am using JSP files for my views, I have added the following to my web.xml for setting the character encoding globally. Ref: Followed this

<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

But still no success!

DarkKnightFan
  • If you get `???? ?? ??` both with the literal and database, then you can probably rule out the database. Judging from the fact that you are getting an ASCII question mark per character outside common default charsets, there is probably some configuration that you are missing. – Esailija Feb 08 '13 at 12:18
  • Yeah, I am not worried about the database for now. As shown in the code, I am using a hard-coded string (standing in for what I will get from the DB) and converting it into a string that I can display on screen. I am also checking which config I have missed. – DarkKnightFan Feb 08 '13 at 12:21
  • See "question mark" in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Feb 02 '21 at 05:40

3 Answers

But when I print it on eclipse console or on the browser screen I get only ???? ?? ??

As for the Eclipse part, you need to tell it to use UTF-8 for its stdout console. You can set that via Window > Preferences > General > Workspace > Text File Encoding.

As for the JSP part, you need to tell it to use UTF-8 to write the HTTP response body. You can set that either by

<%@page pageEncoding="UTF-8"%>

in every individual JSP, or application-wide by

<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

in web.xml.

BalusC
  • Does the text file encoding affect anything else than ...text file encoding? Because his text file is exactly the same in UTF-8 as it is in any of those encodings except UTF-16 variants. – Esailija Feb 08 '13 at 15:03
  • @Esailija: as said, it also affects the stdout encoding (the stdout is where `System.out` points to). – BalusC Feb 08 '13 at 18:47
  • @BalusC I have tried the above-mentioned steps for both Eclipse and JSP, but I am still getting the same output in both. Am I missing something here? – DarkKnightFan Feb 11 '13 at 04:28
  • Works fine for me. What if you do `String hindiText = "अभी समय है जनता"; System.out.println(hindiText);`? – BalusC Feb 11 '13 at 12:00

Write your code this way:

System.out.println("\u0905\u092d\u0940\u0938\u092e\u092f\u0939\u0948\u091c");
SkyWalker
scofield

The problem is that the output streams you use probably have the "default" encoding of your platform. On Windows, this is often some crappy MS 8-bit code page that has no Devanagari characters at all.
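
To illustrate where the question marks likely come from, here is a minimal sketch. It assumes the stream behaves like an encode step into a charset that has no Devanagari characters; ISO-8859-1 stands in for the platform default, and the class and method names are made up for the demo. Java replaces every unmappable character with `?` when encoding this way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class QuestionMarkDemo {

    // Encodes the text to bytes with the given charset and decodes it again,
    // which is roughly what happens between a stream and the console.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String hindiText = "\u0905\u092d\u0940 \u0938\u092e\u092f \u0939\u0948";

        // Shows which encoding System.out is likely to use by default.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // ISO-8859-1 cannot represent Devanagari, so each character becomes '?'.
        System.out.println(roundTrip(hindiText, StandardCharsets.ISO_8859_1)); // prints ??? ??? ??

        // UTF-8 can represent every character, so the text survives unchanged.
        System.out.println(roundTrip(hindiText, StandardCharsets.UTF_8).equals(hindiText)); // prints true
    }
}
```

If the first line reports something other than UTF-8, that is a strong hint that the stream, and not the string, is the culprit.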

Make it a habit to print your text through PrintWriters, and make sure they are created with the correct encoding.
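
A minimal sketch of that habit, assuming UTF-8 is the encoding you want. The helper name `utf8Writer` is made up for this example. The console or terminal still has to interpret the bytes as UTF-8 for the glyphs to show up.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class Utf8WriterDemo {

    // Wraps any byte stream in a PrintWriter that always encodes as UTF-8,
    // independent of the platform default. autoFlush makes println() flush.
    static PrintWriter utf8Writer(OutputStream out) {
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        String hindiText = "\u0905\u092d\u0940 \u0938\u092e\u092f \u0939\u0948 \u091c\u0928\u0924\u093e";

        // Print through the UTF-8 writer instead of calling System.out.println directly.
        PrintWriter console = utf8Writer(System.out);
        console.println(hindiText);
    }
}
```

The same wrapper works for a file or socket stream, which keeps the encoding decision in one place instead of relying on whatever the platform happens to default to.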

Ingo