83

I learned from Google that internationalization is the process of making my web application usable in all languages. I want to understand Unicode as part of that process, so I have read about it in various places.

I understand how Unicode works, in that characters are encoded to bytes and the bytes are decoded back to characters, but I don't know how to move forward from there. I want to learn how to compare strings, and I need to know how to implement internationalization in my web application. Any suggestions? Please guide me.

My Objective:

My main objective is to develop a web application for translation (English to Arabic and vice versa). I want to follow internationalization practices, and I want the application to run in all three browsers: Firefox, Chrome, and IE. How do I achieve this?

BalusC
IamIronMAN

3 Answers

223

In case of a basic JSP/Servlet web application, the basic approach would be to use the JSTL fmt taglib in combination with resource bundles. Resource bundles contain key-value pairs where the key is a constant which is the same for all languages and the value differs per language. Resource bundles are usually properties files which are loaded by the ResourceBundle API. This can however be customized so that you can load the key-value pairs from, for example, a database.

Here's an example of how to internationalize the login form of your web application with properties-file-based resource bundles.
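As a rough sketch of the key-value lookup described above, the following parses text in properties syntax into a bundle and reads a value by key. The class name `BundleSketch` and its `load` helper are made up for illustration; in a real application you would normally let `ResourceBundle.getBundle(baseName, locale)` locate the files on the classpath instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper, for illustration only.
final class BundleSketch {

    // Parses text written in .properties syntax into a ResourceBundle.
    static ResourceBundle load(String propertiesText) {
        try (StringReader reader = new StringReader(propertiesText)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle dutch = load("login.label.username = Gebruikersnaam\n"
                + "login.label.password = Wachtwoord\n");
        // The key is the same in every language; only the value differs per bundle.
        System.out.println(dutch.getString("login.label.username"));
    }
}
```

Asking for a key that is not in the bundle (or in any parent bundle) throws `MissingResourceException`, so it pays to keep the key sets of all language files in sync.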


  1. Create the following files and put them in some package, e.g. com.example.i18n (in case of Maven, put them in the package structure inside src/main/resources).

    text.properties (contains key-value pairs in the default language, usually English)

     login.label.username = Username
     login.label.password = Password
     login.button.submit = Sign in
     

    text_nl.properties (contains Dutch (nl) key-value pairs)

     login.label.username = Gebruikersnaam
     login.label.password = Wachtwoord
     login.button.submit = Inloggen
     

    text_es.properties (contains Spanish (es) key-value pairs)

     login.label.username = Nombre de usuario
     login.label.password = Contraseña
     login.button.submit = Acceder
     

    The resource bundle filename should adhere to the pattern name_ll_CC.properties. The _ll part should be the lowercase ISO 639-1 language code. It is optional and only required whenever the _CC part is present. The _CC part should be the uppercase ISO 3166-1 Alpha-2 country code. It is optional and often only used to distinguish between country-specific language dialects, like American English (_en_US) and British English (_en_GB).


  2. If not done yet, install JSTL as per instructions in this answer: How to install JSTL? The absolute uri: http://java.sun.com/jstl/core cannot be resolved.


  3. Create the following example JSP file and put it in the web content folder.

    login.jsp

     <%@ page pageEncoding="UTF-8" %>
     <%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
     <%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
     <c:set var="language" value="${not empty param.language ? param.language : not empty language ? language : pageContext.request.locale}" scope="session" />
     <fmt:setLocale value="${language}" />
     <fmt:setBundle basename="com.example.i18n.text" />
     <!DOCTYPE html>
     <html lang="${language}">
         <head>
             <title>JSP/JSTL i18n demo</title>
         </head>
         <body>
             <form>
                 <select id="language" name="language" onchange="submit()">
                     <option value="en" ${language == 'en' ? 'selected' : ''}>English</option>
                     <option value="nl" ${language == 'nl' ? 'selected' : ''}>Nederlands</option>
                     <option value="es" ${language == 'es' ? 'selected' : ''}>Español</option>
                 </select>
             </form>
             <form method="post">
                 <label for="username"><fmt:message key="login.label.username" />:</label>
                 <input type="text" id="username" name="username">
                 <br>
                 <label for="password"><fmt:message key="login.label.password" />:</label>
                 <input type="password" id="password" name="password">
                 <br>
                 <fmt:message key="login.button.submit" var="buttonValue" />
                 <input type="submit" name="submit" value="${buttonValue}">
             </form>
         </body>
     </html>
    

    The <c:set var="language"> manages the current language. If the language was supplied as request parameter (by language dropdown), then it will be set. Else if the language was already previously set in the session, then stick to it instead. Else use the user supplied locale in the request header.

    The <fmt:setLocale> sets the locale for the resource bundle. It's important that this line comes before the <fmt:setBundle>.

    The <fmt:setBundle> initializes the resource bundle by its base name (that is, the fully qualified package name followed by the simple file name, without the _ll_CC specifier and the file extension).

    The <fmt:message> retrieves the message value by the specified bundle key.

    The <html lang="${language}"> informs the searchbots what language the page is in so that it won't be marked as duplicate content (thus, good for SEO).

    The language dropdown will immediately submit by JavaScript when another language is chosen and the page will be refreshed with the newly chosen language.
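The fallback order used by the `<c:set var="language">` line can be sketched in plain Java as follows. The class and method names are hypothetical. One deliberate deviation from the JSP: the last fallback returns a BCP 47 tag such as `en-US` (via `Locale.toLanguageTag()`) rather than the `en_US` form, on the assumption that the value also ends up in the HTML `lang` attribute.

```java
import java.util.Locale;

// Hypothetical helper mirroring the request parameter -> session -> request locale order.
final class LanguageResolver {

    static String resolve(String requestParam, String sessionValue, Locale requestLocale) {
        if (requestParam != null && !requestParam.isEmpty()) {
            return requestParam;              // chosen via the language dropdown
        }
        if (sessionValue != null && !sessionValue.isEmpty()) {
            return sessionValue;              // chosen earlier in this session
        }
        return requestLocale.toLanguageTag(); // derived from the Accept-Language header
    }
}
```

In a servlet or filter the three arguments would typically come from `request.getParameter("language")`, the session attribute, and `request.getLocale()`.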


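You however need to keep in mind that, up to and including Java 8, properties files are by default read using ISO-8859-1 character encoding, so any character outside that range has to be written as a Unicode escape (\uXXXX). This can be done using the JDK-supplied native2ascii.exe tool (it was removed in JDK 9). Since Java 9, ResourceBundle reads properties files as UTF-8 first and only falls back to ISO-8859-1 when the content is not valid UTF-8, so the escaping is no longer required there. See also this article section for more detail.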
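To make the escaping concrete, here is a minimal sketch of the transformation: every non-ASCII character is replaced by its `\uXXXX` form. The class `UnicodeEscaper` is a hypothetical stand-in for illustration, not the actual tool.

```java
// Hypothetical stand-in that shows what escaping a properties value looks like.
final class UnicodeEscaper {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);                                  // plain ASCII stays as-is
            } else {
                out.append(String.format("\\u%04x", (int) c));  // e.g. U+00F1 becomes \u00f1
            }
        }
        return out.toString();
    }
}
```

Applied to the Spanish bundle, the value of `login.label.password` would be stored as `Contrase\u00f1a`.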

A theoretical alternative would be to supply a bundle with a custom Control to load those files as UTF-8, but that's unfortunately not supported by the basic JSTL fmt taglib. You would need to manage it all yourself with help of a Filter. There are (MVC) frameworks which can handle this in a more transparent manner, like JSF, see also this article.
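For reference, a custom Control along those lines could look roughly like the sketch below. The class name `Utf8Control` is an assumption, the `reload` flag is ignored for brevity, and the sketch only covers properties-based bundles.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Sketch of a Control that decodes .properties files as UTF-8.
class Utf8Control extends ResourceBundle.Control {

    @Override
    public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                    ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException, IOException {
        if (!"java.properties".equals(format)) {
            // Class-based bundles keep the default behavior.
            return super.newBundle(baseName, locale, format, loader, reload);
        }
        String resourceName = toResourceName(toBundleName(baseName, locale), "properties");
        try (InputStream stream = loader.getResourceAsStream(resourceName)) {
            return stream == null ? null : readUtf8(stream);
        }
    }

    // Reads key-value pairs from the stream, decoding the bytes as UTF-8.
    static ResourceBundle readUtf8(InputStream stream) {
        try (Reader reader = new InputStreamReader(stream, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

It would be passed to the three-argument lookup, for example `ResourceBundle.getBundle("com.example.i18n.text", locale, new Utf8Control())`, which is exactly the call the JSTL fmt taglib gives you no way to customize.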

BalusC
  • This nice solution has one issue: the locale taken from the request can be language and country, as in "en_US", which would give `<html lang="en_US">`, which is invalid HTML. It is necessary to use only the language part "en" from the locale as value for the lang attribute. – Torsten Römer Nov 21 '13 at 20:47
  • Just thought to share that TapiJI (https://code.google.com/a/eclipselabs.org/p/tapiji/) maybe a good option for editing resource bundles/files without worrying about character encoding. – Hisham Apr 26 '14 at 14:25
  • How to set by default a language other than English, for example Chinese? – MichalB Dec 10 '14 at 05:49
  • The method outlined above for internationalization does not amend the url based on the language displayed. Do you have any suggestions for how to update the url according to the language. I ask because, for indexing, it is recommended that different languages have separate urls: https://support.google.com/webmasters/answer/182192?hl=en&topic=2370587&ctx=topic – theyuv Feb 23 '16 at 15:20
  • @theyuv: just alter `param.language` to be the one from `pathInfo`. This answer is after all just a kickoff example. – BalusC Feb 23 '16 at 15:33
  • Thanks, but I'm not sure I understand. At the moment, I use a session variable to determine the language in which the page is displayed, so there is no evidence of the language that appears in the URL. Are you suggesting that I update all paths in the jsp by incorporating the language in the urls (eg: www.domain.com/register becomes www.domain.com/us/register)? – theyuv Feb 23 '16 at 16:10
  • To elaborate on my previous comment: I meant that I would change all urls on my jsps to incorporate the session scoped jstl language attribute. So instead of `` I would have `` – theyuv Feb 23 '16 at 16:29
  • 1
    If you put your language resource (test.properties and text_en.properties files) files at application/resources root, you can set the fmt:bundle like this: – Bahadir Tasdemir Apr 28 '16 at 08:43
  • @bahadirT: assuming "test" is a typo, that's correct. The `basename` must represent the base name without file extension. Not structuring it in a package is only a poor practice. – BalusC Apr 28 '16 at 08:44
  • Yes that's a typo :) But you must structure it @ resources directory, because texts are resources for your app. Codes go into the packages ;) – Bahadir Tasdemir Apr 28 '16 at 09:30
  • @bahadirT: resources also supports package structures. It ultimately ends up in classpath too. Not packaging it is bad practice because it's not uniquely identifiable anymore and thus there's an increased risk in conflicts when multiple resources with same base name appear in runtime classpath. – BalusC Apr 28 '16 at 09:32
  • @BalusC yes that's a risk, I just think on the behalf of simplification and system logic: resources --> application resources | codes --> packages. I will read more about the conflicts of resources, thanks. – Bahadir Tasdemir Apr 28 '16 at 09:54
  • @BalusC I understood more clearly when reading twice, I think I didn't know the packaging structure of the resources (thought they were inside the folders where codes are stored). That's good, I was using it as my/seperate/resources but package notation is more appropriate. – Bahadir Tasdemir Apr 28 '16 at 10:17
  • Not working, can you tell me the `com.example.i18n.text` function body? I used `ResourceBundle res = ResourceBundle.getBundle("ResourceBundle", locale);` in the method to read the I18N properties. Please help to resolve my fault – Prajwal Bhat Oct 06 '16 at 07:23
  • works perfectly. only change is i had to remove the tag from code. – Jakki Oct 30 '17 at 07:26
  • Is there any recommendation for how to handle plural/singular in the jsp (eg: "User has {0} reviews" isn't correct when the param is "1")? Should we just use something like `c:choose`? – theyuv Dec 10 '17 at 13:21
  • 1
    @theyuv: `User has {0} review{0,choice,0#s|1#|1 – BalusC Dec 10 '17 at 16:39
  • @BalusC Thanks! Had to use `fmt:parseNumber` on the param to get it working. – theyuv Dec 11 '17 at 11:20
  • May I know how to refresh the entire page when I choose the language from list box – Clinton Prakash Feb 20 '18 at 13:30
  • Nowadays *.properties are on first try read as **UTF-8**. Only if there is an encoding error, ISO-8859-1 (Latin-1) is tried. In both cases`\uXXXX` encoding is still possible, though no longer needed. Hurrah! – Joop Eggen Sep 02 '22 at 12:32
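On the singular/plural question raised in the comments above, `java.text.MessageFormat` supports a `choice` sub-format that can add the plural "s" depending on the number. The sketch below uses a pattern of that kind; the exact pattern and the class name `PluralSketch` are assumptions for illustration, and the approach only fits languages with simple plural rules.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Illustrative only: appends "s" for every count except exactly 1.
final class PluralSketch {

    // Assumed pattern; in practice it would live in the resource bundle as a message value.
    static final String PATTERN = "User has {0} review{0,choice,0#s|1#|1<s}";

    static String reviews(int count) {
        return new MessageFormat(PATTERN, Locale.ENGLISH).format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(reviews(1));
        System.out.println(reviews(3));
    }
}
```

The argument has to be a number for the `choice` sub-format to work, which matches the remark above about having to run the parameter through `fmt:parseNumber` first.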
26

In addition to what BalusC said, you have to take care of directionality (since English is written left-to-right and Arabic right-to-left). The easiest way would be to add a dir attribute to the html element of your JSP page and externalize it, so the value comes from the properties file (just like with other elements or attributes):

<html dir="${direction}">
...
</html>
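If you would rather derive the value in code than store it in each properties file, a small helper can map the language to a direction. This is only a sketch: the class name `TextDirection` and the short list of right-to-left languages are assumptions and far from exhaustive.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper that yields the value for the dir attribute.
final class TextDirection {

    // Arabic, Persian, Hebrew ("iw" is the legacy code older JDKs report) and Urdu.
    private static final Set<String> RTL_LANGUAGES =
            new HashSet<>(Arrays.asList("ar", "fa", "he", "iw", "ur"));

    static String of(Locale locale) {
        return RTL_LANGUAGES.contains(locale.getLanguage()) ? "rtl" : "ltr";
    }
}
```

The result can be stored as a request or session attribute named `direction` so that `${direction}` in the JSP resolves to it.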

Also, there are a few issues with styling such an application: at the very least, you should avoid absolute positioning. If you cannot avoid it for some reason, you could either use a different stylesheet per language or do something that is verboten, that is, use tables for managing layout. If you want to use div elements, I'd suggest using relative positioning with "symmetric" left and right style attributes (both having the same value), since this is what makes switching directionality work.

You could find more about Bi-Directional websites here.

Paweł Dyda
2

Based on this tutorial, I am using the following on GAE (Google App Engine):

A JSP file as follows:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<%@ page import="java.util.Locale, java.util.ResourceBundle" %>
<%
  // Assign the correct language, e.g. per page, user-selected or from the browser language.
  String lang = "fr";
  // Loads app_fr.properties and falls back to app.properties for missing keys.
  ResourceBundle RB = ResourceBundle.getBundle("app", new Locale(lang));
%>
<!DOCTYPE html>
<html>
<head>
</head>
<body>
  <p>
    <%= RB.getString("greeting") %>
  </p>
</body>
</html>

Then add the files app.properties (default) and app_fr.properties (and so on for every language) to the classpath. Each of these files should contain the strings you need in the form key=value_in_language, e.g. app_fr.properties contains:

greeting=Bonjour!

app.properties contains:

greeting=Hello!

That's all
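For the "browser language" option mentioned in the JSP comment, the JDK can match an Accept-Language header against the languages you actually ship. The following is a minimal sketch; the class name `LanguageNegotiator`, the supported list (English and French) and the English default are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper that picks a supported language from an Accept-Language header value.
final class LanguageNegotiator {

    // Assumed set of languages for which an app_xx.properties (or the default) exists.
    static final List<Locale> SUPPORTED = Arrays.asList(Locale.ENGLISH, Locale.FRENCH);

    static String pick(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return "en";
        }
        try {
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            Locale best = Locale.lookup(ranges, SUPPORTED);
            return best != null ? best.getLanguage() : "en";
        } catch (IllegalArgumentException e) {
            return "en"; // malformed header: fall back to the default language
        }
    }
}
```

In the JSP, `String lang = LanguageNegotiator.pick(request.getHeader("Accept-Language"));` would then replace the hard-coded "fr".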

Ronen Rabinovici