I am developing a text-to-speech website for Hindi only, and I have run into the following problem without finding a suitable solution.

Input: 102 सुरंगों (for example)

Output: (एक सौ दो)_NSW सà¥à¤°à¤à¤à¥à¤

The numeric part is interpreted correctly, but the text is garbled. The input field itself also changes to the following: 102 सà¥à¤°à¤à¤à¥à¤

The correct output should be: (एक सौ दो)_NSW सुरंगों

This is the code in index.jsp that is called on button click.

function submitPreprocessForm() {
    // Collect the form fields as an array of {name, value} pairs.
    var postData = $("#maryWebClient").serializeArray();

    $.ajax({
        url: "preprocess",
        context: this,
        type: "POST",
        dataType: "json",
        data: postData,
        success: function (response) {
            $("#INPUT_TEXT").val(response["x"]);
            $("#INPUT_TEXT2").val(response["y"]);
        },
        error: function (errorData) { alert(errorData); }
    });
}

And this is part of the servlet, mapped via urlPatterns, that returns the preprocessed text:

Preprocessor.java

protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException, Exception {
        response.setContentType("application/json;charset=utf-8");
        PrintWriter out=response.getWriter();

        String sent=request.getParameter("INPUT_TEXT");
        sent=sent.trim();
        NewJFrame2 njf=new NewJFrame2();  
        String p=njf.read(sent);

        Map m1 = new HashMap(); 
         m1.put("x", sent);
         m1.put("y", p);

        Gson gson = new Gson();
        String json = gson.toJson(m1);
        System.out.println(json);
        out.write(json);
        out.close();
    }

Kindly guide me to my mistake.

NOTE: I have checked the encoding in the NetBeans project properties and in web.xml. Both are set correctly to "UTF-8".

coder0007
  • You have to read the input as UTF-8 as well when using the `request.getParameter()` method. See this question: [HttpServletRequest UTF-8 Encoding](http://stackoverflow.com/questions/16527576/httpservletrequest-utf-8-encoding) – progyammer Oct 04 '16 at 05:32
  • Where in the controller do you send the response from? – Ataur Rahman Munna Oct 04 '16 at 05:37
  • @progy_rock How do I do that? Could you please be a bit more specific? – coder0007 Oct 04 '16 at 05:42
  • @Ataur Rahman Munna Is this what you are asking? – coder0007 Oct 04 '16 at 05:43
  • @coder0007 Did you check the link I gave? – progyammer Oct 04 '16 at 05:44
  • No. Is your web app a plain servlet app, or does it use Spring MVC? – Ataur Rahman Munna Oct 04 '16 at 05:47
  • Yes, I did. I also considered that I may have encoded it more than once, so I removed **response.setContentType("application/json;charset=utf-8");** and got the following result. Input: **102 सुरंगों** Output: **(?? ?? ??)_NSW सुरंगों** – coder0007 Oct 04 '16 at 05:48
  • @Ataur Rahman Munna It is a servlet-only app. I am not using Spring MVC. – coder0007 Oct 04 '16 at 05:48
  • If you already set `response.setContentType("application/json;charset=utf-8");`, then I think your problem resides in the JSP. Put this line at the top of your JSP: `<%@page contentType="text/html" pageEncoding="UTF-8"%>` – Ataur Rahman Munna Oct 04 '16 at 05:53
  • @Ataur Rahman Munna I have already done that. Thank you for your time, but my problem is now solved. – coder0007 Oct 04 '16 at 05:54
  • @progy_rock I implemented the mentioned solution and it did the trick. Thank you for your time and suggestion. – coder0007 Oct 04 '16 at 05:55
  • You should mention how you solved the problem, which may help others who face the same problem. – Ataur Rahman Munna Oct 04 '16 at 06:01
  • @Ataur Rahman Munna As I mentioned in my previous comment, I followed the solution given in the link mentioned by progy_rock. Namely, I added **request.setCharacterEncoding("UTF-8")** before any call to request methods in Preprocessor.java. – coder0007 Oct 04 '16 at 06:06
  • Every post on Stack Overflow should have an accepted answer that is helpful for future visitors. If you found the solution, add an answer or, better, delete the question, as it is likely to be marked as a duplicate. – progyammer Oct 04 '16 at 06:44

2 Answers

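Adding `request.setCharacterEncoding("UTF-8")` before any `getParameter()` call solves this problem. Without it, the container typically falls back to ISO-8859-1 when decoding the POST body, so the UTF-8 bytes sent by the browser are misread.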
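
To illustrate why the call matters, here is a minimal, self-contained sketch that does not use the servlet API at all. It only simulates the decoding step: the hypothetical helper `decodeRequestBody` (not part of the original code) encodes the text as UTF-8, as the browser would, and then decodes the bytes with whatever charset the server assumes. The class and method names are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParameterDecodingDemo {

    // Hypothetical helper: the browser sends the text as UTF-8 bytes, and the
    // server turns those bytes back into a String using the charset it assumes.
    public static String decodeRequestBody(String typedText, Charset assumedByServer) {
        byte[] bytesOnTheWire = typedText.getBytes(StandardCharsets.UTF_8);
        return new String(bytesOnTheWire, assumedByServer);
    }

    public static void main(String[] args) {
        // "102 सुरंगों", written with Unicode escapes so that the example does
        // not depend on the encoding of the source file.
        String input = "102 \u0938\u0941\u0930\u0902\u0917\u094B\u0902";

        // Roughly what happens without setCharacterEncoding("UTF-8").
        String misread = decodeRequestBody(input, StandardCharsets.ISO_8859_1);
        // Roughly what happens once the request encoding is set to UTF-8.
        String correct = decodeRequestBody(input, StandardCharsets.UTF_8);

        System.out.println("misread equals input: " + misread.equals(input));
        System.out.println("correct equals input: " + correct.equals(input));
        // Each Devanagari character occupies three bytes in UTF-8, so the
        // misread string is longer than what was typed.
        System.out.println(input.length() + " chars typed, " + misread.length() + " chars misread");
    }
}
```

In the misread case every UTF-8 byte becomes its own character, which is the kind of garbling shown in the question. In the real servlet, the fix stays the one-line `request.setCharacterEncoding("UTF-8")` call placed before the first `getParameter()`.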

coder0007

Change the very first line of the JSP page from

<%@ page language="java" contentType="text/html; charset=ISO-8859-1" pageEncoding="ISO-8859-1"%>

to

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
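
As a rough illustration of why ISO-8859-1 is a problem for Hindi text, the sketch below pushes a Devanagari string through that charset. ISO-8859-1 has no Devanagari characters, so Java substitutes a `?` for each one, which matches the `(?? ?? ??)_NSW` output reported in the comments after the UTF-8 content type was removed. The class and method names are hypothetical and the snippet stands apart from the JSP itself.

```java
import java.nio.charset.StandardCharsets;

public class UnmappableCharsDemo {

    // Hypothetical helper: encodes the text as ISO-8859-1 and decodes it again,
    // which is effectively what an ISO-8859-1 output path does to the text.
    public static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "(एक सौ दो)_NSW", written with Unicode escapes so that the example
        // does not depend on the encoding of the source file.
        String preprocessed = "(\u090F\u0915 \u0938\u094C \u0926\u094B)_NSW";

        // Each unmappable Devanagari character is replaced by '?'.
        System.out.println(throughLatin1(preprocessed)); // prints "(?? ?? ??)_NSW"
    }
}
```

The ASCII parts survive because ISO-8859-1 can represent them; only the Hindi characters are lost, and they cannot be recovered afterwards.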
anilam