I am developing a text-to-speech website for Hindi only. I have run into the following problem and have been unable to find a solution.
Input: 102 सुरंगों (for example)
Output: (एक सौ दो)_NSW सà¥à¤°à¤à¤à¥à¤
While the numeric part is interpreted correctly, the Hindi text comes back garbled. The contents of the input field also change to the following: 102 सà¥à¤°à¤à¤à¥à¤
The correct output should be: (एक सौ दो)_NSW सुरंगों
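The garbled text is consistent with UTF-8 bytes being decoded as ISO-8859-1 somewhere along the way: each Devanagari character is three bytes in UTF-8, and each byte becomes a separate Latin-1 character. Below is a minimal Java sketch that reproduces this kind of corruption. The class and method names (`MojibakeDemo`, `misdecode`, `repair`) are made up for illustration and are not part of my project.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode as UTF-8, then decode the bytes with the wrong charset (ISO-8859-1).
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "102 सुरंगों" written with Unicode escapes so the source file encoding does not matter.
        String original = "102 \u0938\u0941\u0930\u0902\u0917\u094B\u0902";
        String garbled = misdecode(original);

        System.out.println(original.length());                // 11 characters
        System.out.println(garbled.length());                 // 25: one character per UTF-8 byte
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The ASCII part ("102 ") survives because those characters have the same single-byte encoding in both charsets, which matches the observation that only the Hindi part is damaged.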
This is the JavaScript in index.jsp that runs on button click:
    function submitPreprocessForm() {
        // Serialize the form fields into an array of name/value pairs
        var postData = $("#maryWebClient").serializeArray();
        $.ajax({
            url: "preprocess",
            context: this,
            type: "POST",
            dataType: "json",
            data: postData,
            success: function (response) {
                // "x" is the input echoed back, "y" is the preprocessed text
                $("#INPUT_TEXT").val(response["x"]);
                $("#INPUT_TEXT2").val(response["y"]);
            },
            error: function (errorData) { alert(errorData); }
        });
    }
And this is part of the servlet, mapped via urlPatterns, that returns the preprocessed text:
Preprocessor.java
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException, Exception {
        response.setContentType("application/json;charset=utf-8");
        PrintWriter out = response.getWriter();
        // Read the submitted text
        String sent = request.getParameter("INPUT_TEXT");
        sent = sent.trim();
        // Run the preprocessing step
        NewJFrame2 njf = new NewJFrame2();
        String p = njf.read(sent);
        // Return both the original input ("x") and the preprocessed text ("y") as JSON
        Map<String, String> m1 = new HashMap<>();
        m1.put("x", sent);
        m1.put("y", p);
        Gson gson = new Gson();
        String json = gson.toJson(m1);
        System.out.println(json);
        out.write(json);
        out.close();
    }
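One thing I have not ruled out is the charset used when the servlet decodes the POST body for `request.getParameter("INPUT_TEXT")`. As far as I know, jQuery sends form data percent-encoded as UTF-8, and a servlet container falls back to ISO-8859-1 when it does not pick up a request encoding, unless `request.setCharacterEncoding("UTF-8")` is called before the first `getParameter` call. The sketch below only simulates that decoding step with `java.net.URLDecoder`. It is not the container's actual code, and `FormDecodingDemo` and its methods are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {

    // Percent-encode a form value as UTF-8, as a browser form post would.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Decode a percent-encoded value using the given charset name.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "102 सुरंगों" written with Unicode escapes.
        String input = "102 \u0938\u0941\u0930\u0902\u0917\u094B\u0902";
        String wire = encodeUtf8(input);

        // Assumed server default: the bytes are interpreted as ISO-8859-1.
        String wrong = decode(wire, "ISO-8859-1");
        // What an explicit UTF-8 request encoding would produce instead.
        String right = decode(wire, "UTF-8");

        System.out.println(wire.startsWith("102+%E0%A4%B8")); // true
        System.out.println(wrong.equals(input));              // false
        System.out.println(right.equals(input));              // true
    }
}
```

If this is what happens in the servlet, both `sent` and the value returned by `njf.read(sent)` would already be corrupted before the JSON is written, which would explain why the echoed input changes as well.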
Could someone point out my mistake?
NOTE: I have checked the encoding in the NetBeans project properties and in web.xml. Both are set correctly to "UTF-8".