
I have a utility class, TestCracker. Its testInput method takes some text, sends a request to a translation service with that text as a parameter, and returns the response as a JSON string:

public class TestCracker  {
    private String ACCESS_TOKEN = "XXXXXXXXXX";

    public static void main(String[] args) {
        System.out.println(new TestCracker().testInput("Lärm"));
    }

    public String testInput(String text)  {
        String translateLink = "https://translate.yandex.net/api/v1.5/tr.json/translate" +
                "?key=" + ACCESS_TOKEN + "&text=" + text +
                "&lang=de-en" + "&format=plain" + "&options=1";

        try {
            URL translateURL = new URL(translateLink);

            HttpURLConnection connection = (HttpURLConnection) translateURL.openConnection();
            setupGETConnection(connection);

            connection.connect();

            InputStream input = connection.getInputStream();
            String inputString = new Scanner(input, "UTF-8").useDelimiter("\\Z").next();
            JSONObject jsonObject = new JSONObject(inputString);

            return text + "; " + inputString;
        }
        catch (Exception e) {
            System.out.println("Couldn't connect " + e);

            return "None";
        }
    }

    private void setupGETConnection(HttpURLConnection connection) throws Exception  {
        connection.setRequestMethod("GET");
        connection.setDoOutput(true);
        connection.setInstanceFollowRedirects(false);
    }
}

In the main method I tried displaying the response JSON for the string Lärm. It works fine:

Lärm; {"code":200,"detected":{"lang":"de"},"lang":"de-en","text":["Noise"]}

However, the result is different when I run the same thing through a servlet and a browser instead of just the IDE:

public class TestServlet extends HttpServlet {
    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        String resultPath;
        request.setCharacterEncoding("UTF-8");

        response.getWriter().print(request.getParameter("input-text2"));
        response.getWriter().println(new TestCracker().testInput(request.getParameter("input-text2")));
    }
}

When run, the TestServlet outputs:

LärmLärm; {"code":200,"detected":{"lang":"en"},"lang":"de-en","text":["L?rm"]}

As can be seen, the word Lärm was read from the form just fine: the echoed input (the first word) is displayed correctly, and testInput received the right word too (the second word). But the response from the translation service (the part after ;) is wrong: the service couldn't translate the word and returned a corrupted version of it, L?rm.

I don't understand why this happens. Where does the mistake occur if the right word was passed to the method, and the same method returns the correct translation ('Noise') when run inside the IDE?

parsecer
  • Have you tried URL encoding the text before adding it to the URL - `java.net.URLEncoder.encode(text)`? I believe it is more appropriate, even though it happens to work when run from inside the IDE. – Nikos Paraskevopoulos Feb 12 '19 at 10:18
  • @Nikos Paraskevopoulos It didn't seem to help ;( – parsecer Feb 26 '19 at 14:40
  • @Nikos Paraskevopoulos Another version of this method [from this answer](https://stackoverflow.com/a/213519/4759176) has worked. If you post your answer, I'll accept it. – parsecer Feb 26 '19 at 14:52
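
For reference, the encoding step discussed in the comments above might look like the sketch below. It uses the two-argument `URLEncoder.encode(String, String)` with an explicit UTF-8 charset, so the result does not depend on the JVM's default encoding. `TranslateUrlBuilder`, `encodeParam`, and `buildTranslateLink` are hypothetical names introduced only for this illustration; they are not part of the original `TestCracker` class, and this is not necessarily the exact variant that ended up working for the asker.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the original TestCracker.
final class TranslateUrlBuilder {
    private static final String BASE =
            "https://translate.yandex.net/api/v1.5/tr.json/translate";

    // Percent-encodes a query parameter value as UTF-8, independent of the
    // platform default charset.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    // Builds the same link as testInput, but with encoded parameter values.
    static String buildTranslateLink(String accessToken, String text) {
        return BASE + "?key=" + encodeParam(accessToken)
                + "&text=" + encodeParam(text)
                + "&lang=de-en" + "&format=plain" + "&options=1";
    }

    public static void main(String[] args) {
        // "L\u00e4rm" is "Lärm", written with an escape so the source file's
        // own encoding does not matter.
        System.out.println(buildTranslateLink("XXXXXXXXXX", "L\u00e4rm"));
    }
}
```

With this, the `ä` in the request URL is sent as `%C3%A4` whether the code runs from the IDE or inside the servlet container.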

1 Answer


If you are using Tomcat and the parameters are passed in the URL (GET), then URIEncoding has to be set properly. This must be done in server.xml, where the connector is defined:

<Server port="8005" shutdown="SHUTDOWN">
    <Service name="Catalina">
        <Connector URIEncoding="UTF-8" port="8080"/>
        <Engine defaultHost="localhost" name="Catalina">
            <Host appBase="webapps" name="localhost"/>
        </Engine>
    </Service>
</Server>

Alternatively, if you don't want to change the server settings, re-decode the parameter with an explicit encoding when you read it, like this:

response.getWriter().println(
        new TestCracker().testInput(
                new String(request.getParameter("input-text2").getBytes(), "UTF-8")));

response.getWriter().print() is able to print the character correctly on its own, which is why the echoed input in the output looks right.

The first approach is better, as it solves the issue for the whole application.
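
To illustrate what the second approach relies on, here is a minimal sketch of the re-decoding round trip. It assumes the container decoded UTF-8 bytes as ISO-8859-1, which is an assumption made for this example and may not match your setup. Unlike the snippet above, which calls `getBytes()` with the platform default charset, this sketch names both charsets explicitly. `ParameterRedecodeDemo` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only shows the byte-level round trip.
final class ParameterRedecodeDemo {

    // Simulates a container that reads UTF-8 bytes as ISO-8859-1
    // (an assumption made for this sketch).
    static String misdecodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Recovers the original text: ISO-8859-1 maps every byte to exactly one
    // char, so the original UTF-8 bytes can be restored and decoded again.
    static String redecodeAsUtf8(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "L\u00e4rm"; // "Lärm"
        String misdecoded = misdecodeAsLatin1(original);
        String repaired = redecodeAsUtf8(misdecoded);
        System.out.println(repaired.equals(original)); // prints "true"
    }
}
```

The round trip only repairs text that was mis-decoded in exactly this way. If the parameter was already decoded correctly, re-decoding it would corrupt it instead, which is one more reason to prefer fixing the encoding at the server level.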

hakamairi
Kris