
I am trying to read a URL that returns a string. I store that string in a variable and print it on my web page using JSP. When I print the string on the page, some of the characters come out as junk. How can I get the original string?

Here is my jsp code:

Market.jsp

<%@page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" %>
<%@page import="java.io.*, java.net.*, java.util.*" %>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>JSP Page</title>
</head>
<body>

<%

    URL url;
    ArrayList<String> list1 = new ArrayList<String>();
    ArrayList<String> list2 = new ArrayList<String>();
    List commodity1 = null;
    List price1 = null;
    int c, p = 0;
    try {
        // get URL content

        String a = "http://122.160.81.37:8080/mandim/MarketWise?m=agra";
        url = new URL(a);
        URLConnection conn = url.openConnection();
        // open the stream and put it into BufferedReader
        BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream()));

        StringBuffer sb = new StringBuffer();
        String inputLine;
        while ((inputLine = br.readLine()) != null) {
            System.out.println(inputLine);
            //  sb.append(inputLine);
            String s = inputLine.replace("|", "\n");
            s = s.replace("~", " ");
            StringTokenizer str = new StringTokenizer(s);
            while (str.hasMoreTokens())
            {
                String mandi = str.nextElement().toString();
                String price = str.nextElement().toString();
                list1.add(mandi);
                list2.add(price);
            }
        }
        commodity1 = list1.subList(0, 10);

        // commodity10=list1.subList(90,100);
        price1 = list2.subList(0, 10);

        int c1 = 0;
        int p1 = 0;
        for (c1 = 0, p1 = 0; c1 < commodity1.size() && p1 < price1.size(); c1++, p1++) {
            String x = (String) commodity1.get(c1);
            String y = (String) price1.get(p1);
            out.println(x);
            out.println(y);
        }

        br.close();

        //System.out.println(sb);

    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
%>


</body>
</html>

I am getting the following output:

धान 1325 चावल 2050 ज�वर 920 जौ 810 मकई 1280 गेहू� 1420 जो 1050 बेजर - जय 800 उड़द 3600

How can I achieve my desired goal?

Thanks in advance

yunzen
user3585120
  • Can you post the output you got from the above code? – Santhosh May 02 '14 at 07:16
  • I have posted the output I am getting. – user3585120 May 02 '14 at 07:18
  • I think your question is more about parsing a string than about receiving data from the URL. Possibly it's a charset issue. Do you have a Linux box? Then I would use wget to analyze the source data. On Windows, try to open it in a web browser and save it as a file. Publish this file if it's not confidential. – Daniel Alder May 02 '14 at 07:26
  • This is how it looks when I check the original text: `धान~1325|चावल~2050|ज्वर~920|जौ~810|मकई~1280|गेहूँ~1420|` Obviously there is some encoding problem _on the server side_ - or possibly you are expecting Asian product names? – Daniel Alder May 02 '14 at 07:30
  • These are Asian product names that I am trying to access. – user3585120 May 02 '14 at 07:32
  • How can I get this string on my JSP page? – user3585120 May 02 '14 at 07:35
  • I put the code in a pure Java program and ran it (on Linux), and everything looks fine. Do your `out.println(x);` calls write to the console or to the browser? If it's the console, it may simply not work with a Western charset. If your code writes to the web page, you need to check the page encoding and make sure the output is in the same encoding. You probably also need some kind of escaping; it may be a good idea anyway to use `${yourvariable}` semantics for output. Unfortunately, I don't have a JSP environment. – Daniel Alder May 02 '14 at 07:45
  • @Daniel Alder When I put it in a pure Java program I also get everything fine; the problem is only with the web page. – user3585120 May 02 '14 at 07:51
  • Did you already check out this link? http://stackoverflow.com/questions/17919998/escape-all-strings-in-jsp-spring-mvc – Daniel Alder May 02 '14 at 07:52
  • Sorry, actually I am not familiar with JSTL. – user3585120 May 02 '14 at 08:00
  • @user3585120 Then I can't help you. You need to do something like this, but I don't know how. The second answer looks very helpful to me, but there might also be better links... – Daniel Alder May 02 '14 at 08:06

1 Answer


I think it's an encoding issue on your system. I don't know JSP well enough to tell you exactly what, but when I run your code as a pure Java application on Linux, with out.println() changed to System.out.println(), I see the output as expected. (Side note: the product names are Asian names, so don't be as surprised as I was. "Expected" here means that the characters are the same as when I fetch the URL with wget.)
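One thing that could explain why the same code behaves differently on two machines: `new InputStreamReader(stream)` without a charset argument decodes with the platform default charset, which before Java 18 depends on the operating system and locale. The sketch below is only an illustration of that effect, not a diagnosis of your server. The class name `CharsetCheck` is made up for this example, and it assumes the remote service sends UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the question's code.
final class CharsetCheck {

    // Decodes raw bytes with the given charset, which is what
    // InputStreamReader does internally with its (explicit or default) charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The charset used when no charset is passed to InputStreamReader.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // One of the product names from the feed, written with \\u escapes so
        // the source file encoding does not matter; encoded here as UTF-8.
        byte[] raw = "\u091c\u094d\u0935\u0930".getBytes(StandardCharsets.UTF_8);

        // Decoded as UTF-8 the name has 4 chars; decoded with a single-byte
        // charset every byte becomes its own (wrong) character.
        System.out.println(decode(raw, StandardCharsets.UTF_8).length());
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1).length());
    }
}
```

If the default charset printed on the machine that runs the JSP is not UTF-8, the reader in the question is decoding the response with the wrong charset.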

This means your code is fine: it loads what you want. The problem is the presentation. HTML pages have their own encoding. I guess JSP handles this transparently (here I need outside input on how it does so), but the result must fall into one of these three cases:

  • your page has a Western encoding and cannot represent Asian characters directly. In this case your strings need to be written as numeric character references, like this: &#8472; or &#x2118;
  • your page is UTF-8 (or another Unicode encoding) and supports these characters directly
  • even on UTF-8 encoded pages you can use numeric character references as in the first case

Whatever you choose, your output must match the page's encoding. This also means that your code needs to know the selected character set, and I'm sure JSP does. If you want to use the built-in escaping, you need to find a function for it; have a look at "Escape all strings in JSP/Spring MVC". This cannot be too hard.
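The same matching rule applies on the input side: the bytes read from the URL have to be decoded with the charset the server actually used. Here is a minimal sketch of reading the feed with an explicit charset and splitting the `name~price|name~price|` format seen in the comments. It assumes the feed is UTF-8; `MandiFeed` and `parse` are names invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the names are not from the question.
final class MandiFeed {

    // Reads the stream as UTF-8 (an assumption about the server) and parses
    // records of the form "name~price" separated by '|'.
    // Insertion order is kept; a repeated name overwrites the earlier price.
    static Map<String, String> parse(InputStream in) throws IOException {
        Map<String, String> prices = new LinkedHashMap<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String record : line.split("\\|")) {
                    int sep = record.lastIndexOf('~');
                    if (sep < 0) {
                        continue; // skip empty or malformed records
                    }
                    String name = record.substring(0, sep).trim();
                    String price = record.substring(sep + 1).trim();
                    prices.put(name, price);
                }
            }
        }
        return prices;
    }
}
```

In the JSP this would be called as `MandiFeed.parse(conn.getInputStream())`. Splitting on `|` and `~` instead of whitespace also keeps product names that contain spaces in one piece.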

Only if you are really desperate and don't know how else to do it, use something like this function (it's a hack!) to encode your string:

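// Hack: replaces every non-ASCII character with an HTML numeric character
// reference such as &#x927;. It does not escape <, > or &.
// In a JSP, declare it inside a <%! ... %> declaration block.
private String encode(String str) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); ) {
        // Work on code points so that surrogate pairs yield one valid reference
        int cp = str.codePointAt(i);
        i += Character.charCount(cp);
        if (cp < 128) {
            sb.append((char) cp);
        } else {
            sb.append("&#x").append(Integer.toHexString(cp)).append(';');
        }
    }
    return sb.toString();
}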
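Since the function above passes `&`, `<`, `>` and `"` through unchanged, text from the remote service could still break the page markup. A possible variant that escapes those as well is sketched below. `HtmlText.escape` is a name made up for this example, and it is not a replacement for the escaping that JSTL or a framework provides.

```java
// Hypothetical helper class, shown only as a sketch.
final class HtmlText {

    // Escapes HTML special characters and writes every non-ASCII code point
    // as a hexadecimal numeric character reference.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:
                    if (cp < 128) {
                        sb.append((char) cp);
                    } else {
                        sb.append("&#x").append(Integer.toHexString(cp)).append(';');
                    }
            }
        }
        return sb.toString();
    }
}
```

In the loop from the question the output calls would then become `out.println(HtmlText.escape(x));` and `out.println(HtmlText.escape(y));`.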
Daniel Alder