6

Consider the following code, which retrieves the response from an HTTP request and prints it. NOTE: This code works in a standard Java application. I only experience the problem described below when using the code in an Android application.

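import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RetrieveHTMLTest {

public static void main(String [] args) {
    getListing(args[0]);
}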

public static void getListing(String stringURL) {

    HttpURLConnection conn = null;
    String html = "";
    String line = null;
    BufferedReader reader = null;
    URL url = null;

    try {
        url = new URL(stringURL);

        conn = (HttpURLConnection) url.openConnection();

        conn.setConnectTimeout(6000);
        conn.setReadTimeout(6000);
        conn.setRequestMethod("GET");

        conn.connect();
        reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));

        while ((line = reader.readLine()) != null) {
            html = html + line;
        }

        System.out.println(html);

        reader.close();
        conn.disconnect();
    } catch (Exception ex) {
        ex.printStackTrace();
    } finally {

    }
}   
}

If I supply the URL: http://somehost/somepath/

The code above works fine. But if I change the URL to: http://somehost/somepath [a comment]/ the code throws a timeout exception because of the "[" and "]" characters.

If I change the URL to: http://somehost/somepath%20%5Ba%20comment%5D/ the code works fine again, because the literal "[" and "]" characters aren't present.

My question is, how do I get the URL:

http://somehost/somepath [a comment]/

into the following format:

http://somehost/somepath%20%5Ba%20comment%5D/

Also, should I continue using HttpURLConnection in Android, since it can't accept a URL with special characters? Or is it standard practice to always convert the URL before using HttpURLConnection?

William Seemann

2 Answers

14

Use the URLEncoder class:

URLEncoder.encode(value, "utf-8");

You can find more details here.

Edit: You should use this method only to encode your parameter values. DO NOT encode the entire URL. For example, if you have a URL like http://www.somesite.com?param1=value1&param2=value2, you should encode only value1 and value2 and then build the URL from the encoded versions of these values.
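
The advice in the edit above could be sketched roughly as follows. This is only an illustration: the class name QueryValueEncoding, the helper buildUrl, and the sample values are made up for the example and are not part of the original answer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryValueEncoding {

    // Hypothetical helper: encodes only the two parameter values and leaves
    // the rest of the URL (scheme, host, "?", "&", "=") untouched.
    static String buildUrl(String base, String value1, String value2) {
        try {
            return base
                    + "?param1=" + URLEncoder.encode(value1, "UTF-8")
                    + "&param2=" + URLEncoder.encode(value2, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.somesite.com?param1=a+b&param2=x%26y
        System.out.println(buildUrl("http://www.somesite.com", "a b", "x&y"));
    }
}
```

Note that URLEncoder produces form encoding, so a space in a value becomes "+". That is fine in a query string, but it is not the "%20" form needed in the path portion of a URL.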

Arnab Chakraborty
  • This method doesn't appear to work as expected: System.out.println(URLEncoder.encode("http://somehost/somepath [a comment]/", "utf-8")); prints http%3A%2F%2Fsomehost%2Fsomepath+%5Ba+comment%5D%2F, which will cause a MalformedURLException. – William Seemann Dec 06 '11 at 04:49
  • If you don't like the encoding, you can check out the different encoding schemes mentioned in the link I gave you. Your URL will vary according to your encoding scheme. For example, in utf-8 spaces are encoded as +, while in some other encoding scheme they may be encoded as %20. Both are acceptable, as long as you decode with the same scheme you used to encode. – Arnab Chakraborty Dec 06 '11 at 04:59
  • I appreciate the help. The encode method itself doesn't throw the MalformedURLException. If you try to create a new URL with the output of the encode method (using utf-8) this will cause the exception. – William Seemann Dec 06 '11 at 05:01
  • My apologies. I have updated my answer. Please check it. I had run into the same problem but had very conveniently forgotten about it. – Arnab Chakraborty Dec 06 '11 at 05:05
  • OK, re-tested using URLEncoder.encode("/somepath [a comment]/", "utf-8"); The output is "%2Fsomepath+%5Ba+comment%5D%2F". If I append this to http://somehost and try to create a URL using the string I still get a MalformedURLException. – William Seemann Dec 06 '11 at 05:14
  • In your link http://somehost/somepath [a comment]/ what does [a comment] signify? Is it a part of the url, or is it a get parameter, what is it exactly? – Arnab Chakraborty Dec 06 '11 at 05:53
  • "somehost/somepath [a comment]/" is the path portion of the URL. An example URL would be http:// 23.23.23.23.:4444/mp3/Guitar [live]/ – William Seemann Dec 06 '11 at 15:49
  • Is there a way to encode JUST characters like `(` and `[` leaving `/` and `&` and `?` untouched? – Andrew Wyld Dec 13 '12 at 12:27
  • @AndrewWyld Say your url is http://www.somesite.com?param1=value1&param2=value2, encode just the values, like this: url = "http://www.somesite.com?param1=" + URLEncoder.encode(value1, "utf-8") + "&param2=" + URLEncoder.encode(value2, "utf-8"); – Arnab Chakraborty Dec 14 '12 at 05:26
  • I don't want to encode parameters. I want to encode something like `http://thisisawebsite.com/path/identifier/image (1).jpg`. – Andrew Wyld Dec 14 '12 at 11:39
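
For the path-only case raised in the comments above, one option that may be worth trying is the multi-argument java.net.URI constructor, which quotes illegal characters in the path while leaving "/" alone. The sketch below is a suggestion, not something proposed in the thread: the class name PathQuoting and the helper quotePath are invented for the example, and it assumes the path is passed in raw (unescaped) form.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathQuoting {

    // Hypothetical helper: lets java.net.URI quote illegal path characters.
    // The path must be raw; an already escaped path would be escaped twice.
    static String quotePath(String scheme, String host, String path) {
        try {
            // URI(scheme, host, path, fragment); no fragment is used here.
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://somehost/somepath%20%5Ba%20comment%5D/
        System.out.println(quotePath("http", "somehost", "/somepath [a comment]/"));
    }
}
```

The resulting string can then be passed to new URL(...). A URL with an explicit port would need the seven-argument URI constructor instead, which takes the port as a separate int.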
2
url = URLEncoder.encode(value, "utf-8");
url = url.replaceAll("\\+", "%20");

The "+" that URLEncoder produces for a space may not be converted back to a space by the server, so replace it with "%20".
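
Applied to a whole path, the two lines above would also escape every "/" as %2F. A possible variation, sketched here under the assumption that the path is raw and that "/" only ever acts as a separator, encodes each segment separately. The class name PathSegmentEncoding and the helper encodePath are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PathSegmentEncoding {

    // Hypothetical helper: encodes each path segment on its own so that the
    // "/" separators survive, then turns the "+" emitted for spaces into "%20".
    static String encodePath(String path) {
        try {
            // limit -1 keeps trailing empty segments, preserving a trailing "/"
            String[] segments = path.split("/", -1);
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < segments.length; i++) {
                if (i > 0) {
                    out.append('/');
                }
                // A literal "+" in the input is already encoded as %2B here,
                // so the replace below only affects former spaces.
                out.append(URLEncoder.encode(segments[i], "UTF-8").replace("+", "%20"));
            }
            return out.toString();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://somehost/somepath%20%5Ba%20comment%5D/
        System.out.println("http://somehost" + encodePath("/somepath [a comment]/"));
    }
}
```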

brian