19

I see that java.net.URLDecoder.decode(String) is deprecated in Java 6.

I have the following String:

String url = "http://172.20.4.60/jsfweb/cat/%D7%9C%D7%97%D7%9E%D7%99%D7%9D_%D7%A8%D7%92%D7%99%D7%9C%D7%99%D7%9";

How should I decode it in Java 6?

BalusC
danny.lesnik

5 Answers

58

You should use java.net.URI for this. The URLDecoder class performs application/x-www-form-urlencoded decoding, which is the wrong scheme here: despite its name, it is meant for form data, not for URLs themselves.
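A minimal sketch of that approach, assuming a complete, well-formed URL (the one in the question is cut off, so the example.com address and the class and method names here are hypothetical). URI.getPath() returns the path with escaped octets decoded as UTF-8 and leaves a literal "+" alone:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriPathDecodeSketch {

    // Parses the string as a URI and returns its decoded path component.
    // URI interprets %XX sequences in the path as UTF-8 octets.
    static String decodedPath(String url) throws URISyntaxException {
        return new URI(url).getPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Hypothetical URL with a percent-encoded Hebrew segment and a literal '+'.
        String url = "http://example.com/cat/%D7%9C%D7%97%D7%9E%D7%99%D7%9D_a+b";
        System.out.println(decodedPath(url));
    }
}
```

Note that new URI(...) throws URISyntaxException for malformed input, so a truncated escape such as a trailing "%9" has to be handled by the caller.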

Draemon
  • 6
    @whoever downvoted: care to elaborate on which part of this is wrong? – Draemon Feb 17 '12 at 00:14
  • 3
    This is the correct answer! This trips people up all the time. URLEncoder/URLDecoder encode and decode form data *for* URLs, not URLs themselves. The URL class provides the encoding and decoding of the URL itself. And the URI class is an updated, better specified, more general API -- every URL string is also a URI string, so use URI for parsing duties. The URL class itself warns against confusing the use of URLEncoder/Decoder: "The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396." – Bob Kerns Oct 23 '12 at 17:02
  • 2
    java.net.URI.decode() is private now – Azee Feb 20 '14 at 16:03
  • 3
    The *media*-type `application/x-www-form-urlencoded` refers to the encoding used for URL's, and the detailed rules specified by `URLDecoder` make it clear that it's perfectly valid for use in decoding a URL. So it's simpler, and probably faster to use `URLDecoder`. – Lawrence Dol Dec 04 '14 at 20:30
  • 3
    URLDecoder will replace "+" with " ", which is incorrect. "+" should only be changed to " " in the query string keys and values. – Dobes Vandermeer Feb 28 '16 at 20:12
27

Now you need to specify the character encoding of your string. Based on the information on the URLDecoder page:

Note: The World Wide Web Consortium Recommendation states that UTF-8 should be used. Not doing so may introduce incompatibilities.

The following should work for you:

java.net.URLDecoder.decode(url, "UTF-8");

Please see Draemon's answer below.
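For reference, a small sketch of what the two-argument call does with form-style input. The class name, helper name, and sample string are made up for illustration. It shows both the UTF-8 decoding and the "+" to space conversion raised in the comments, which is what makes this method a fit for query-string values rather than for paths:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodeSketch {

    // Decodes application/x-www-form-urlencoded text, reading %XX bytes as UTF-8.
    // Every '+' in the input becomes a space; an encoded plus (%2B) stays a plus.
    static String decodeForm(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Hypothetical form value: Hebrew text, a '+' meaning space, and an encoded plus.
        System.out.println(decodeForm("%D7%9C%D7%97%D7%9D+a%2Bb"));
    }
}
```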

Community
  • 4
    -1 this is just plain wrong. The documentation clearly states that this method uses application/x-www-form-urlencoded which is only used for the query string. – Draemon Feb 17 '12 at 00:13
  • -1 see my comments on @Draemon's correct answer below. – Bob Kerns Oct 23 '12 at 17:04
  • 3
    This would be the correct answer, if the question were correct! If you were using the one-arg version of decode() correctly, you should use the two-argument version. – Bob Kerns Oct 23 '12 at 17:06
  • +1 For directing users to the other answer. :) – 700 Software Feb 13 '14 at 16:14
  • 1
    This answer is in fact correct, since the form encoding referenced defers to URL encoding. The *media*-type `application/x-www-form-urlencoded` refers to the encoding used for URL's, and the detailed rules specified by `URLDecoder` make it clear that it's perfectly valid for use in decoding a URL. So it's simpler, and probably faster to use `URLDecoder`. I recommend that you unstrike this answer. – Lawrence Dol Dec 04 '14 at 20:31
7

As the documentation mentions, decode(String) is deprecated because it always uses the platform default encoding, which is often wrong.

Use the two-argument version instead. You will need to specify the encoding used in the escaped parts.
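To illustrate why the encoding matters, here is a rough sketch (class and method names are hypothetical) that decodes the same escaped bytes with two different charsets. The escape %D7%9C is one Hebrew letter in UTF-8 but two unrelated characters in ISO-8859-1:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class CharsetChoiceSketch {

    // Thin wrapper so the charset used for the escaped bytes is always explicit.
    static String decode(String escaped, String charsetName)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(escaped, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String escaped = "%D7%9C"; // the UTF-8 bytes of U+05DC

        String asUtf8 = decode(escaped, "UTF-8");
        String asLatin1 = decode(escaped, "ISO-8859-1");

        // Same input bytes, different results depending on the charset.
        System.out.println("UTF-8 length: " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```

The one-argument decode(String) behaves like the second call whenever the platform default happens to be a single-byte charset, which is why it is unreliable.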

Joachim Sauer
5

Only the decode(String) method is deprecated. You should use the decode(String, String) method to explicitly set a character encoding for decoding.

Mathias Schwarz
2

As noted by previous posters, you should use the java.net.URI class to do it:

System.out.println(String.format("Decoded URI: '%s'", new URI(url).getPath()));

What I want to note additionally is that if you have a path fragment of a URI and want to decode it separately, the same approach with the one-argument constructor works, but if you try to use the four-argument constructor it does not:

String fileName = "Map%20of%20All%20projects.pdf";
URI uri = new URI(null, null, fileName, null);
System.out.println(String.format("Not decoded URI *WTF?!?*: '%s'", uri.getPath()));

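This was tested in Oracle JDK 7. The result is counter-intuitive, but it matches the JavaDocs: the multi-argument constructors treat their arguments as unencoded text and always quote the percent character, so %20 is stored as %2520 and getPath() returns the string exactly as you passed it in.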

It could trip up people who are trying to use an approach symmetrical to encoding. As noted, for example, in the post "how to encode URL to avoid special characters in java", to encode a URI it's a good idea to construct it by passing the different URI parts separately, since different encoding rules apply to different parts:

String fileName2 = "Map of All projects.pdf";
URI uri2 = new URI(null, null, fileName2, null);
System.out.println(String.format("Encoded URI: '%s'", uri2.toASCIIString()));
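Putting the two directions together, a round trip might look like the sketch below. The class and method names are hypothetical: encoding goes through the four-argument constructor, decoding through the one-argument constructor. The last call shows what happens when an already-encoded string is fed to the four-argument constructor, where "%" itself gets quoted as "%25":

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriRoundTripSketch {

    // Encodes a raw (unescaped) relative path using the four-argument constructor.
    static String encodePath(String rawPath) throws URISyntaxException {
        return new URI(null, null, rawPath, null).toASCIIString();
    }

    // Decodes an already-escaped path using the one-argument constructor.
    static String decodePath(String encodedPath) throws URISyntaxException {
        return new URI(encodedPath).getPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        String encoded = encodePath("Map of All projects.pdf");
        System.out.println(encoded);             // Map%20of%20All%20projects.pdf
        System.out.println(decodePath(encoded)); // Map of All projects.pdf

        // Already-escaped input: the '%' is quoted again rather than decoded.
        System.out.println(encodePath("Map%20of")); // Map%2520of
    }
}
```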
Community
Dima Korobskiy