Yes, while it will work for "simple" cases, calling URLDecoder.decode on an unencoded URL that contains certain special characters can lead to a) exceptions or b) unexpected behaviour.
Consider the following example. The third test throws a java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern, and the second test silently alters the URL without any exception (while the regular encode/decode round trip works without issues):
    import java.net.URLDecoder;
    import java.net.URLEncoder;

    public class Test {
        public static void main(String[] args) throws Exception {
            test("http://www.foo.bar/");
            test("http://www.foo.bar/?q=a+b");
            test("http://www.foo.bar/?q=äöüß%"); // Will throw exception
        }

        private static void test(String url) throws Exception {
            String encoded = URLEncoder.encode(url, "UTF-8");
            String decoded = URLDecoder.decode(encoded, "UTF-8");
            System.out.println("encoded: " + encoded);
            System.out.println("decoded: " + decoded);
            System.out.println(URLDecoder.decode(decoded, "UTF-8"));
        }
    }
Output (notice how the + sign disappears in the second test):
encoded: http%3A%2F%2Fwww.foo.bar%2F
decoded: http://www.foo.bar/
http://www.foo.bar/
encoded: http%3A%2F%2Fwww.foo.bar%2F%3Fq%3Da%2Bb
decoded: http://www.foo.bar/?q=a+b
http://www.foo.bar/?q=a b
encoded: http%3A%2F%2Fwww.foo.bar%2F%3Fq%3D%C3%A4%C3%B6%C3%BC%C3%9F%25
decoded: http://www.foo.bar/?q=äöüß%
Exception in thread "main" java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern
at java.net.URLDecoder.decode(Unknown Source)
at Test.test(Test.java:16)
See the javadoc of URLDecoder for the two cases as well:
- The plus sign "+" is converted into a space character " " .
- A sequence of the form "%xy" will be treated as representing a byte where xy is the two-digit hexadecimal representation of the 8 bits.
Then, all substrings that contain one or more of these byte sequences
consecutively will be replaced by the character(s) whose encoding
would result in those consecutive bytes. The encoding scheme used to
decode these characters may be specified, or if unspecified, the
default encoding of the platform will be used.
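To see the two rules in isolation, here is a minimal sketch. The class name DecodeRules and the helper decodeUtf8 are made up for this illustration; only URLDecoder.decode itself comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeRules {

    // Hypothetical convenience wrapper (not part of the JDK): UTF-8 is always
    // available, so the checked exception is rethrown as an unchecked one.
    static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Rule 1: a literal plus sign is turned into a space.
        System.out.println(decodeUtf8("a+b"));
        // A plus sign only survives decoding if it was escaped as %2B.
        System.out.println(decodeUtf8("a%2Bb"));
        // Rule 2: consecutive %xy sequences are read as bytes and decoded
        // together; here two bytes form a single UTF-8 character.
        System.out.println(decodeUtf8("%C3%A4"));
    }
}
```

So an unencoded URL is only left untouched if it contains neither a plus sign nor anything that looks like a %xy escape.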
If you are sure that your unencoded URLs do not contain + or %, then I'd say it's safe to call URLDecoder.decode. Otherwise I'd advise implementing additional checks, e.g. try to decode and compare with the original (cf. this question on SO).
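A rough sketch of such a "decode and compare" check could look like this. The class SafeDecode and both method names are hypothetical, and treating a failed decode as "not encoded" is an assumption of this sketch, not something URLDecoder prescribes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class SafeDecode {

    // Hypothetical helper: returns the decoded string, or the input unchanged
    // if URLDecoder rejects it (e.g. because of a stray trailing '%').
    static String decodeOrOriginal(String url) {
        try {
            return URLDecoder.decode(url, "UTF-8");
        } catch (IllegalArgumentException | UnsupportedEncodingException e) {
            return url;
        }
    }

    // Hypothetical heuristic: true if decoding alters the string.
    static boolean changedByDecoding(String url) {
        return !decodeOrOriginal(url).equals(url);
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.foo.bar/",
            "http://www.foo.bar/?q=a+b",
            "http://www.foo.bar/?q=100%",
            "http%3A%2F%2Fwww.foo.bar%2F"
        };
        for (String s : samples) {
            System.out.println(s + " -> changed by decoding: " + changedByDecoding(s));
        }
    }
}
```

Note that this is only a heuristic: the comparison cannot tell a literal + in an unencoded URL from an encoded space, so the second sample is still reported as changed. It does avoid the exception for the third sample, though.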