146

In a URL, should I encode spaces as %20 or +? For example, which of the following is correct?

www.mydomain.com?type=xbox%20360
www.mydomain.com?type=xbox+360

Our company is leaning toward the former, but calling the Java method URLEncoder.encode(String, String) with "xbox 360" (and "UTF-8") returns the latter.
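To make the observation concrete, here is a minimal sketch that reproduces it. The class and method names (`FormEncodingDemo`, `formEncode`) are made up for illustration; only `URLEncoder.encode(String, String)` comes from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormEncodingDemo {
    // Hypothetical helper: encodes a value the way URLEncoder does,
    // i.e. as application/x-www-form-urlencoded.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("xbox 360")); // prints xbox+360
    }
}
```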

So, what's the difference?

Cole Tobin
MegaByter
    For the benefit of .NET developers: HttpUtility.UrlPathEncode uses '%20', whereas HttpUtility.UrlEncode uses '+'. Source: http://msdn.microsoft.com/en-us/library/system.web.httputility.urlpathencode(v=vs.110).aspx – CodeToad Sep 01 '14 at 11:03
    @MegaByter I think it would be more technically correct to phrase the question as "Should I encode spaces using %20 or + *in the query part of a URL*?" because, while your example has spaces only in the query part, it might not be clear to all readers that the answer depends on where in the URL the space appears. Alternatively, you could word the question as "In *the specific URL examples below*, should I encode..." – Matthew May 31 '15 at 01:06

5 Answers

134

Form data (for GET or POST) is usually encoded as application/x-www-form-urlencoded: this specifies + for spaces.

URLs are percent-encoded according to RFC 1738 (whose generic syntax has since been superseded by RFC 3986), which encodes a space as %20.

In theory I think you should have %20 before the ? and + after:

example.com/foo%20bar?foo+bar
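A minimal Java sketch of that split, assuming the path and the query value are encoded separately. The class and helper names (`PathAndQueryDemo`, `encodePath`, `encodeQuery`, `build`) are hypothetical; the path goes through the multi-argument `java.net.URI` constructor, which quotes illegal characters, and the query value goes through `URLEncoder`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class PathAndQueryDemo {
    // Percent-encodes a path; a space is illegal in a URI path,
    // so the URI constructor quotes it as %20.
    static String encodePath(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Form-encodes a query value; a space becomes +.
    static String encodeQuery(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Joins host, encoded path, and encoded query into one string.
    static String build(String host, String path, String query) {
        return host + encodePath(path) + "?" + encodeQuery(query);
    }

    public static void main(String[] args) {
        // prints example.com/foo%20bar?foo+bar
        System.out.println(build("example.com", "/foo bar", "foo bar"));
    }
}
```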
Greg
    Except in email links, because using +es after the ? will result in emails opening with +es still in there. So: `mailto:support@example.org?subject=I%20need%20help` – Sygmoral Feb 19 '15 at 00:33
56

According to the W3C (and they are the official source on these things), a space character in the query string (and in the query string only) may be encoded as either "%20" or "+". From the section "Query strings" under "Recommendations":

Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded. This method was used to make query URIs easier to pass in systems which did not allow spaces.
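As a rough illustration of that rule in Java (the class and method names here are hypothetical), `URLDecoder` treats both forms as a space when decoding a query value, and `URLEncoder` escapes a literal plus sign as %2B so that it survives the round trip:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusDecodingDemo {
    // Decodes a form-encoded query value: + and %20 both yield a space.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Form-encodes a value: a literal plus sign is escaped as %2B.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("xbox+360"));   // prints xbox 360
        System.out.println(decode("xbox%20360")); // prints xbox 360
        System.out.println(encode("C++"));        // prints C%2B%2B
    }
}
```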

According to section 3.4 of RFC 2396, the official specification for URIs in general (since obsoleted by RFC 3986), the "query" component is URL-dependent:

3.4. Query Component

The query component is a string of information to be interpreted by the resource.

   query         = *uric

Within a query component, the characters ";", "/", "?", ":", "@", "&", "=", "+", ",", and "$" are reserved.

It is therefore a bug in the receiving software if it does not accept URLs in which spaces in the query string are encoded as "+" characters.

As for the third part of your question, one way (though slightly ugly) to fix the output from URLEncoder.encode() is to then call replaceAll("\\+","%20") on the return value.
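A minimal sketch of that workaround, with hypothetical names (`PercentSpaceDemo`, `encodeWithPercent20`). It uses `String.replace` rather than `replaceAll`, which avoids the regex escaping. The replacement appears safe because `URLEncoder` has already turned any literal plus sign in the input into %2B, so every remaining "+" stands for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PercentSpaceDemo {
    // Form-encodes the value, then rewrites each + (an encoded space) as %20.
    static String encodeWithPercent20(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithPercent20("xbox 360"));
        // prints xbox%20360
        System.out.println(encodeWithPercent20("xbox 360 + kinect"));
        // prints xbox%20360%20%2B%20kinect
    }
}
```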

Adam Batkin
    Instead of using URLEncoder, which encodes to application/x-www-form-urlencoded, use java.net.URI, which applies true percent-encoding. – Su Zhang Mar 25 '14 at 18:04
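A sketch of the `java.net.URI` approach from the comment above, assuming an `http` scheme and a root path `/` that are not in the original example; the class and method names are hypothetical. The five-argument constructor takes scheme, authority, path, query, and fragment, and quotes characters that are illegal in each component.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQueryDemo {
    // Builds a URL from unencoded parts; the URI constructor quotes
    // illegal characters, so a space in the query becomes %20.
    static String buildUrl(String host, String path, String query) {
        try {
            return new URI("http", host, path, query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.mydomain.com/?type=xbox%20360
        System.out.println(buildUrl("www.mydomain.com", "/", "type=xbox 360"));
    }
}
```

One caveat with this design: the constructor treats the query as a whole, so characters that are legal there, such as "&", "=", and "+", are left as-is. A value that contains them would still need to be escaped separately.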
8

It shouldn't matter, any more than if you encoded the letter A as %41.

However, if you're dealing with a system that doesn't recognize one form, it seems like you're just going to have to give it what it expects regardless of what the "spec" says.

Gary McGill
5

In the query string you can use either, which means most people opt for "+" because it is more human-readable.

Fenton
0

When encoding query values, either form, plus or percent-20, is valid; however, since the bandwidth of the internet isn't infinite, you should use plus, since it's two fewer bytes.

BenGoldberg