
I've noticed that Java's UriBuilder isn't encoding the : characters included in my query parameter values (ISO 8601-formatted strings).

According to Wikipedia, it seems colons should be encoded.

In particular, encoding the query string uses the following rules:

  • Letters (A-Z and a-z), numbers (0-9) and the characters '.','-','~' and '_' are left as-is
  • SPACE is encoded as '+' or %20[citation needed]
  • All other characters are encoded as %FF hex representation with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)
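For illustration, the JDK's `java.net.URLEncoder` applies roughly these form-encoding rules (it leaves `*` alone and escapes `~`, so it is not an exact match for the list above), and it does escape the colon. This is a minimal sketch, assuming Java 10+ for the `Charset` overload; the class name `FormEncodeDemo` and the sample timestamp are made up for the example:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class FormEncodeDemo {
    // Encodes one query parameter value with application/x-www-form-urlencoded rules.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical ISO 8601 timestamp used as a query parameter value.
        System.out.println(formEncode("2012-12-05T15:25:00Z")); // 2012-12-05T15%3A25%3A00Z
        // A space becomes '+' under these rules.
        System.out.println(formEncode("a b"));                  // a+b
    }
}
```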

So, what's the deal? Should colons in query parameters be encoded or not?


Update:

I looked up the URI Syntax spec (RFC 3986) and it looks like encoding colons in query params really isn't necessary. Here's an excerpt from the ABNF for URI:

URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
query       = *( pchar / "/" / "?" )
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
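
To make the grammar concrete, here is a rough sketch that transcribes the `query` production into a character check and then confirms that the JDK's `java.net.URI` parses a query containing raw colons. The class name `QueryCharDemo`, the method `isLiteralQueryChar`, and the example URL are hypothetical. It only checks single literal characters and does not validate `pct-encoded` triplets:

```java
import java.net.URI;

class QueryCharDemo {
    private static final String UNRESERVED_PUNCT = "-._~";
    private static final String SUB_DELIMS = "!$&'()*+,;=";

    // True if c may appear literally in a query: query = *( pchar / "/" / "?" ).
    // '%' is excluded because it is only valid as the start of a pct-encoded triplet.
    static boolean isLiteralQueryChar(char c) {
        boolean alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        boolean digit = c >= '0' && c <= '9';
        return alpha || digit
                || UNRESERVED_PUNCT.indexOf(c) >= 0
                || SUB_DELIMS.indexOf(c) >= 0
                || c == ':' || c == '@' || c == '/' || c == '?';
    }

    public static void main(String[] args) {
        System.out.println(isLiteralQueryChar(':')); // true: pchar allows ":"
        System.out.println(isLiteralQueryChar(' ')); // false: a space must be encoded

        // java.net.URI accepts the unencoded colons and returns the query unchanged.
        URI uri = URI.create("http://example.com/events?since=2012-12-05T15:25:00Z");
        System.out.println(uri.getRawQuery());       // since=2012-12-05T15:25:00Z
    }
}
```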
HolySamosa
  • Correct me if I'm wrong, but per your link ":" is a reserved gen-delim and "any [gen-delims] are also in the reserved set are 'reserved' for use as subcomponent delimiters within the component" (ie sub-delims) – Enrico Dec 05 '12 at 16:21
  • I'm just interpreting the ABNF, which allows ':' as part of the query string. This also matches the behavior of Java's UriBuilder as well as some code I tested on .NET. Still, it's confusing: as you point out, the text suggests that it should behave differently. – HolySamosa Dec 06 '12 at 20:44

2 Answers


Yes, they should be encoded in a query string. The correct encoding is %3A.

However, I can understand why UriBuilder isn't encoding :. You don't want to encode the colon after the protocol (e.g. http:) or between the username and password (e.g. ftp://username:password@domain.com) in an absolute URI.

Enrico
  • Thanks, Enrico. It seems that the sources conflict. While Wikipedia (and others) say that colons should be encoded, the ABNF in the URI syntax spec suggests that they don't need to be. See the update to my question. – HolySamosa Dec 05 '12 at 15:25
  • This is technically incorrect. Colons are allowed as-is in the query. See https://stackoverflow.com/a/5330261/125562 – Basilevs Apr 26 '21 at 16:19

There's no UriBuilder in the Java SDK; it is defined by JAX-RS. Its documentation states that query parameters should be URL-encoded (form-encoded), while other components are encoded using RFC 3986:

Builder methods perform contextual encoding of characters not permitted in the corresponding URI component following the rules of the application/x-www-form-urlencoded media type for query parameters and RFC 3986 for all other components

However, the Jersey implementation of JAX-RS doesn't follow this spec and encodes everything according to RFC 3986. It is a bug; see the JIRA ticket.

andras
  • Interestingly, `UriBuilder.queryParam(name, value)` seems to use different encoding rules than `UriBuilder.replaceQuery(query)`: the former encodes the `':'` character, while the latter does not, at least in RESTEasy 3.0.7.Final. Is this intended behavior, and do you have an explanation for the difference? – Garret Wilson Sep 16 '14 at 20:22