39

When debugging in ASP.NET MVC, I don't see a difference between:

http://mysite.com?q=hi,bye

and

http://mysite.com?q=hi%2Cbye

The querystring param "q" always has a value of "hi,bye".

So why is the comma encoded?

I want to do something like this https://stackoverflow.com/a/752109/173957.

I have this form:

<form method="GET" action="/Search">
     <input type="hidden" name="q" value="hi,bye"/>
     <input type="submit" value="ok"/>
</form>

How can I prevent this value from being encoded?

Scott Coates
  • 2,462
  • 5
  • 31
  • 40
  • 3
    *Why* do you want to prevent it from being encoded? ASP.NET will automatically decode it for you, so what's the problem? – Jon Jan 12 '12 at 00:35
  • 6
    I guess ?q=hi,bye is a little more readable than ?q=hi%2Cbye. Also, I'm mostly just curious. – Scott Coates Jan 12 '12 at 00:54
  • Years ago, I explicitly used a comma in my query string value for the _specific_ reason it was _not encoded_, and thus easily readable in the address bar. A shame that some libraries/browsers now encode it. – Toddius Zho Aug 12 '15 at 02:55
  • Possible duplicate of [What's valid and what's not in a URI query?](http://stackoverflow.com/questions/2366260/whats-valid-and-whats-not-in-a-uri-query) – Ian Kemp Jan 13 '17 at 12:14

4 Answers

21

The URI spec, RFC 3986, specifies that URI path components must not contain unencoded reserved characters, and the comma is one of the reserved characters. For sub-delims such as the comma, leaving it unencoded risks the character being treated as separator syntax in the URI scheme. Percent-encoding it guarantees that the character will be passed through as data.

Kyle Jones
  • 5,492
  • 1
  • 21
  • 30
  • 43
    In the question, the comma is not in the URI path component, but in the URI query component, which, according to RFC 3986, may contain sub-delims, which include the comma. – Nick Russo Feb 06 '15 at 05:53
  • 2
    If I am reading the spec correctly: `path = path-absolute` => `path-absolute = "/" [ segment-nz *( "/" segment ) ]` => `segment = *pchar` => `pchar = unreserved / pct-encoded / sub-delims / ":" / "@"` => `sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="`. So a comma is a valid in a URI segment, query, or fragment. – joeyhoer Dec 22 '15 at 02:14
  • @joeyhoer `So a comma is a valid in a URI segment`, probably you mean `invalid` – Webber Nov 12 '20 at 11:16
  • 1
    @Webber: No: a comma is **valid** in a segment, because `segment`s are made up of `pchar`s (path characters), `pchar`s may include `sub-delims`, and `sub-delims` include commas. – wchargin Apr 15 '21 at 22:34
  • Updated link to the URI spec: https://datatracker.ietf.org/doc/html/rfc3986 – Jordan Schnur Jul 21 '21 at 16:46
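
To make the grammar quoted in the comments above concrete, here is a rough sketch that transcribes the `pchar` and `query` rules from RFC 3986 into regular expressions and checks the two query strings from the question. The names `QUERY_RE` and `SEGMENT_RE` are made up for this illustration, and the patterns only cover the rules mentioned in this thread, not the whole URI grammar.

```javascript
// Character classes transcribed from RFC 3986 (illustrative names, not a library API).
const UNRESERVED = "A-Za-z0-9\\-._~";
const SUB_DELIMS = "!$&'()*+,;="; // note: includes the comma

// pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
const PCHAR = `(?:[${UNRESERVED}${SUB_DELIMS}:@]|%[0-9A-Fa-f]{2})`;

// segment = *pchar
const SEGMENT_RE = new RegExp(`^${PCHAR}*$`);

// query = *( pchar / "/" / "?" )
const QUERY_RE = new RegExp(`^(?:${PCHAR}|[/?])*$`);

console.log(QUERY_RE.test("q=hi,bye"));    // literal comma matches the query rule
console.log(QUERY_RE.test("q=hi%2Cbye"));  // percent-encoded comma matches too
console.log(SEGMENT_RE.test("hi,bye"));    // comma also matches the segment rule
console.log(QUERY_RE.test("q=hi bye"));    // a raw space does not match
```

Under these rules both forms of the query are syntactically acceptable, which is the point the commenters are making.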
9

I found this list of characters that do not require URL encoding: http://web.archive.org/web/20131212154213/http://urldecoderonline.com/url-allowed-characters.htm

Update
Since the original link broke, I used archive.org to retrieve the following text from the page as it appeared in December 2013:

List of allowed URL characters

Unreserved - May be encoded but it is not necessary

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~

Reserved - Have to be encoded sometimes

! * ' ( ) ; : @ & = + $ , / ? % # [ ]
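
As a quick way to see the "have to be encoded sometimes" part in practice, here is a small sketch using JavaScript's built-in `encodeURIComponent` and `encodeURI`. The variable names are just for illustration. The first function treats its input as a single component and escapes the comma, while the second treats it as a whole URI and leaves the comma alone.

```javascript
const value = "hi,bye";

// Treats the input as one component: reserved characters such as "," are escaped.
const asComponent = encodeURIComponent(value);

// Treats the input as a complete URI: reserved characters such as "," are kept.
const asWholeUri = encodeURI(value);

// Unreserved characters are left alone by both functions.
const unreservedSample = encodeURIComponent("AZaz09-_.~");

console.log(asComponent);      // hi%2Cbye
console.log(asWholeUri);       // hi,bye
console.log(unreservedSample); // AZaz09-_.~
```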
Alex
  • 9,250
  • 11
  • 70
  • 81
2

This is really browser dependent. The browser takes the HTML form and decides how to build the URL based on the form's inputs.

If you're using a really old (or poorly programmed) browser, it may not encode the comma. If you adhere to RFC standards, it really should be encoded.

If you want to prevent the comma from being encoded for all browsers, you would have to use JavaScript and build the URL yourself.

<script type="text/javascript">
    document.location.href = "/Search?q=hi,bye";
</script>
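
If the value is not hard-coded, one possible approach is to encode each parameter and then put the commas back. The sketch below is only an illustration: `buildQuery` is a hypothetical helper, not a browser or ASP.NET API, and the `tag` parameter is invented to show that other reserved characters still get escaped.

```javascript
// Hypothetical helper: builds a query string but keeps commas readable.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => {
      // Encode everything, then restore literal commas in the value.
      const encodedValue = encodeURIComponent(value).replace(/%2C/g, ",");
      return `${encodeURIComponent(key)}=${encodedValue}`;
    })
    .join("&");
}

const searchUrl = "/Search?" + buildQuery({ q: "hi,bye", tag: "a&b" });
console.log(searchUrl); // /Search?q=hi,bye&tag=a%26b

// In a browser you could then navigate with:
// document.location.href = searchUrl;
```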

In any case, it shouldn't matter, because you should be decoding the querystring parameters anyway, and the result will be the same.
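
To illustrate that point, here is a minimal sketch using the standard `URLSearchParams` class in JavaScript rather than ASP.NET itself (the server-side decoding in the question is assumed to behave equivalently for this case). Both forms of the query decode to the same value, and the form-style serializer is what produces `%2C` in the first place.

```javascript
// Both spellings of the query decode to the same parameter value.
const fromPlain = new URLSearchParams("q=hi,bye").get("q");
const fromEncoded = new URLSearchParams("q=hi%2Cbye").get("q");

// Serializing with form-urlencoded rules escapes the comma.
const serialized = new URLSearchParams({ q: "hi,bye" }).toString();

console.log(fromPlain);   // hi,bye
console.log(fromEncoded); // hi,bye
console.log(serialized);  // q=hi%2Cbye
```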

JackAce
  • 1,407
  • 15
  • 32
-1

There are several characters that hold special meaning (like + ? # etc.) or are directly not allowed (like space, comma, etc.) in a URL. To use such characters in a URL, you need to encode and decode them. Read more here.

ASP.NET automatically encodes and decodes all required characters like this, so you need not worry about them.

PC.
  • 6,870
  • 5
  • 36
  • 71
  • 1
    But it doesn't really make sense that the comma is encoded. Even in the link you provide, the comma is not mentioned as an illegal character. Even in the try-it-out part of the link you provided, "hi,bye" is not any different after encoding it. – Scott Coates Jan 12 '12 at 01:01
  • 2
    Comma has special meaning in URLs, because it denotes segment parameters. See [this](http://en.wikipedia.org/wiki/URI_scheme#Official_IANA-registered_schemes) link. Look for data, geo and ldap schemes – PC. Jan 12 '12 at 11:07