
We are planning to have URLs like

http://www.nowhere.nu/path/file;key:value,key2:value2.html

This document says the special characters must be escaped, which kind of defeats the purpose of having "nice" URLs:

http://www.blooberry.com/indexdot/html/topics/urlencoding.htm

However, testing the major browsers, it seems unnecessary to URL-encode `;`, `:`, and `,` in this position.

Can you provide a popular browser where it doesn't work, or an argument against such a scheme?

Similar question for `&` (invalid but works) versus `&amp;` (valid).

Charles
Adder
  • http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid. In what world is that a "nice" URL? I'd argue that a nice URL would be one you can verbalize without having to go character by character. The spec that that site references has since been replaced, so additional characters (like `;`) are valid. – Matt Whipple Oct 26 '12 at 12:48
  • Thanks, I looked at the answer to 1547899 already, but I'm actually more interested in the parts where the RFC says it is not valid, but still advises how it SHOULD be handled. Also, it wasn't really my idea, you know. – Adder Oct 26 '12 at 12:52

1 Answer


As mentioned in the comments, the information referenced is outdated. Even so, any questionable characters should be escaped for safety.

With URLs you have two target audiences: humans and machines. Most humans cannot efficiently process URL paths that contain anything but alphanumeric characters, `.`, and `/`. Their efficiency also degrades the further along they go past the host name. So essentially, once you get to a character that you would have to encode, you've already lost being human friendly (and many people don't even know where the location bar is, let alone pay any attention to it).

Machines, on the other hand, don't have a problem with escaped characters. They may have a problem with unescaped characters that they don't recognize. As HTTP is incredibly simple, there are tons of clients available, including little custom clients for doing things like mashups. So even if the major browsers are able to plow through your unescaped characters, you may break some unforeseen client. Overall, you're left with no practical benefit but higher risk.
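To illustrate how cheap escaping is on the machine side, here is a minimal sketch in Python that percent-encodes the last path segment from the question and decodes it again. The helper name `encode_segment` is hypothetical, and the choice of Python's standard `urllib.parse` is an assumption; the question doesn't say what stack is in use.

```python
from urllib.parse import quote, unquote


def encode_segment(segment: str) -> str:
    """Percent-encode a single path segment (hypothetical helper).

    safe="" tells quote() to escape every reserved character,
    including ";", ":", "," and "/".
    """
    return quote(segment, safe="")


# The last path segment from the URL in the question.
segment = "file;key:value,key2:value2.html"
encoded = encode_segment(segment)

print(encoded)                      # file%3Bkey%3Avalue%2Ckey2%3Avalue2.html
print(unquote(encoded) == segment)  # True
```

Any client that handles percent-encoding can recover the original segment, so the escaped form loses nothing for machines.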

Matt Whipple