
URLs can only be sent over the Internet using the ASCII character-set. Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format.
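As a rough illustration of that conversion, here is a minimal sketch using Python's standard `urllib.parse` module. The path `/wiki/café` is a made-up example, not something from the quoted text. The non-ASCII character is first encoded as UTF-8 bytes, and each byte is then written as `%XX`:

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character (é).
path = "/wiki/café"

# quote() encodes to UTF-8 and percent-encodes every byte that is not
# an unreserved ASCII character; "/" is left alone by default.
encoded = quote(path)
print(encoded)  # /wiki/caf%C3%A9

# Decoding reverses the process and recovers the original text.
print(unquote(encoded) == path)  # True
```

The encoded form contains only ASCII characters, so it can be carried by anything that handles plain ASCII text.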

Why are URLs limited to the ASCII character set, and why do they need to be encoded?

R H
  • See http://stackoverflow.com/questions/2742852/unicode-characters-in-urls, and [the answer that talks about IRI vs. URI](http://stackoverflow.com/a/2744184/166390) in particular. – Aug 10 '12 at 06:22

2 Answers


URLs are limited to ASCII because that is how they were defined in the RFC. While the original RFC has been updated by others over time, section 2.2 of RFC 1738 is a good place to start for more information.

pizen

Why are URLs limited to the ASCII character set?

Otherwise, you would need a different naming standard for systems that aren't 8-bit clean. At the time the URL standard was developed, such systems weren't terribly uncommon. Remember, URLs are not specific to the Internet. They're supposed to be a universal naming standard.

Why should URLs be encoded?

Otherwise you won't get the resource you're looking for. If you decide that "no" means yes and "yes" means no, nobody will understand what you're talking about. That's why we have standards.
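To make that concrete, here is a small sketch (again with Python's `urllib.parse`, and a made-up search value `Tom & Jerry`) of what can go wrong when a reserved character is left unencoded in a query string. The receiver treats `&` as a parameter separator, so it does not see the value the sender intended:

```python
from urllib.parse import parse_qs, quote

# Hypothetical search term containing "&", which is reserved in query strings.
value = "Tom & Jerry"

# Unencoded: the "&" is read as a separator between parameters.
raw_query = "q=" + value
print(parse_qs(raw_query))

# Encoded: safe="" percent-encodes every reserved character, "&" included.
safe_query = "q=" + quote(value, safe="")
print(safe_query)  # q=Tom%20%26%20Jerry
print(parse_qs(safe_query))  # {'q': ['Tom & Jerry']}
```

Only the encoded form round-trips to the original value, because both sides agree on what `%26` means.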

David Schwartz