RFC 1738, Section 2.2, says:
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
After some searching, I concluded that there are three types of characters to consider when encoding:
- Unsafe characters: '#', ' ', '"', '%', '<', '>', '{', '}', '|', '\', '^', '~', '[', ']', '`'.
- Reserved characters: ';', '/', '?', ':', '@', '=', '&'.
- Special characters: "$-_.+!*'(),"
I know why and when unsafe characters and reserved characters should be encoded. RFC 1738 states that special characters can be used unencoded, but I found that urllib2.quote also encodes these special characters, producing "%24-_.%2B%21%2A%27%28%29%2C%7E". So I am a little confused: why are special characters encoded if they can be used unencoded within a URL, and what makes them special?
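
For reference, here is a minimal sketch that reproduces the behavior. It assumes Python 3, where the function is available as urllib.parse.quote (urllib2.quote is the Python 2 spelling); the names SPECIAL, encoded, and kept are only for illustration.

```python
from urllib.parse import quote

# The RFC 1738 "special" characters from the question (illustrative name).
SPECIAL = "$-_.+!*'(),"

# Default call: safe="/" only, so most of the special characters
# are percent-encoded; "-", "_" and "." are always left alone.
encoded = quote(SPECIAL)
print(encoded)  # %24-_.%2B%21%2A%27%28%29%2C

# Listing the special characters in `safe` keeps them unencoded.
kept = quote(SPECIAL, safe=SPECIAL)
print(kept)  # $-_.+!*'(),
```

So the encoding seems to come from the default safe argument of quote rather than from anything the RFC requires, which is the part I would like explained.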