38

This just came to mind while I was going through some websites that use a combination of upper and lower case in their URLs, something like http://www.domain.com/Home/Article

As far as I know we should always use lowercase in URLs, but I have no idea about the technical reason. I would like to learn from the experts here and clear up this concept: why use lowercase in URLs? What are the advantages and disadvantages of upper case URLs?

Code Lover
  • Some of the biggest websites on the web don't even do this, so it's not really something that is considered a best practice. – Hugo May 23 '17 at 15:57
  • For reference: Google Webmaster Trends Analyst [John Mueller said](https://twitter.com/JohnMu/status/877952088030007297), "URLs are case-sensitive, but pick whatever case you want." – showdev Nov 19 '19 at 21:32
  • Does this answer your question? [Should URL be case sensitive?](https://stackoverflow.com/questions/7996919/should-url-be-case-sensitive) – TylerH May 28 '21 at 13:59

4 Answers

52

The domain part is not case sensitive: GoOgLe.CoM works. You can add uppercase as you like, but normally there's no reason to do so and, as stated in the comments below, it may hurt your SEO ranking.

The path part may or may not be treated as case sensitive, depending on the server software and operating system. Typically Windows machines are case insensitive, while Linux machines are case sensitive. This means that you should stick to lowercase, or you risk introducing a bug that's really hard to hunt down (a case mismatch that doesn't matter on the dev server).

The query string part is passed to the server as is. You can use mixed case as you like, or discard the case (toLowerCase(...)). This also means that base64-encoded keys will work. You can't expect users to type them correctly, though.

The hash part (called the "fragment identifier") is only available to client code, not to the server. JavaScript may distinguish between cases as it likes, and so does the browser: url#a will scroll to the element with the ID a, but url#A won't.
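To illustrate the four parts described above, here is a minimal Python sketch (not part of the original answer) that lowercases only the scheme and host and leaves the path, query string and fragment untouched. The function name `normalize_host_case` and the sample URL are made up for illustration, and the sketch assumes the URL carries no `user:password@` prefix.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_host_case(url: str) -> str:
    """Lowercase the scheme and host only; keep path, query and fragment as given."""
    parts = urlsplit(url)
    # Assumption: no "user:password@" prefix in the URL, so lowercasing
    # the whole netloc only changes the host name.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # The host is folded to lowercase; everything after it keeps its case.
    print(normalize_host_case("http://GoOgLe.CoM/Home/Article?Key=AbC#Top"))
```

Whether `/Home/Article` and `/home/article` then reach the same page still depends on the server, which is why the path is deliberately left alone here.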

John Dvorak
  • Your answer has almost resolved my confusion, but one more thing: so there is no technical or SEO reason to avoid upper case or mixed case, other than to avoid that kind of case-related bug? – Code Lover Nov 22 '12 at 12:14
  • Some servers let you distinguish files or directories by case, some don't. That leaves you with very little reason to use case. I don't think SEO's an issue. I can't rule out a bug with some servers applying `toLowerCase` to any URL and then not finding the directory. Sounds unlikely, though. – John Dvorak Nov 22 '12 at 12:30
  • 7
    From an SEO perspective you should use all lowercase, as Google will see www.domain.com/Home/Article and www.domain.com/home/article as two different pages, which will dilute their search rankings. – oenpelli Jul 31 '13 at 23:35
  • 2
    "The path part is or is not case sensitive," - it's always case sensitive. `/Home`and `/home` are different URLs, no matter which server software. – Daniel W. Jun 15 '16 at 08:27
  • @DanFromGermany feel free to correct and clarify that part. The point is that in general Windows-based routers will ignore the casing of the path part. I've just tested it in our code. The ASP.net router, even though not 1:1 mapping to a file structure, will happily match a mixed case in path. – John Dvorak Jun 15 '16 at 09:01
  • 1
    The URL is always case-sensitive, but it may be treated as case-insensitive. Please read the http/html/url/uri specs https://www.w3.org/TR/WD-html40-970708/htmlweb.html – Daniel W. Jun 15 '16 at 09:45
  • 1
    The fragment identifier ("#hashtag") is not available to the server, as it is not part of the HTTP protocol. – Daniel W. Jun 15 '16 at 09:48
  • @DanFromGermany the fragment identifier is not available to the server (will fix) according to my testing, but where is that stated in the [RFC](https://tools.ietf.org/html/rfc7230)? The RFC does define that as a part of the URI. – John Dvorak Jun 15 '16 at 10:17
  • 1
    @JanDvorak The fragment identifier is part of the URI but not part of the HTTP protocol. A URI is not only http://...; it can also be irc://... etc. – Daniel W. Jun 15 '16 at 10:28
  • @DanFromGermany fixed. Is there any chance it's something that was changed in the past few years? – John Dvorak Jun 15 '16 at 11:24
  • @JanDvorak nope, it's been always like this :-) – Daniel W. Jun 15 '16 at 12:12
  • Just wondering if for special chars that are encoded if case also makes a difference and should always be lower case? I.e. `4%C2%BD-years` vs `4%c2%bd-years` which both when decoded mean `4½-years`. – radtek Nov 01 '19 at 22:26
  • @radtek usually the server code just throws the relevant bits of URL into its own equivalent of urldecode or the server does that for it, losing all capitalization information in the encoding. For the domain part, there's an entirely different encoding to be used instead, which uses lower-case only. – John Dvorak Nov 02 '19 at 04:24
  • @JohnDvorak actually Django capitalizes only the encoded character part of the URL. So if you request a URL with a slug of `onE-4%c2%bd`, `onE-4%C2%BD`, or `onE-4½`, the request object (`request.get_full_path()`) will store it as `onE-4%C2%BD`, which leads me to assume all of those URLs mean the same thing. And I'm hoping it's not a problem for SSO allowing all those patterns as a canonical URL. – radtek Nov 04 '19 at 16:48
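The point in the last few comments about percent-encoded characters can be checked with a small Python sketch using only the standard library. The helper name `normalize_percent_case` is hypothetical, and decoding then re-encoding is a simplification: it assumes the path contains no encoded reserved characters (such as `%2F`) whose meaning would change when decoded.

```python
from urllib.parse import quote, unquote


def normalize_percent_case(path: str) -> str:
    """Decode and re-encode a path so every percent escape uses the same hex case."""
    # Simplifying assumption: the path has no encoded reserved characters
    # (for example %2F), so a decode/encode round trip does not change its meaning.
    return quote(unquote(path), safe="/")


if __name__ == "__main__":
    # Both spellings of the escape decode to the same text.
    print(unquote("4%C2%BD-years") == unquote("4%c2%bd-years"))
    # Python's quote() emits uppercase hex digits in its escapes.
    print(normalize_percent_case("onE-4%c2%bd"))
```

Note that only the escapes are normalized; the letters of the slug itself (`onE`) keep whatever case they arrived with.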
18

I'm going to have to disagree with all established wisdom on this, so I'll probably get downvoted, but:

If you redirect all mixed case urls to your properly cased url, it solves all the problems mentioned. Therefore it seems this argument is coming from tradition and preference. The point of a URL is to have a user-friendly representation of a page, and if your url is friendlier with upper case, why not use it? Compare:

moviesforyoutowatch.com/batman-vii-the-dark-knight-whatevers
MoviesForYouToWatch.com/Batman-VII-The-Dark-Knight-Whatevers

I find the mixed case version superior for the purpose. If there's a technical reason that can't be solved with a lower-case compare and redirect, please share it.
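The "lower-case compare and redirect" idea could look roughly like the Python sketch below. It is framework-agnostic and only an illustration: `CANONICAL_PATHS` and `resolve` are hypothetical names, the paths are sample data, and the comparison assumes ASCII paths so that `str.lower()` is a sufficient case fold.

```python
from typing import Optional, Tuple

# Hypothetical list of canonical paths, written in the case the site prefers.
CANONICAL_PATHS = ["/Batman-VII-The-Dark-Knight-Whatevers", "/About"]

# Lookup table keyed by the lowercased path (assumes ASCII-only paths).
_BY_LOWER = {path.lower(): path for path in CANONICAL_PATHS}


def resolve(requested_path: str) -> Tuple[int, Optional[str]]:
    """Return (status, canonical_path) for a requested path.

    200: the request already uses the canonical casing.
    301: the request differs only by case and should be redirected.
    404: no page matches, even when case is ignored.
    """
    canonical = _BY_LOWER.get(requested_path.lower())
    if canonical is None:
        return 404, None
    if canonical == requested_path:
        return 200, canonical
    return 301, canonical


if __name__ == "__main__":
    print(resolve("/batman-vii-the-dark-knight-whatevers"))
    print(resolve("/About"))
    print(resolve("/missing"))
```

A permanent redirect is used here so that every casing of a URL collapses to one address; this addresses the duplicate-page concern raised in the comments on the first answer, assuming clients and crawlers follow the redirect.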

Dirigible
  • 7
    The problem with mixed case is social media, if you care about Facebook likes, for example. A URL shared on Facebook is case sensitive. If for some reason, someone shared your URL in lowercase, that's a different URL. That is why the safe approach is to stick to all lowercase rather than mixed case. Besides, users don't look at a URL. Users only click links. – Ross Mar 07 '17 at 05:21
  • 3
    If for some reason, someone shared your url in UPPERCASE, that's a different URL. That is why the safe approach is to stick to all UPPERCASE rather than mixed case. – Gqqnbig Jan 21 '18 at 17:32
  • 2
    If for some reason, someone shared your url in KEBABCASE, that's a different URL. That is why the safe approach is to stick to all KEBABCASE rather than mixed case – Vad Sep 05 '18 at 20:15
  • 1
    If for some reason, someone shared your url in SNAKE_CASE, that's a different URL. That is why the safe approach is to stick to all SNAKE_CASE rather than mixed case – 无名小路 Jul 27 '21 at 13:33
  • 1
    If for some reason, someone shared your url in CamelCase, that's a different URL. That is why the safe approach is to stick to all CamelCase rather than mixed case – Dimi Ansari Aug 02 '21 at 07:12
  • If for some reason, someone shared your url in nUtCAsE, that's a different URL. That is why the safe approach is to stick to all nUtCAsE rather than mixed case – Conan Apr 05 '23 at 10:50
13

I know you asked for technical reasons but it's also worth considering this from a UX perspective.

Say you have a URL with upper case characters and, for argument's sake, it has been distributed on printed media. When users come to enter that URL into their browser, they may well feel compelled to match that case (or be forced to match it if your web server is case sensitive). Ultimately you are giving them more work to do, as they have to consider case as well. After all, they don't know whether your server is case sensitive, and they may have experienced 404s from case sensitive web servers in the past.

If your server is case sensitive and you are using mixed case URLs, you are giving the user more scope to mistype the URL. Furthermore, say you have the URL www.example.com/Contact. It's easy to confuse an upper and lower case "c" (especially if it is copied in handwriting); if the user overlooks this and uses the wrong case, they may never reach your content.

With all this in mind, consider www.example.com/News/Articles/FreeIceCreamForAll. On a keyboard that's not too difficult, but on a mobile device it would be very fiddly to input.

The reverse is also true should a user want to write down a URL from the address bar. They may feel they need to match the case, which gives them more work to do and increases the likelihood of errors.

To conclude: keep URLs lower case.

a gorsky
-19

REGARDING SECURITY ASPECTS OF THIS ISSUE:

There is actually a good security reason to use a mix of uppercase and lowercase.

It has the effect of confusing and blocking attackers!

In human conversation humans get easily confused with uppercase and lowercase use.

Humans can't "speak" the words of "identifiers or passwords or URLs" with clarity if they contain uppercase and lowercase.

This helps with security on data or passwords on site sub-parts that are provided as part of a locked-in or secure sub-part of an "automated access" part of sites or their data.

It's similar to NOT USING JSON.

JSON is "human-readable text", and so JSON is simply giving all the attackers (including governments and Google, who steal your ideas and data) almost everything they need to know about the data. It's much more secure to confuse them by using private, bespoke, very fast "binary protocols" that use your own "unknowable data structures". Just watch out, because it is actually possible to confuse yourself or your own development team.

All your security layers and protocols have to be "well managed" to avoid confusion.

There is therefore an extra level of site and data security from human attackers (and some robots) to be had by simply using totally unconventional systems (i.e. why on earth would anybody want to use a "standard security protocol" when by some simple heavyweight prior computing they can all be easily broken).

Just "salt and hash" everything, and add some extra bespoke security of your own. It's just common sense!

Conclusion: All the above answers are very clear and correct - but you can also happily leverage that very same knowledge to confuse potential attackers.

John Dvorak
Clive Williams
  • 10
    Security through obscurity is poor security. Moreover, "attackers" are not going to use speech to communicate. Email is sooo much more reliable even if you disregard easier transmission of case. – John Dvorak May 08 '14 at 18:27
  • Thanks a lot for the negative score (whaaaa) ... however, I will stick to my guns and my answer, because even GCHQ has historic military coded messages from WW2 that it still CANNOT decode, just sitting there in plain text, because ALICE and BOB used "unique to them" encryption algorithms that were NOT standard and that they had pre-agreed. So even today EVE (with all the power of GCHQ behind her) cannot decrypt their messages. – Clive Williams Dec 23 '16 at 22:01
  • 3
    URLs are supposed to be readable by humans. If you consider users being able to access your site a security issue, then don't publish on the Web. – sba Oct 30 '17 at 14:04
  • We will have to disagree as at Inferix Sentient AI the most important thing is that our mainly non-human (AI entities) have access, hence my reference to pre-agreed non-standard protocols. We need access for our most clued up humans (in Cheltenham) but NO ACCESS for humans that are not in group - so using complex rules & using non-standard protocols are best FOR US - So (for us) it's often about stopping access and making "human blind alleys" for access. Access straight into your mind or your workplace "hive mind" is something you will want to block - but sometimes allow "to those you trust" ! – Clive Williams Oct 31 '17 at 14:50
  • 2
    Original thinking and nice writeup. Technically I have to agree that 'security by obscurity' is lower form, so this is - simply :) - not the way to go. Leaving the key under the doormat is not the way for professional software solutions. But please leave this answer as this idea - though not trivial - might sprout up in others. – Bart Jan 19 '18 at 07:46