I finally found the reason for this "polemic" when I googled the term "rfc dot path".
The problem with the dot (.) in URLs
It's fine to use dots in a URL (even a rewritten one), such as:
http://example/project/hello-new-world
or, assuming we create a "fake" file URL, such as:
http://example/project/index.php/hello-new-world.html
The problem occurs when you use something like this:
http://example/project/test./
To the server, /project/test./ and /project/test/ are the same thing, even though they are visibly different URLs.
Note that the problem does NOT occur with /project/.test/, since names that start with a dot are legitimate, like .htaccess
Rewritten URLs avoid dots to prevent this problem and to make URL canonicalization (URL normalization) easier.
A clearer example of the problem: create a file in a physical folder on localhost:
/var/www/images/test.jpg
Go to http://localhost/images/test.jpg
and then try to access all of these:
http://localhost/images/test.jpg.
http://localhost/images/test.jpg...
http://localhost/images/test.jpg....
http://localhost/images/test.jpg.....
http://localhost/images/test.jpg......
http://localhost/images/test.jpg.......
All of these URLs are delivered to the client (a web browser, for example) as the image test.jpg.
URL normalization (or URL canonicalization)
URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. The goal is to turn a URL into a normalized, or canonical, URL so that you can determine whether two syntactically different URLs are equivalent.
Search engines use URL normalization to assign importance to web pages and to reduce indexing of duplicate pages. Crawlers normalize URLs to avoid crawling the same resource more than once.
Types of normalization (some of these, such as dot-segment removal, are described in RFC 3986):
Removal of the directory index. Default directory indexes are generally not required in URLs:
http://www.example.com/a/index.html
→ http://www.example.com/a/
Replacing an IP address with the domain name. Check whether the IP address maps to a canonical domain name:
http://208.77.188.166/
→ http://www.example.com/
(the Host: domain header helps with this)
Removing duplicate slashes. Paths that include two adjacent slashes can be converted to one:
http://www.example.com/foo//bar.html
→ http://www.example.com/foo/bar.html
Removing or adding www as the first domain label. Both URLs often point to the same page:
http://www.example.com/
→ http://example.com/
Removing the ? when the query is empty. When the query is empty, there may be no need for the ?:
http://www.example.com/display?
→ http://www.example.com/display
Adding a trailing / to directories:
http://www.example.com/alice
→ http://www.example.com/alice/
(Apache and Nginx usually perform this redirect already if it is a real folder).
However, there is no way to know whether a URL path component is a directory or not. RFC 3986 notes that if the first URL redirects to the second, this is an indication that they are equivalent.
Removing dot-segments. The segments .. and . can be removed from a URL according to the algorithm described in RFC 3986:
http://www.example.com/../a/b/../c/./d.html
→ http://www.example.com/a/c/d.html
However, if a removed .. component, e.g. b/.., is a symlink to a directory with a different parent, eliding b/.. will result in a different path and URL. In rare cases, depending on the web server, this may even be true for the root directory (e.g. //www.example.com/.. may not be equivalent to //www.example.com/). (This is the likely reason to avoid the dot.)
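The dot-segment removal step from the list above can be sketched in PHP. This is a minimal sketch following the remove_dot_segments steps in RFC 3986, section 5.2.4. The function name removeDotSegments is hypothetical, and the function works on the path component only (no scheme, host, or query).

```php
<?php
declare(strict_types=1);

/**
 * Sketch of remove_dot_segments (RFC 3986, section 5.2.4).
 * Hypothetical helper: takes a URL path and returns it without "." and ".." segments.
 */
function removeDotSegments(string $path): string
{
    $input = $path;
    $output = [];

    while ($input !== '') {
        if (strncmp($input, '../', 3) === 0) {
            // Step A: drop a leading "../"
            $input = substr($input, 3);
        } elseif (strncmp($input, './', 2) === 0) {
            // Step A: drop a leading "./"
            $input = substr($input, 2);
        } elseif (strncmp($input, '/./', 3) === 0) {
            // Step B: "/./" becomes "/"
            $input = substr($input, 2);
        } elseif ($input === '/.') {
            // Step B: a final "/." becomes "/"
            $input = '/';
        } elseif (strncmp($input, '/../', 4) === 0) {
            // Step C: "/../" becomes "/" and the last output segment is dropped
            $input = substr($input, 3);
            array_pop($output);
        } elseif ($input === '/..') {
            // Step C: a final "/.." becomes "/" and the last output segment is dropped
            $input = '/';
            array_pop($output);
        } elseif ($input === '.' || $input === '..') {
            // Step D: a lone "." or ".." is discarded
            $input = '';
        } else {
            // Step E: move the first segment (with its leading "/", if any) to the output
            $next = strpos($input, '/', 1);
            if ($next === false) {
                $output[] = $input;
                $input = '';
            } else {
                $output[] = substr($input, 0, $next);
                $input = substr($input, $next);
            }
        }
    }

    return implode('', $output);
}

echo removeDotSegments('/../a/b/../c/./d.html'), PHP_EOL; // /a/c/d.html
echo removeDotSegments('/project/test./'), PHP_EOL;       // /project/test./
```

Note that the second call returns its input unchanged: test. is an ordinary segment name, not a dot-segment, so this normalization step does not turn /project/test./ into /project/test/.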
Then you ask me: must I avoid dots in my rewritten URLs?
I'd say that is one solution, but not the only one. If you are using mod_rewrite, you are probably also using a language like PHP, and with that language you can detect whether the URL has dots at the end, e.g.:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9\-\/.]+)$ index.php/$1 [QSA,L]
</IfModule>
This RewriteRule populates the variable $_SERVER['PATH_INFO'], and you can compare that variable with $_SERVER['REQUEST_URI']; the two will differ. Or you can just use REQUEST_URI combined with rtrim to check and issue a permanent redirect, e.g.:
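<?php
// Drop the trailing slash so /test./ and /test. are handled alike.
$req = rtrim($_SERVER['REQUEST_URI'], '/');
// If removing trailing dots changes the URL, redirect permanently to the dot-free URL.
if ($req !== rtrim($req, '.')) {
    header('Location: ' . rtrim($req, '.'), true, 301);
    exit;
}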
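The check above looks only at the end of the whole REQUEST_URI, query string included, so a URL like /project/test./?a=1 would slip through. A slightly more defensive version might look like the sketch below. The function name trailingDotRedirectTarget is hypothetical, and the choice to leave a final . or .. segment alone (those are dot-segments, not file names) is an assumption of this sketch, not something the original check does.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the URL to redirect to when the last path
 * segment ends in one or more dots, or null when no redirect is needed.
 */
function trailingDotRedirectTarget(string $requestUri): ?string
{
    // Split off the query string so dots inside it are ignored.
    $parts = explode('?', $requestUri, 2);
    $path = $parts[0];
    $query = isset($parts[1]) ? '?' . $parts[1] : '';

    // Remember whether the path ended in a slash so it can be restored later.
    $hasSlash = strlen($path) > 1 && substr($path, -1) === '/';
    $trimmed = rtrim($path, '/');

    $segments = explode('/', $trimmed);
    $last = end($segments);

    // Assumption: "." and ".." are dot-segments and are left to other normalization.
    if ($last === '.' || $last === '..' || rtrim($last, '.') === $last) {
        return null;
    }

    $clean = rtrim($trimmed, '.');
    if ($clean === '') {
        $clean = '/';
    }
    if ($hasSlash && substr($clean, -1) !== '/') {
        $clean .= '/';
    }

    return $clean . $query;
}

// Usage: issue a permanent redirect when the current request ends in dots.
$target = trailingDotRedirectTarget($_SERVER['REQUEST_URI'] ?? '/');
if ($target !== null) {
    header('Location: ' . $target, true, 301);
    exit;
}
```

Keeping the decision in a function that only returns a string (or null) makes it easy to check against a handful of URLs before wiring it to header().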