
I have a website on a shared server with some very basic PHP pages in the public_html directory, as well as some subdirectories containing other pages:

index.php
test.php
subdir1/index.php
subdir2/index.php

Looking at my visitor logs, I'm getting visits to index.php/some_text, index.php/some_other_text and so on. Naively, I would expect those requests to receive an HTTP 404 status, as a) there is no directory called index.php and b) no files called some_text or some_other_text exist. However, Apache is returning the file index.php with an HTTP 200 status.

Is there something I can set in .htaccess that will return a 404 status in these cases, without restricting the valid subdirectories?

I found some suggestions to set "DirectorySlash Off" but that made no difference. I've also tried

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=404,L]

But that too made no difference. Thanks.

RavinderSingh13
rbassett
  • A URL which has index.php can't have `/` after it in general. So maybe you mean it's a query string, e.g. `http://singh.test.com/test/index.php?etc_etc_bla_bla`? If so, please add some sample URLs to your question to make it clearer, thank you. – RavinderSingh13 Dec 15 '21 at 19:14
  • @RavinderSingh13 "a URL which has index.php can't have `/` after it in general." - Yes it can, unless this has been explicitly disabled. The PHP handler permits this. However, if you were to try `index.html` instead, then it would be rejected. – MrWhite Dec 15 '21 at 19:28

2 Answers


I'm getting visits to index.php/some_text and index.php/some_other_text and so on.

The part of the URL that starts with a slash and follows a physical file is called additional pathname information (or path-info). So, /some_text (in your example) is path-info.

In this case, index.php receives the request and /some_text is passed to the script via the PATH_INFO environment variable (in PHP, this is available as $_SERVER['PATH_INFO']).

By default, whether path-info is valid on the URL is dependent on the handler responsible for the request. PHP files allow path-info by default, but .html files do not. So, by default /index.html/some-text will result in a 404.

You can disable path-info by setting AcceptPathInfo Off in your Apache config / .htaccess file. By doing this, a request for /index.php/some-text will now result in a 404.

Conversely, if you set AcceptPathInfo On then /index.html/some-text will also be permitted.
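As a minimal sketch, the directive could sit on its own in the .htaccess file in the document root, from where it also applies to the subdirectories. This assumes the host allows the directive in .htaccess (it needs AllowOverride FileInfo), which may not be the case on a shared server.

```apache
# .htaccess in public_html (sketch; assumes AllowOverride FileInfo is granted)
# Off: a request is only accepted if the URL maps exactly to an existing path,
# so /index.php/some-text is answered with a 404 instead of running index.php.
AcceptPathInfo Off
```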

Alternatively, you can use mod_rewrite in .htaccess to explicitly trigger a 404 for such URLs. For example, to target .php files (anywhere) only:

RewriteEngine On

RewriteRule \.php/ - [R=404]

Or, just .php files in the document root:

RewriteRule ^[^/]+\.php/ - [R=404]

Or, you can explicitly check the PATH_INFO server variable to block any URL that includes path-info. For example:

RewriteCond %{PATH_INFO} .
RewriteRule . - [R=404]

Note that some frameworks use path-info to route requests in a front-controller pattern (as opposed to using a query string or parsing the requested URI directly).
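If one script does depend on path-info in this way, it can be excluded from the block with an extra condition. The following is only a sketch: /app.php is a hypothetical front controller that is not part of the question's site, so substitute the real script name.

```apache
RewriteEngine On

# Hypothetical exception: leave requests to the front controller alone
RewriteCond %{REQUEST_URI} !^/app\.php/
# Block every other request that carries path-info
RewriteCond %{PATH_INFO} .
RewriteRule . - [R=404]
```

The two conditions are ANDed by default, so the 404 is only triggered when the request has path-info and is not addressed to the excluded script.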


I found some suggestions to set "DirectorySlash Off"

That has nothing to do with this issue. Setting DirectorySlash Off prevents mod_dir from appending trailing slashes to requests for directories.

MrWhite
  • Thank you for the very clear and useful answer, I was unaware of the additional pathname information. Unfortunately setting `AcceptPathInfo Off` in my .htaccess didn't cause a 404 to be returned. I made a test.php file and added ``, then called `mydomain.co.uk/test.php/asdasd` which returned /asdasd. Perhaps there is another way by doing some convoluted RewriteCond? – rbassett Dec 16 '21 at 15:36
  • @rbassett `AcceptPathInfo` may not work for a number of reasons... It could be disabled in the server config, or a front-end proxy serves your static content, or the request is overridden by other mod_rewrite directives. I've updated my answer with some alternative methods of blocking path-info with mod_rewrite. – MrWhite Feb 02 '22 at 17:41

I have since tried

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/[^/]+\.php/.*$ 
RewriteRule ^(.*)$ - [R=404,L]

This should then affect only *.php files in the root directory, leaving any subdirectories alone. I think. It produces the behaviour I want, but it doesn't feel like a good solution.

rbassett
  • This is a reasonable solution for "blocking" these types of requests. However, your rule can be simplified. The `RewriteCond` directive is not necessary as this check can be made (more efficiently) in the `RewriteRule` directive itself. Likewise, there's no need for the capturing backreference in the `RewriteRule` directive. It could be simplified to just a single directive: `RewriteRule ^[^/]+\.php/ - [R=404]` (as per the second example in my answer.) – MrWhite Feb 02 '22 at 17:48