75

About the system

I have URLs of this format in my project:

http://project_name/browse_by_exam/type/tutor_search/keyword/class/new_search/1/search_exam/0/search_subject/0

Here the keyword/class pair means "search for the keyword class".

I have a common index.php file which executes for every module in the project. The only rewrite rule removes index.php from the URL:

RewriteCond $1 !^(index\.php|resources|robots\.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L,QSA]

I am using urlencode() while preparing the search URL and urldecode() while reading the search URL.

Problem

Only the forward slash character is breaking URLs causing 404 page not found error. For example, if I search one/two the URL is

http://project_name/browse_by_exam/type/tutor_search/keyword/one%2Ftwo/new_search/1/search_exam/0/search_subject/0/page_sort/

How do I fix this? I need to keep index.php hidden in the URL. If that were not required, the forward slash would not be a problem and I could use this URL:

http://project_name/index.php?browse_by_exam/type/tutor_search/keyword/one%2Ftwo/new_search/1/search_exam/0/search_subject/0
Captain Man
Sandeepan Nath
  • 1
    I feel it is best to have URLs like this:- `http://project_name/browse_by_exam?type/tutor_search/keyword/class %2Fnew/new_search/1/search_exam/0/search_subject/0` That way I get rid of the difficulty of readability caused by &param1=value1&param2=value2 convention and also I am able to allow forward slashes (now in the query string part by using `?`) I would avoid AllowEncodedSlashes because Bobince said `Also some tools or spiders might get confused by it. Although %2F to mean / in a path part is correct as per the standard, most of the web avoids it.` url .htaccess url-routing – Sandeepan Nath Jul 13 '10 at 11:26
  • 1
    you can use %2F if use using this way ?param1=value1&param2=value%2Fvalue but if you use /param1=value1/param2=value%2Fvalue it will throw an error. – Ahmad Nov 20 '11 at 15:02
  • Related:[Is a slash (“/”) equivalent to an encoded slash (“%2F”) in the path portion of an HTTP URL](http://stackoverflow.com/q/1957115/95735) – Piotr Dobrogost Dec 04 '12 at 16:11

13 Answers

152

Apache denies all URLs with %2F in the path part, for security reasons: scripts can't normally (i.e. without rewriting) tell the difference between %2F and / because the PATH_INFO environment variable is automatically URL-decoded (which is stupid, but it is a long-standing part of the CGI specification, so nothing can be done about it).

You can turn this feature off using the AllowEncodedSlashes directive, but note that other web servers will still disallow it (with no option to turn that off), and that other characters may also be taboo (eg. %5C), and that %00 in particular will always be blocked by both Apache and IIS. So if your application relied on being able to have %2F or other characters in a path part you'd be limiting your compatibility/deployment options.

> I am using urlencode() while preparing the search URL

You should use rawurlencode(), not urlencode() for escaping path parts. urlencode() is misnamed, it is actually for application/x-www-form-urlencoded data such as in the query string or the body of a POST request, and not for other parts of the URL.

The difference is that + doesn't mean space in path parts. rawurlencode() will correctly produce %20 instead, which will work both in form-encoded data and other parts of the URL.
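To make the difference concrete, here is a minimal sketch comparing the two functions. The search term is a made-up example; note that both functions still turn the slash into %2F, so this alone does not get the URL past Apache.

```php
<?php
// Hypothetical search term containing a space and a slash.
$term = 'one two/three';

// urlencode() follows application/x-www-form-urlencoded: space becomes "+".
$form_encoded = urlencode($term);    // one+two%2Fthree

// rawurlencode() follows RFC 3986 percent-encoding: space becomes "%20".
$path_encoded = rawurlencode($term); // one%20two%2Fthree

echo $form_encoded, "\n", $path_encoded, "\n";
```

rawurldecode() reverses the second form, so a path segment built this way round-trips cleanly.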

bobince
  • 4
    Ah, so that is why the slash is denied. Perfect diagnosis and treatment. – Pekka Jul 13 '10 at 08:29
  • 1
    +1 I tried explaining some of this in one of his other questions, but you did it far more coherently than I was able to. – Tim Stone Jul 13 '10 at 08:48
  • 6
    Hi Bobince, `rawurlencode()` too converts forward slashes to `%2F` which is still breaking my URL. I did not understand actually how `rawurlencode()` fix my problem. – Sandeepan Nath Jul 13 '10 at 09:51
  • 2
    It doesn't, that's a side-issue to do with `+` vs. `%20`. The fix is `AllowEncodedSlashes`, although relying on that reduces your deployment possibilities (ie. you can't deploy it on IIS, and other users—if there are any—won't be able to deploy it if they are using shared hosting with no access to the `httpd.conf`). Also some tools or spiders might get confused by it. Although `%2F` to mean `/` in a path part is correct as per the standard, most of the web avoids it. – bobince Jul 13 '10 at 09:58
  • ok I understood... thank you again for the explanation. But, why then in normal URLs of the following format encoded forward slashes i.e. %2F are allowed in the path part:- `http://project_name/index.php?type=tutor_search&keyword=one%2Ftwo&new_search=1&search_exam=0&search_subject=0` By normal URLs I mean the default URL convention I started with in LAMP projects. – Sandeepan Nath Jul 13 '10 at 10:08
  • ok I got it! in the default LAMP URLs, encoded forward slashes are allowed in the query string part and I am trying to allow in the path part of my URL. – Sandeepan Nath Jul 13 '10 at 10:35
  • 1
    Yes, any sequence of encoded bytes must be allowed in the query string. Whilst any encoded byte is technically valid in a path component as per the URL RFC, servers have trouble with some of them due to the path part traditionally being used for filenames. Apart from `%00`, `%2F` and `%5C`, IIS will also give you trouble with non-ASCII byte sequences in the path that are not valid UTF-8 sequences. – bobince Jul 13 '10 at 12:40
  • @bobince Now one question remains. If there is no standard way of doing this, how are you supposed to do it? If someone has, say, some wiki software which might be deployed on different server software and where users can submit links into their content, surely when that data comes out of the database you want to guarantee that your URIs are valid. What is the standard way of doing that? Manually convert back your forward slashes? –  Jan 23 '13 at 00:09
  • Interesting reading [When to Encode or Decode in RFC3986](https://tools.ietf.org/html/rfc3986#section-2.4). – koppor Jul 20 '17 at 22:46
12

Replace %2F with %252F after url encoding

PHP

function custom_http_build_query($query=array()){

    return str_replace('%2F','%252F', http_build_query($query));
}

Handle the request via htaccess

.htaccess

RewriteCond %{REQUEST_URI} ^(.*?)(%252F)(.*?)$ [NC]
RewriteRule . %1/%3 [R=301,L,NE]

Resources

http://www.leakon.com/archives/865

RafaSashi
5

In Apache, AllowEncodedSlashes On would prevent the request from being immediately rejected with a 404.

Just another idea on how to fix this.

Paul Jennings
4
$encoded_url = str_replace('%2F', '/', urlencode($url));
MrWhite
CpnCrunch
4

I had the same problem with a slash in a URL GET parameter. In my case the following PHP code works:

$value = "hello/world";
$value = str_replace('/', '&#47;', $value);
$value = urlencode($value);
# $value is now hello%26%2347%3Bworld

I first replace the slash with its HTML entity (&#47;) and then do the URL encoding.

Christian Michael
3

Here's my humble opinion. !!!! Don't !!!! change settings on the server to make your parameters work correctly. This is a time bomb waiting to happen someday when you change servers.

The best way I have found is to just convert the parameter to Base64 encoding. In my case, I'm calling a PHP service from Angular and passing a parameter that could contain any value.

So my typescript code in the client looks like this:

private encodeParameter(parm: string) {
    if (!parm) {
        return null;
    }
    return btoa(parm);
}

And to retrieve the parameter in php:

    $item_name = $request->getAttribute('item_name');
    $item_name = base64_decode($item_name); 
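
One caveat worth sketching: standard Base64 output can itself contain `/` and `+` (and `=` padding), so for a value that ends up in a path segment a URL-safe variant is a common choice. The two helpers below are hypothetical (they are not part of PHP, and the names are my own); they swap `+` and `/` for `-` and `_` and drop the padding. The client would have to apply the same substitution after `btoa()`.

```php
<?php
// Hypothetical helper: Base64 with the URL-safe alphabet and no padding.
function base64url_encode(string $data): string
{
    // Swap "+" and "/" for "-" and "_", then strip the trailing "=".
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

// Hypothetical helper: reverse of base64url_encode().
// Returns the decoded string, or false if the input is not valid.
function base64url_decode(string $data)
{
    $b64 = strtr($data, '-_', '+/');
    // Restore the "=" padding that the encoder stripped.
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    return base64_decode($b64, true);
}
```

For example, `base64_encode('???')` yields `Pz8/`, which would break a path-based URL again, whereas the URL-safe form of the same input contains no slash.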
Jon Vote
2

On my hosting account this problem was caused by a ModSecurity rule that was set for all accounts automatically. Upon my reporting this problem, their admin quickly removed this rule for my account.

Roman
1

Use a different character and replace the slashes server side

e.g. Drupal.org uses %21 (the exclamation mark character !) to represent the slash in a URL parameter.

Both of the links below work:

https://api.drupal.org/api/drupal/includes%21common.inc/7

https://api.drupal.org/api/drupal/includes!common.inc/7

If you're worried that the character may clash with a character in the parameter then use a combination of characters.

So your url would be http://project_name/browse_by_exam/type/tutor_search/keyword/one_-!two/new_search/1/search_exam/0/search_subject/0

Swap the slash out with JS on the client and convert it back to a slash on the server side.

chim
1

For me the simple solution is to use base64_encode:

$term = base64_encode($term);
$url = $youurl.'?term='.$term;

Then decode the term on the receiving side:

$term = base64_decode($_GET['term']);

This way both "/" and "\" are encoded.

0

A standard solution for this problem is to allow slashes by making the parameter that may contain slashes the last parameter in the url.

For a product code url you would then have...

mysite.com/product/details/PR12345/22

For a search term you'd have

http://project/search_exam/0/search_subject/0/keyword/Psychology/Management

(The keyword here is Psychology/Management)

It's not a massive amount of work to process the first "named" parameters then concat the remaining ones to be product code or keyword.

Some frameworks have this facility built in to their routing definitions.

This is not applicable to use cases involving two parameters that may contain slashes.
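
As a rough illustration of the "last parameter" idea, the sketch below reads name/value pairs from the path and, once it reaches the parameter that may contain slashes, glues all remaining segments back together. The function name and the choice of `keyword` as the trailing parameter are assumptions for this example, not something a framework provides.

```php
<?php
/**
 * Sketch: parse "/name/value/name/value/..." into an array, treating
 * everything after $last_param as a single value that may contain slashes.
 */
function parse_search_path(string $path, string $last_param = 'keyword'): array
{
    $segments = explode('/', trim($path, '/'));
    $params = [];

    for ($i = 0; $i < count($segments); $i += 2) {
        $name = $segments[$i];
        if ($name === $last_param) {
            // All remaining segments belong to the last parameter.
            $rest = array_slice($segments, $i + 1);
            $params[$name] = implode('/', array_map('rawurldecode', $rest));
            break;
        }
        // Ordinary pair; a missing value becomes an empty string.
        $params[$name] = rawurldecode($segments[$i + 1] ?? '');
    }

    return $params;
}

$params = parse_search_path('/search_exam/0/search_subject/0/keyword/Psychology/Management');
// $params['keyword'] is "Psychology/Management"
```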

chim
-1

I use javascript encodeURI() function for the URL part that has forward slashes that should be seen as characters instead of http address. Eg:

"/api/activites/" + encodeURI("?categorie=assemblage&nom=Manipulation/Finition")

see http://www.w3schools.com/tags/ref_urlencode.asp

Eva M
  • the problem is with handling the URI after it is encoded to %2F - see accepted answer `Apache denies all URLs with %2F in the path part` – Jordan Oct 12 '17 at 12:56
-1

I solved this by using 2 custom functions like so:

function slash_replace($query){

    return str_replace('/','_', $query);
}

function slash_unreplace($query){

    return str_replace('_','/', $query);
}

So to encode I could call:

rawurlencode(slash_replace($param))

and to decode I could call

slash_unreplace(rawurldecode($param));

Cheers!

Excellence Ilesanmi
-3

You can use %2F if using it this way:
?param1=value1&param2=value%2Fvalue

but if you use /param1=value1/param2=value%2Fvalue it will throw an error.

MrWhite
Ahmad