I have a PHP script that downloads files from a folder off the document root. Here it is:

$getdir = $_GET['dir'];
$getdoctype = $_GET['doctype'];
$getfile = $_GET['filename'];
if ( !preg_match('/^[a-zA-Z]+[a-zA-Z0-9\s\_\-]+$/', urldecode($getdir)) ||
     !preg_match('/^[a-zA-Z]+[a-zA-Z0-9\s\_\-]+$/', urldecode($getdoctype))) {
    die('Bad parameter!');
}
$dir = "/var/www/uploads/$getdir/$getdoctype/";

$type = mime_content_type( $dir . $getfile );
if (file_exists($dir . $getfile)) {
    header('Content-Type: ' . $type);
    header('Content-Disposition: attachment;filename=' . $getfile);
    readfile($dir . $getfile);
} else {
    echo "File Not Found";
}

The problem is that files uploaded to my website often have names containing characters such as + # % ( ). Many of these characters are fine locally, but on the web they are interpreted as something else. Using my existing script, how can I properly escape these characters so that the download works?

fixnode
  • possible duplicate of [How to encode the filename parameter of Content-Disposition header in HTTP?](http://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http) – DCoder Mar 11 '14 at 08:20

1 Answer

You can get around some special characters in file downloads by wrapping the filename in quotes:

header('Content-Disposition: attachment;filename="' . $getfile . '"');
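Quoting alone still breaks if the filename itself contains a double quote or a backslash, and control characters such as CR/LF should never reach a header. A minimal sketch of a safer quoted version follows; `quotedDisposition` is a hypothetical helper name, not part of the original script:

```php
<?php
// Hypothetical helper (not from the original script): builds the value of a
// Content-Disposition header with the filename as a quoted-string.
function quotedDisposition(string $filename): string
{
    // Drop control characters (including CR/LF) to avoid header injection.
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $filename);
    // Backslash-escape double quotes and backslashes inside the quotes.
    $clean = addcslashes($clean, "\"\\");
    return 'attachment; filename="' . $clean . '"';
}

// Assumed usage inside the download script:
// header('Content-Disposition: ' . quotedDisposition($getfile));
```

Characters like `#`, `+`, `%` and parentheses pass through unchanged here, since they are allowed inside a quoted string.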

Alternatively, you could use a regular expression to remove the special characters, although this is not the best option.

Something like this would do it: Regular Expression for alphanumeric and underscores

To do this with a regular expression, you would use:

$filteredName = preg_replace('/[^a-z0-9\.]/i', '', $filename);
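If stripping characters loses too much of the original name, another option is to send both a sanitized ASCII fallback and the full name in the `filename*` parameter defined by RFC 6266. The following is only a sketch under that assumption; `dispositionHeader` is a hypothetical helper, and the name is assumed to be UTF-8:

```php
<?php
// Hypothetical helper: builds a Content-Disposition header line that carries
// an ASCII fallback name plus the percent-encoded original name.
function dispositionHeader(string $filename): string
{
    // Fallback for older clients: replace anything outside a conservative
    // ASCII set with an underscore instead of deleting it.
    $fallback = preg_replace('/[^A-Za-z0-9._-]/', '_', $filename);
    // rawurlencode() percent-encodes per RFC 3986 (space becomes %20).
    $encoded = rawurlencode($filename);
    return 'Content-Disposition: attachment; filename="' . $fallback . '"'
         . "; filename*=UTF-8''" . $encoded;
}

// Assumed usage inside the download script:
// header(dispositionHeader($getfile));
```

Clients that understand `filename*` should save the file under its original name; others fall back to the underscore version.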
mic
  • The regex you gave replaces all characters and digits and does not work. The characters that most often show up are % # + and underscores; the rest of the characters are fine in file names. – fixnode Mar 10 '14 at 17:49
  • Is there any way to escape the characters using HTML encoding? Replacing the characters using a regex is complicated because my code uses a multi-file upload in PHP and JSON. – fixnode Mar 10 '14 at 18:32
  • Sorry, I've been away from my computer since yesterday. You can use `urlencode($filename);` but this only changes the special chars into a percent-encoded, `%20`-style format (strictly, `urlencode` encodes a space as `+`; `rawurlencode` produces `%20`). – mic Mar 11 '14 at 08:07
  • So I would have to apply the regex before the file is uploaded? How can I remove the special characters from existing files using the regex? – fixnode Mar 12 '14 at 03:26
  • You could do it on either upload or download. Doing it only on download preserves the original filename, but the filter has to run every time the file is downloaded (more processing on a very busy site, but still not that bad). If the file is going to be downloaded a lot, you could do it on upload instead: the stored filenames might differ from the originals, but they will not need filtering each time. – mic Mar 12 '14 at 08:21
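The percent-encoding mentioned in the comments matters most where the download link is generated: an unencoded `#` starts a URL fragment, and `+` in a query string is decoded as a space. A minimal sketch of building the link follows; `download.php` and `downloadUrl` are assumed names, not taken from the question:

```php
<?php
// Hypothetical helper: builds the query string for the download script.
// 'download.php' is an assumed script name.
function downloadUrl(string $dir, string $doctype, string $filename): string
{
    // PHP_QUERY_RFC3986 encodes values like rawurlencode() (space => %20).
    return 'download.php?' . http_build_query(
        ['dir' => $dir, 'doctype' => $doctype, 'filename' => $filename],
        '',
        '&',
        PHP_QUERY_RFC3986
    );
}

// Assumed usage in a template; htmlspecialchars() protects the HTML attribute:
// echo '<a href="' . htmlspecialchars(downloadUrl($d, $t, $f)) . '">Download</a>';
```

PHP already decodes the values in `$_GET`, so on the receiving side `$_GET['filename']` should contain the original name without a further `urldecode()` call.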