41

I have a base path /whatever/foo/

and $_GET['path'] should be relative to it.

However how do I accomplish this (reading the directory), without allowing directory traversal?

eg.

/\.\.|\.\./

Will not filter properly.

NullPoiиteя
Johnny
  • I hope this question is totally academic. Just based on the fact that you have to ask, I would say you shouldn't be allowing direct file system access based on user input. There are well-maintained frameworks available that will give you this functionality without trying to roll it yourself. Don't do it without knowing exactly what you're doing. – JSON Oct 07 '21 at 01:20

8 Answers

128

Well, one option would be to compare the real paths:

$basepath = '/foo/bar/baz/';
$realBase = realpath($basepath);

$userpath = $basepath . $_GET['path'];
$realUserPath = realpath($userpath);

if ($realUserPath === false || strpos($realUserPath, $realBase) !== 0) {
    //Directory Traversal!
} else {
    //Good path!
}

Basically, realpath() will resolve the provided path to an actual hard physical path (resolving symlinks, .., ., /, //, etc.). So if the real user path does not start with the real base path, it is trying to do a traversal. Note that the output of realpath() will not contain any "virtual directories" such as "." or "..".

ircmaxell
  • 1
    Editor: strpos is multi-byte safe already. Introducing the mb alternative may introduce other vulnerabilities... – ircmaxell Feb 10 '14 at 18:42
  • 1
    What about symlinks? Or what if the file we want to check doesn't exist yet? (i.e. creating a new file in a prospective path). – Petah Feb 27 '14 at 00:03
  • 3
    @petah symlinks will be resolved by realpath to the canonical path. For files that don't exist, I doubt it is a solvable problem, and I would advise not doing it in the first place (never allow users to specify new files directly)... – ircmaxell Feb 27 '14 at 00:05
  • @ircmaxell well, as far as symlinks go, it works fine for links one level deep, but it falls down on nested links. – Petah Feb 27 '14 at 00:21
  • 1
    Also in the sense of a user uploading files and create directories via a CMS, how would this be possible without the user specifying them? – Petah Feb 27 '14 at 00:21
  • 1
    I see one possible improvement. While this is safe, a malicious user could learn about the parent structure. For example, this would resolve to a good path: "?path=../../bar/baz/" – marcovtwout Mar 30 '15 at 20:50
  • 4
    what about new files for writing? realpath seems to return empty if the file doesn't exist. – Sufendy Apr 08 '15 at 03:46
  • Why do you use `strpos` instead of `if($realUserPath == $userPath) {..}` ? – Adam Mar 09 '17 at 12:29
  • For a Windows server there are two updates to make: 1. replace "/" with DIRECTORY_SEPARATOR; 2. directory names are case-insensitive, so compare them in lowercase (strtolower, only on Windows?). – Denis Chenu Dec 02 '22 at 10:58
18

ircmaxell's answer wasn't fully correct. I've seen that solution in several snippets, but it has a bug related to the output of realpath(). The realpath() function removes the trailing directory separator, so imagine two sibling directories whose names share a prefix, such as:

/foo/bar/baz/

/foo/bar/baz_baz/

Because realpath() removes the trailing directory separator, that method would report "good path" if $_GET['path'] were "../baz_baz", since the check would amount to something like

strpos("/foo/bar/baz_baz", "/foo/bar/baz")

Maybe:

$basepath = '/foo/bar/baz/';
$realBase = realpath($basepath);

$userpath = $basepath . $_GET['path'];
$realUserPath = realpath($userpath);

// Traversal unless the path is the base itself or lies under "base + separator"
if ($realUserPath === false || (strcmp($realUserPath, $realBase) !== 0 && strpos($realUserPath, $realBase . DIRECTORY_SEPARATOR) !== 0)) {
    //Directory Traversal!
} else {
    //Good path!
}
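
The prefix problem described above can be reproduced with a throwaway directory tree. The following is only a sketch: isInsideBase() and the temporary directory names are hypothetical and not part of any answer here, and the base directory is assumed not to be the filesystem root.

```php
<?php
// Hypothetical helper: true when $userPath, joined to $basePath, resolves to
// the base directory itself or to something inside it.
function isInsideBase(string $basePath, string $userPath): bool
{
    $realBase = realpath($basePath);
    $realUser = realpath($basePath . DIRECTORY_SEPARATOR . $userPath);
    if ($realBase === false || $realUser === false) {
        return false; // the base or the target does not exist
    }
    // Append the separator so "baz_baz" cannot pass as a prefix match of "baz".
    return $realUser === $realBase
        || strpos($realUser, $realBase . DIRECTORY_SEPARATOR) === 0;
}

// Build a temporary tree: <tmp>/baz/file.txt and a sibling <tmp>/baz_baz/
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid('trav_', true);
mkdir($root . '/baz', 0777, true);
mkdir($root . '/baz_baz', 0777, true);
touch($root . '/baz/file.txt');

var_dump(isInsideBase($root . '/baz', 'file.txt'));    // bool(true)
var_dump(isInsideBase($root . '/baz', '../baz_baz'));  // bool(false)
var_dump(isInsideBase($root . '/baz', 'missing.txt')); // bool(false), realpath() fails
```

A plain strpos($realUser, $realBase) === 0 check would accept the second call, because "<tmp>/baz_baz" begins with "<tmp>/baz".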
Cave Johnson
  • Just checking `($realUserPath === false || strcmp($realUserPath, $realBase . DIRECTORY_SEPARATOR) !== 0)` would work as well. – Goozak Jul 19 '21 at 22:12
6

It is not sufficient to check for patterns like ../ or the like. Take "../" for instance, which URI-encodes to "%2e%2e%2f". If your pattern check happens before a decode, you would miss this traversal attempt. There are other tricks attackers can use to circumvent a pattern checker, especially when using encoded strings.

I've had the most success stopping these by canonicalizing any path string to its absolute path using something like realpath(), as ircmaxell suggests. Only then do I begin checking for traversal attacks by matching the result against a base path I've predefined.
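
To illustrate the ordering problem, here is a small sketch with a deliberately naive filter. The function name looksLikeTraversal() and the sample string are made up for this example. Note that PHP already decodes $_GET values once, so this mostly matters for raw input such as $_SERVER['QUERY_STRING'] or for double-encoded values.

```php
<?php
// Hypothetical naive filter: looks for a literal "../" in the string it is given.
function looksLikeTraversal(string $raw): bool
{
    return strpos($raw, '../') !== false;
}

$encoded = '%2e%2e%2fsecret.txt';  // "../secret.txt", percent-encoded
$decoded = urldecode($encoded);

var_dump(looksLikeTraversal($encoded)); // bool(false): the filter sees no "../"
var_dump($decoded === '../secret.txt'); // bool(true)
var_dump(looksLikeTraversal($decoded)); // bool(true): only caught after decoding
```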

Cowlby
1

You may be tempted to try and use regex to remove all ../s but there are some nice functions built into PHP that will do a much better job:

// Assumes the user input arrives as $_GET['path'], as in the question
$page = basename(realpath($_GET['path']));

basename - strips out all directory information from the path e.g. ../pages/about.php would become about.php

realpath - returns a full path to the file e.g. about.php would become /home/www/pages/about.php, but only if the file exists.

Combined, they return just the file's name, but only if the file exists.
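
As a rough illustration of how the two functions behave together, here is a sketch with a hypothetical wrapper, safePageName(), which is not part of the answer. It makes the "file does not exist" case explicit instead of relying on basename() receiving false. The returned name still has to be joined to a directory you control before use.

```php
<?php
// Hypothetical wrapper: returns the bare file name of an existing path,
// or an empty string when realpath() cannot resolve it.
function safePageName(string $requested): string
{
    $real = realpath($requested);
    return $real === false ? '' : basename($real);
}

// basename() on its own just drops the directory part of the string.
var_dump(basename('../pages/about.php'));          // string(9) "about.php"

// A path that does not exist yields an empty name.
var_dump(safePageName('/no/such/dir/about.php'));  // string(0) ""

// An existing file yields its own name, whatever directories led to it.
$tmp = tempnam(sys_get_temp_dir(), 'pg');
var_dump(safePageName($tmp) === basename($tmp));   // bool(true)
unlink($tmp);
```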

stollr
L4m0r
0

When looking into the creation of new files or folders, I figured I could use a staged approach:

First, check for traversal attempts using a custom implementation of a realpath()-like function that works for arbitrary paths, not just existing files. There's a good starting point here. Extend it with urldecode() and whatever else you think may be worth checking.

This crude method filters out some traversal attempts, but it may still miss some hackish combination of special characters, symlinks, escape sequences, etc. However, since you know for sure that the target file does not exist (check using file_exists()), no one can overwrite anything. In the worst case, someone gets your code to create a file or folder somewhere, which may be an acceptable risk in most cases, provided your code does not let them write into that file or folder straight away.

Finally, once the file or folder has been created, the path points to an existing location, so you can now do the proper check with realpath() using the methods suggested above. If it turns out at this point that a traversal has happened, you are still more or less safe, as long as you prevent any attempt to write into the target path. You can also delete the target file or directory right away and report a traversal attempt.

I'm not saying it cannot be hacked, since it may still allow illegitimate changes to the filesystem. But it is still better than relying only on custom checks that cannot use realpath(): the window for abuse opened by a temporary, empty file or folder is smaller than the one opened by letting an attacker create it permanently and even write into it, which is what would happen with a custom check that misses some edge case.
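
For the first stage, a purely string-based normalizer could look roughly like the sketch below. This is not the implementation the answer links to; normalizeRelative() is a hypothetical name, and because it never touches the filesystem it cannot detect symlinks, which is why the later realpath() check is still needed.

```php
<?php
// Hypothetical helper: resolve "." and ".." purely as strings.
// Returns null if the path would climb above its starting directory.
function normalizeRelative(string $path): ?string
{
    $parts = [];
    foreach (explode('/', str_replace('\\', '/', $path)) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;            // skip empty and current-directory segments
        }
        if ($segment === '..') {
            if (empty($parts)) {
                return null;     // would escape the starting directory
            }
            array_pop($parts);   // step back one directory
            continue;
        }
        $parts[] = $segment;
    }
    return implode('/', $parts);
}

var_dump(normalizeRelative('uploads/./new/../a.txt')); // string(13) "uploads/a.txt"
var_dump(normalizeRelative('../etc/passwd'));          // NULL
var_dump(normalizeRelative('a/../../b'));              // NULL
```

The normalized result would then be appended to the base path, and the full path re-checked with realpath() once it exists.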

Also, please correct me if I'm wrong!

Bence Szalai
0

I have written a function to check for traversal:

function isTraversal($basePath, $fileName)
{
    if (strpos(urldecode($fileName), '..') !== false)
        return true;
    $realBase = realpath($basePath);
    $userPath = $basePath.$fileName;
    $realUserPath = realpath($userPath);
    while ($realUserPath === false)
    {
        $userPath = dirname($userPath);
        $realUserPath = realpath($userPath);
    }
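    // Compare against "base + separator" so a sibling such as "baz_baz" does not match "baz"
    return $realUserPath !== $realBase
        && strpos($realUserPath, $realBase.DIRECTORY_SEPARATOR) !== 0;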
}

This line alone, if (strpos(urldecode($fileName), '..') !== false), should be enough to prevent traversal. However, there are many different ways attackers can traverse directories, so it's better to also make sure the resolved user path starts with the real base path.

Just checking that the user path starts with the real base path is not enough, because an attacker could traverse back into the current directory and discover the directory structure.

The while loop allows the code to work when $fileName does not exist.

Dan Bray
0

In my version, I replaced $_GET['file'] with $_SERVER['REQUEST_URI'] to get the requested URI from the server variables. I then used parse_url() with the PHP_URL_PATH constant to extract the path component from the URI, excluding any query parameters.

The rest of the script performs path normalization, validating the file path against the base directory, checking file existence, and serving the file to the user.

By using $_SERVER['REQUEST_URI'], you can handle the file path extraction from the URL even when the file parameter is not explicitly set as a query parameter.

$baseDirectory = '/path/to/file/'; // Define the base directory where the file(s) is/are located

if ($_SERVER['HTTP_HOST'] == 'localhost') { 
    $baseDirectory = 'C:\xampp\htdocs'; // For localhost use
}

// Check if the 'REQUEST_URI' is set in the server variables
if (!isset($_SERVER['REQUEST_URI'])) {
    $this->logger->logEvent('REQUEST_URI NOT SET: '.__LINE__.' '.__FILE__);
    die('Invalid file request');
}

// Get the requested URI from the server variables
$requestedUri = $_SERVER['REQUEST_URI'];

// Extract the file path from the requested URI
$requestedFile = parse_url($requestedUri, PHP_URL_PATH);

// Normalize the file path to remove any relative components
$requestedFile = realpath($baseDirectory . $requestedFile);

// Check if the normalized file path starts with the base directory
if (strpos($requestedFile, $baseDirectory) !== 0) {
    $this->logger->logEvent('Directory Traversal: '.$requestedUri);
    die('Invalid file path');
}

// Check if the requested file exists
if (!file_exists($requestedFile)) {
    $this->logger->logEvent($requestedFile. ' not found: '.__LINE__.' '.__FILE__);
    die('File not found');
}
// Serve the file to the user
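
As a small illustration of the parse_url() step, the sketch below uses a hypothetical wrapper, requestPath(), and made-up URIs. One thing it shows is that parse_url() does not percent-decode the path, so an encoded ".." reaches realpath() as a literal directory name unless you decode it yourself.

```php
<?php
// Hypothetical wrapper: the path component of a request URI, or null if
// parse_url() cannot extract one.
function requestPath(string $requestUri): ?string
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) ? $path : null;
}

// The query string is dropped, so "../" inside a parameter never reaches the path.
var_dump(requestPath('/files/report.pdf?download=1&x=../../etc'));
// string(17) "/files/report.pdf"

// Percent-encoded dots are returned as-is, not decoded.
var_dump(requestPath('/files/%2e%2e/secret.txt'));
// string(24) "/files/%2e%2e/secret.txt"
```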
-3

1. Put an empty index.htm in the directory to block directory listing.

2. Filter the query string at the start of the script:

// Path Traversal Attack
if (strpos($_SERVER["QUERY_STRING"], "../") !== false) {
    exit("P.T.A. B-(");
}
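
One detail worth illustrating for checks of this kind: strpos() returns the integer 0 when the match is at the very start of the string, and 0 is falsy in a bare if, so comparing the result with !== false is the reliable form. A short sketch with a made-up query string:

```php
<?php
$query = '../etc/passwd';  // made-up input that starts with "../"

var_dump(strpos($query, '../'));            // int(0): found at position 0
var_dump((bool) strpos($query, '../'));     // bool(false): a bare if() would skip it
var_dump(strpos($query, '../') !== false);  // bool(true): the strict comparison catches it
```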
Community
Lo Vega