3

Thanks for all your help - see blow original question, and my edit following the two line rules (I cannot yet answer my own questions as a new user..).

I've looked around and question (here) almost meets the aims of my question:

For example, I'm creating a user data directory for each user on a web application; the folder of-course must be unique but also abstract for security (using their user id for example, would not be appropriate.

So far I've created the following function; it generates a unique folder name, checks to make sure it doesn't already exist and assigns it to a variable. It then loops back if the dir already exists:

function generate_unique_userDirectory(){
    $userDirectory = md5(uniqid($uid)); //Generate a unique folder name.
    if (is_dir($userDirectory)) {
        return FALSE; //If the dir exists, report so
    } else {
        return $userDirectory; //Return unique foldername
    }
}

While loop is used to keep going until an unused folder name is found.

while (!$userDirectory = generate_unique_userDirectory()) {
    echo 'folder exists...loop back try another';
    //Try another:
    $userDirectory = generate_unique_userDirectory();
}

Is there a better way of doing this, my main concern is whether I'm over-complicating the procedure?

Many thanks for your time.


My findings thanks to all that contributed!

Thanks @Veger, and everyone else; your assistance was brilliant; I've since re-worked the function based on your advice:

function generate_unique_userDirectory($uid){

        $userDirectory = md5(uniqid($uid));             

            while (is_dir(BASE_URI . "$userDirectory")){

                $userDirectory = md5(uniqid($uid));
            }

            return $userDirectory;

        }//End of generate_unique_userDirectory funcation decleration.

            //Example call to function:
        $userDirectory = generate_unique_userDirectory($uid);

            echo "The generated user directory is: $userDirectory";

As suggested, I've put the folder name generation while loop within the function which now makes the function call much simpler.

In response to your second bullet point @Veger, it is my understanding that as I've fed the 'uniqid' function to the md5 function this will result in a new string each time (though I may have misunderstood).

The purpose of passing the $uid to generate_unique_userDirectory() is to further 'salt' the generated string, however, I may be taking it a step too far!

Many thanks to all- a great first time on stackoverflow...

Community
  • 1
  • 1
user885983
  • 458
  • 4
  • 9
  • If you generate an unique folder name, it shouldn't exist. Else your generation is wrong? I assume you use something like the auto_increment from an database or something? – Wesley van Opdorp Aug 09 '11 at 13:27
  • Offtopic: Folders are generally not scalable. For example, ext3 has a limit of 31998 sub-directories per one directory. – Sukumar Aug 09 '11 at 13:33
  • @Sukumar: Do you know of any good strategies that would fit this context (user directories)? – James P. Aug 09 '11 at 13:43
  • http://stackoverflow.com/questions/191845/how-to-store-images-in-your-filesystem might contain some useful info – Jacco Aug 09 '11 at 13:52
  • 1
    @James, generally multiple hierarchies is enough for most people (though it is a nightmare to backup). For very large scale, see http://stackoverflow.com/questions/564223/amazon-s3-architecture – Sukumar Aug 09 '11 at 14:11

5 Answers5

2

The sensible solution would be to use the primary key or some other unique value. Since that is not an option using uniqid (collisions should be inexistant in practise) would make sense but you could also generate a hash from unique keys for a given user and perhaps use that for a lookup mecanism.

$userDir = "/users/" . md5( $user->id . $user->username . $user->email );

If you want to be extra careful, make sure you don't expose the used data anywhere so the path can't be reconstituted. You can also add a level of security by using url rewriting to redirect requests with certain arguments to a PHP script. That way you can do things like check sessions, IPs and so on.

James P.
  • 19,313
  • 27
  • 97
  • 155
  • Why do you need `md5()`? `$user->id` results in an unique user id already, right? – Veger Aug 09 '11 at 14:00
  • Sure, this is what is used on a site I'm working on. The OP says this is a requirement though: "the folder of-course must be unique but also abstract for security (using their user id for example, would not be appropriate." – James P. Aug 09 '11 at 14:05
  • Oh... did not see that, must have had something in my eye! – Veger Aug 09 '11 at 14:09
  • 1
    Careful reading is a reflex you get when reading requirements :) – James P. Aug 09 '11 at 15:06
1

Some of my thoughts on your design:

  • As Wesley already commented generate_unique_userDirectory() suggests that the returned directory is unique. So you should not need to check after calling whether its result is a unique directory or not...
  • Calling the function a second time results in the same failure, as $uid is not changed for the new user.
  • If it does change, why not put your while loop inside the function? So it always returns a unique directory.
  • By calling the function in the while-statement and in the loop itself, results in it being called twice after failure. First call in the loop which is overridden by the call in the statement.
Veger
  • 37,240
  • 11
  • 105
  • 116
  • uniq_id shouldn't lead to collisions. I'd be curious to know in what conditions that would hold in practice. Btw, I think using uniqid(null,true) generates 23 chars which is more complex. – James P. Aug 09 '11 at 13:55
  • @Jamers: Heh we had the same thought at the same time it seems... :) – Veger Aug 09 '11 at 13:57
  • Removed my last thought again, as it was stated as a requirement (using `md5()` or something similar) for security reasons. – Veger Aug 09 '11 at 14:11
1

My suggestions:

To achieve unique folders:

Just create folders with md5 of the user_id.

To achieve security:

Place the folders offline, not on the web path and create a class or script to access each of the file resources.

Scalability:

By having a class or script to access to users files you can later use it as bridge to switch to another storage mechanism like Amazon S3 or similar cloud services.

Orlando
  • 368
  • 2
  • 9
  • You have to careful about the user_id not being exposed anywhere. If it is it would be a cinch to reconstitute the path using md5. Placing files outside the www root is a good idea and so is having a middle-man approach. – James P. Aug 09 '11 at 13:52
  • 1
    Yes. It's a good point. The user folder generation could be enhanced and secured by using a different approach like a private key or similar. – Orlando Aug 09 '11 at 14:00
  • Thanks for this, I agree though admit I'm unsure of how to go about this (other than posting header content type etc?) but that's for another question! :) – user885983 Aug 09 '11 at 15:14
  • I would think you are referring to the wrapper to access and save resources. If that's the case, I was referring to having a Storage class, with a get, save, remove, etc. methods, in order to save files and get them. If you need an image for example, create a page that uses this get/save methods, at this page you can set Content headers and send the file content. – Orlando Aug 09 '11 at 19:08
0

imho, it's normal aproach. there will be mostly one iteration since md5 has too low possiblity of collision. one think i don't like in your code:

if (is_dir($userDirectory)) {
    return FALSE; //If the dir exists, report so
} else {
    return $userDirectory; //Return unique foldername
}

there might exists regular file with the same name. it's better to check with file_exists()

heximal
  • 10,327
  • 5
  • 46
  • 69
0
   function generate_unique_userDirectory($uid){
        $userDirectory = md5(uniqid($uid.time())); //Generate a unique folder name.
        if (is_dir($userDirectory)) {
            return generate_unique_userDirectory($uid); //If the dir exists, report so
        } else {
            return $userDirectory; //Return unique foldername
        }
    }

Why not make your function recursive? If the generated directory is already in use, the script will keep on trying untill it generates an unique directory. This should be the result of the function, hence the name ;-)

ChrisH
  • 1,283
  • 1
  • 9
  • 22
  • Recursive? That is a waste of your resources/stack! Better use a `while`-loop which exists after a unique directory is found. – Veger Aug 09 '11 at 13:53