
Problem Background

I have a multimedia web application where users can save and delete files. The saved files are used for various application-wide features.

When files are saved, they are placed deep inside a directory tree (usually 6 to 9 folders deep). The files are nested this deeply so that an admin/superuser looking at the file structure can easily tell them apart, and to match the existing manual system.

For example,
"UploadedAssets" is the root directory and can contain the following folder structure:
\ith\sp\cookery\ckpe\interactives\timestep\blahblah\item1\image01.png
\ith\sp\cookery\ckpe\interactives\timestep\blah_only\item1\image02.png
\ith\sp\cookery\ckpe\interactives\timestep\blah_only\item2\image03.png
\ith\sp\cookery\abcd\interactives\timestep\blahblah\item1\image66.png

  1. The first 5 names are user-selectable (from dynamic dropdown menus), from approx. 20-25 options each.
  2. The last 3 depend on whatever the user inputs (e.g. category, title, etc.), received through text input fields. There can be 1 to 4 of these directories.
  3. The last directory ("item1") can have one or more files but no subdirectories.
  4. Each directory EXCEPT the last one (item1) can have other subdirectories.
  5. When I remove/delete a file, the program obviously knows the full path of the file's location.

Because of the number of possible directory name options, even in the testing phase, the root "UploadedAssets" directory has exploded, and there are plenty of empty, unused directories and dead branches.


My question is

Once the user deletes one or more files from a directory (e.g. image01.png from item1),

  1. How can I traverse up the tree from the deleted file (visiting only its direct ancestors) and delete each parent directory if it is empty?
  2. If one of the directories has other children/files, it should not be deleted, and the process should end there.
  3. The process should stop at a pre-defined root directory, OR
  4. Stop after going UP n directory levels.

E.g. in the example directory structure above:
if the user deletes image01.png, it should delete the item1 and blahblah directories;
if the user deletes image02.png, it should delete only its parent item1 directory;
if the user deletes image66.png, it should delete all parent directories up to and including abcd.


My Attempts / research

I know how to remove a single directory using PHP's rmdir, but I couldn't work out how to use it recursively to solve my problem.

I have tried to get my head around the following, but I don't know if any of them fit my problem:

PHP: Unlink All Files Within A Directory, and then Deleting That Directory

Delete files then directory

http://php.net/manual/en/function.glob.php

https://stackoverflow.com/a/4490706/2337895

Nis
  • Use a combination of http://www.php.net/scandir, http://www.php.net/is_dir and http://www.php.net/basename – Scuzzy Mar 20 '14 at 04:53
  • @Scuzzy I have used http://www.php.net/basename to get the absolute path to the file. After removing the file, I can check the last component of the string and check whether it is a directory. THEN I can use this http://au1.php.net/manual/en/function.scandir.php#95913 function to see if there are any files; if not, delete the dir and move up. But then I'll have to keep going up one level manually. How can I create a recursive function for this? – Nis Mar 20 '14 at 05:07

1 Answer


Because this has some overhead, I'm going to suggest that you only do this from PHP if absolutely needed. You can use the Linux find command for this:

find /path/to/dir -empty -type d -delete

I would put that in a cron job. If you like, you can run it from PHP with the system function instead:

system('find /path/to/dir -empty -type d -delete', $retval);
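As a rough sketch of that cleanup (assuming GNU find, since -mindepth, -empty and -delete are extensions rather than POSIX), the command can be wrapped in a small shell function. The function name prune_empty_dirs and the --dry-run switch are made up for illustration. Adding -mindepth 1 keeps find from removing the starting directory itself if it ends up empty:

```shell
# prune_empty_dirs ROOT [--dry-run]   (hypothetical helper, not part of find)
# Removes every empty directory below ROOT, but never ROOT itself.
prune_empty_dirs() {
    root=$1
    mode=${2:-delete}

    # Refuse to run on something that is not a directory.
    [ -d "$root" ] || { echo "not a directory: $root" >&2; return 1; }

    if [ "$mode" = "--dry-run" ]; then
        # Lists only directories that are empty right now. The real run may
        # remove more, because parents become empty once their children go.
        find "$root" -mindepth 1 -depth -type d -empty -print
    else
        # -delete implies -depth, so children are handled before parents and
        # a whole dead branch disappears in a single pass.
        find "$root" -mindepth 1 -type d -empty -delete
    fi
}
```

A call would look like `prune_empty_dirs /path/to/UploadedAssets`, where the path is a placeholder. For a once-a-day run, a crontab entry along the lines of `0 0 * * * find /path/to/UploadedAssets -mindepth 1 -type d -empty -delete` would fire at midnight.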

With this approach you simply delete the file, then let this run once a day or so to go through and take care of any empty directories. This may seem more hackish than doing it all in PHP, but it'll run much faster. It is less portable, but that shouldn't matter too much for most sites.

UPDATE:

To do the same thing in Windows, use a batch file with just this one line:

for /f "delims=" %%d in ('dir /s /b /ad ^| sort /r') do rd "%%d"

Save this as rdel.bat (I always make sure the Show hidden file extensions Explorer folder option is turned on, but if you don't, use the drop-down in your text editor's Save As... dialog to ensure the file has the proper extension).

Test this by changing to a directory and running your new file. It should remove any empty directories below the current one.

To schedule this just add an entry to your task scheduler. That is a little different depending on which version of Windows you use. Be sure to set the working directory correctly (this is how the batch file knows where to start).

The problem with this is that in Windows a lot of junk files get created in otherwise empty directories (e.g. Thumbs.db). You can handle that by adding code to your batch file to remove those files too:

del /s /q Thumbs.db

Add this above the other line and repeat for anything else which may be unneeded.
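If the cleanup has to happen right after each delete rather than in a daily sweep, the walk up the tree described in the question could look roughly like the sketch below. The name prune_upwards and its three arguments are hypothetical, and it assumes that both paths are absolute and normalized (no ..) and that the start directory lies inside the root. It relies on rmdir refusing to remove a non-empty directory, so the loop ends at the first ancestor that still has children, at the root, or after the given number of levels:

```shell
# prune_upwards START_DIR ROOT [MAX_LEVELS]   (hypothetical helper)
# Removes START_DIR and then each parent in turn while they are empty.
# Assumes absolute, normalized paths with START_DIR somewhere below ROOT.
prune_upwards() {
    dir=${1%/}          # directory the deleted file lived in
    root=${2%/}         # pre-defined root, never removed
    max=${3:-0}         # 0 means no limit on levels
    levels=0

    while :; do
        # Stop at the root, or if we somehow ended up outside it.
        case "$dir" in
            "$root"/*) ;;
            *) break ;;
        esac

        # Stop after MAX_LEVELS removals when a limit was given.
        if [ "$max" -gt 0 ] && [ "$levels" -ge "$max" ]; then
            break
        fi

        # rmdir fails on a non-empty directory, which ends the walk.
        rmdir "$dir" 2>/dev/null || break

        levels=$((levels + 1))
        dir=$(dirname "$dir")
    done
}
```

For the example tree, after deleting image01.png, `prune_upwards /path/to/UploadedAssets/ith/sp/cookery/ckpe/interactives/timestep/blahblah/item1 /path/to/UploadedAssets` would remove item1 and blahblah and stop at timestep, which still contains blah_only. The /path/to prefix is a placeholder.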

krowe
  • Can you please explain how deleting a few directories once a file is deleted causes major overhead? I can even keep that process running in the background without the user noticing it. – Nis Mar 20 '14 at 05:10
  • The overhead comes into play because of the multiple directory listings you'll need to perform using PHP. PHP is much slower than the compiled C that something like `find` is written in. In any language a recursive function is going to grow the stack, which is also fairly slow. Depending on your site this may not add up to much overhead, but this is code which has already been written and works perfectly fine, so I'd just use it. You also don't need to maintain this code or load it with each page load, which is also a plus. – krowe Mar 20 '14 at 05:16
  • I almost forgot the biggest speed benefit, which is that by using this method you ignore the empty directories all day, then just once at the end of the day you go through and clean them up. IOW, this takes the cleanup cost out of your page requests entirely. – krowe Mar 20 '14 at 05:22
  • Hmmm... Makes sense. How can I make sure that the system command is only run once a day at a specific time (e.g. midnight)? **More importantly**, where do I put this system command in my PHP file? If I am not wrong, if no one accesses any page (e.g. at midnight), then PHP won't start and execute by itself. – Nis Mar 20 '14 at 05:23
  • I'm going to give you a link, it may look complicated but trust me it isn't and this is something that you'll use all of the time once you get the hang of scheduling in Linux: http://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/ – krowe Mar 20 '14 at 05:24
  • Thanks. It helped. Anything similar for Windows Server? – Nis Mar 23 '14 at 23:43
  • Thank you for the Windows version! – Nis Mar 25 '14 at 03:46