3

I'm working on a PHP script to automatically decompress and scan tar.gz archives located on an external USB drive, and I've run into a couple of strange problems. Once the script locates an archive, it attempts to create a temporary directory on the USB drive to decompress the archive to. To get a unique temporary directory name (because several instances of this script may be running concurrently), I name the directory using a random 5-character substring of the MD5 hash of the current time, like so:

$temp = "/media/$driveName/".substr(md5(microtime()), rand(0, 26), 5);

Because USB drives are not mounted in the server file system, I can't use PHP's built-in file management functions (mkdir(), glob(), etc.), and instead can only interact with the file system using terminal commands executed with exec() or shell_exec(). So, to actually create the temporary directory with the above name, I use a basic terminal mkdir command:

shell_exec("mkdir $temp");

Next, I extract the archive to the temporary directory:

shell_exec("tar -xzf $archivePath -C $temp");

Finally, after I've finished analyzing the archive, I delete the temporary directory and its contents:

shell_exec("rm -rf $temp");
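
The three commands above interpolate the paths straight into the shell string. A minimal sketch of the same steps with each path quoted through escapeshellarg(), assuming a Linux shell and PHP 7 or later; the helper name buildArchiveCommands and the sample paths are hypothetical, and this quoting is not claimed to be the cause of the garbled names described below:

```php
<?php
// Hypothetical helper: builds the mkdir, tar, and rm commands from the
// steps above, quoting every path so spaces or shell metacharacters in a
// drive or archive name cannot change how the command is parsed.
function buildArchiveCommands(string $archivePath, string $tempDir): array
{
    $archive = escapeshellarg($archivePath);
    $dir = escapeshellarg($tempDir);

    return [
        'mkdir'   => "mkdir -- $dir",
        'extract' => "tar -xzf $archive -C $dir",
        'cleanup' => "rm -rf -- $dir",
    ];
}

// Sample paths are made up for illustration; note the space in the drive name.
$commands = buildArchiveCommands(
    '/media/MY DRIVE/backup.tar.gz',
    '/media/MY DRIVE/1a8b7'
);

foreach ($commands as $step => $command) {
    echo "$step: $command\n";
}
```

Each resulting string could then be passed to shell_exec() or exec() in place of the unquoted versions.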

However, I'm running into two strange problems:

  1. Occasionally, maybe one out of every five runs, the temporary directory is created with a corrupted name. When listed with ls, it looks something like 06191v??????.???v???. However, when using the tab key to automatically fill in a cd command, it fills in a much longer string, something like 06191v\342\226\200\342\225\232.\342\211\2100v\342\226\200/. I know that the string is fine immediately before passing it to shell_exec() (I've even attempted to use substr() to limit it to the correct length, but to no avail), and that the series of characters following the first backslash (so starting at 342) is always the same, regardless of what the initial five characters are (the character in between the initial five and the backslash is always a letter but is otherwise random). Other than that, however, I'm at a loss.

  2. For some reason, the shell_exec("rm -rf $temp"); I'm using to delete the temporary directory at the end of the script only works if I execute it twice. If I execute it just once, I get the following error: rm: cannot remove '/media/FILESYSTEM/f3637': Directory not empty, which is weird in and of itself because -rf should be overriding this error. This behavior can be duplicated in the server's command prompt as well.

I can't seem to find anything related to either of these problems online, so I'm hoping that it's just me making a dumb mistake and not a problem with my installation or USB drive. Thanks in advance for any help!

Edit: I actually have a third problem that is probably related to the first. After running the shell_exec("tar -xzf") command to decompress my archive, the resulting directory always has a few strange characters appended to it. For example, for one run with a compressed directory called BOOT, the decompressed directory shows up in the terminal as BOOT??w??????.???w??? and its echo output shows up as BOOT²w▀╚.∙w▀. Upon using tab to fill in the cd command into this directory, I get a string with an almost identical pattern: BOOT\302\262w\342\226\200\342\225\232 2.\342\210\231w\342\226\200/. 99% of the time, this doesn't affect the execution of the script, but occasionally it causes the script to fail to scan the directory.

NinjaBlob

2 Answers

0

For #1, it seems like some unexpected characters are getting into the directory name.

I tried this on my machine and it seemed to work:

<?php

// Create 30 randomly named directories, pausing one second after each.
for ($x = 0; $x < 30; $x++) {
    shell_exec('mkdir ' . substr(md5(microtime()), rand(0, 26), 5));
    sleep(1);
}

I moved the variable out of the quoted string and concatenated it with the '.' operator instead.

-Ken

Ken Koch
  • The issue with this is that I need to be able to save the name of the temporary directory to a variable (I could find it using terminal commands but that seems unnecessarily roundabout). I have tried it like this: `shell_exec("mkdir " . $temp);` but it didn't seem to make a difference. – NinjaBlob Jun 25 '13 at 17:42
  • Can you just echo the $temp variable out? My guess is that some non-ASCII characters are getting in there. – Ken Koch Jun 25 '13 at 17:47
  • I tried, but I still had the same problem: `ob_start(); echo $temp; $temp = ob_get_clean();` – NinjaBlob Jun 25 '13 at 17:53
  • What was the output? That's what I was looking for. Try using something like this on the name of the temp file: http://stackoverflow.com/a/8781968/1234011 EDIT: if you're on Windows, maybe this answer: http://stackoverflow.com/a/8782114/1234011 – Ken Koch Jun 25 '13 at 17:57
  • Sorry for the misunderstanding! Here's some output: `/media/FILESYSTEM/1a8b7`, `/media/FILESYSTEM/df32b` However, the output is like this even without the script you linked to; the value of $temp always shows up fine when I echo it out, even if it's not once I pass it into `shell_exec()`. – NinjaBlob Jun 25 '13 at 18:02
  • Hmm, are you on Windows? That's the only other thing I can think of. It seems to be working for me on my machine; I created more than 100 directories. – Ken Koch Jun 25 '13 at 18:07
  • The server is hosted on a custom Linux kernel; I'm accessing it through Windows, but the script itself should be executing entirely on Linux. Also, I added some information to my original post. – NinjaBlob Jun 25 '13 at 18:11
0

It turns out that all three problems had the same solution: create the temporary directory somewhere other than the USB drive. I have no idea what was causing the extra characters, but they went away once and for all when I moved the temporary directory to Linux's /tmp directory.
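
A minimal sketch of creating the temporary directory under the system temp location rather than on the USB drive, assuming PHP 7 or later and that sys_get_temp_dir() resolves to /tmp (typical on Linux). The helper name makeTempDir and the scan_ prefix are hypothetical. It uses random_bytes() for the name instead of an MD5 substring, and relies on mkdir() failing for an existing path to avoid collisions between concurrent runs:

```php
<?php
// Hypothetical helper: creates a uniquely named directory under the system
// temp directory and returns its full path.
function makeTempDir(string $prefix = 'scan_'): string
{
    $base = sys_get_temp_dir();

    for ($attempt = 0; $attempt < 10; $attempt++) {
        $path = $base . DIRECTORY_SEPARATOR . $prefix . bin2hex(random_bytes(8));

        // mkdir() returns false if the path already exists, so a successful
        // call means this process created the directory itself.
        if (@mkdir($path, 0700)) {
            return $path;
        }
    }

    throw new RuntimeException("Could not create a temporary directory in $base");
}

$temp = makeTempDir();
echo "Temporary directory: $temp\n";

// ... extract the archive into $temp and scan it here ...

// rmdir() only removes an empty directory; once files have been extracted,
// a recursive delete (for example rm -rf on the quoted path) is needed.
rmdir($temp);
```

Because the directory lives on the server's own file system, PHP's built-in functions work on it directly and only the tar extraction still needs a shell command.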

NinjaBlob