We have ~20,000 folders stored on our LAN in a master folder I'll call "folder". These are subdivided into groups of 1,000, in folders named 1000-1999, 2000-2999, etc. If a folder exists, say 5416, I want to return a link to it.

My question is:

Originally I was looping through the items in "folder". If an item was a directory, I exploded its name to obtain the lower and upper bounds, checked whether the folder I was searching for (5416) fell within those bounds, and if so checked whether 5416 existed inside it.

I'm not having any performance issues, but it occurred to me that exploding the strings and doing the comparisons might be more computationally expensive than just building the path and using PHP's file_exists.

Thoughts?

user294382
  • If I was in your position I'd throw together a quick benchmark and see what the results were. My gut tells me the difference will be rather minimal. – Mike Nov 01 '13 at 16:38
  • First do it, then do it right, then do it better. But it has a cost : time. – qdelettre Nov 01 '13 at 16:39
  • If the folders are all consistent with the naming, it might be faster to just replace the `416` in `5416` with `000` (simple to do by converting to an integer and subtracting `5416 % 1000`), convert back to string and append `-5999` and then just compare the name of each folder to that value. Something along the lines of `$folderToFind = '5416'; $tmp = intval($folderToFind); $tmp = $tmp - ($tmp % 1000); $bounds = ''.$tmp.'-'.($tmp + 999);` – jonhopkins Nov 01 '13 at 16:41
  • You could time which of the options is faster by using one of the methods mentioned in this question: http://stackoverflow.com/questions/21133/simplest-way-to-profile-a-php-script – Joren Nov 01 '13 at 16:42
  • Thanks for the comments/answers. Mike, I should run a benchmark, though I'm new at this and suspect my benchmark itself will contaminate the result. Timestamp before/after a worst-case search (folder = 20000+)? jonh - I see what you're saying, but I don't think what you're suggesting would improve things measurably unless everything was in RAM. – user294382 Nov 01 '13 at 21:27
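
The arithmetic jonhopkins describes can also be used to skip the directory scan entirely: compute the range folder's name from the number, build the full path, and check it once. A minimal sketch, assuming the range folders are always named `lower-upper` in steps of 1000; the function names `rangeFolderFor` and `findFolder` and the `$base` parameter are hypothetical, not from the thread:

```php
<?php
// Hypothetical helpers: the names and the base path are assumptions for illustration.

/**
 * Returns the name of the range folder that would contain $id,
 * e.g. 5416 gives "5000-5999" when $size is 1000.
 */
function rangeFolderFor(int $id, int $size = 1000): string
{
    $lower = $id - ($id % $size);          // round down to the start of the range
    return $lower . '-' . ($lower + $size - 1);
}

/**
 * Returns the full path to the folder for $id under $base,
 * or null if that folder does not exist.
 */
function findFolder(string $base, int $id): ?string
{
    $path = $base . DIRECTORY_SEPARATOR
          . rangeFolderFor($id) . DIRECTORY_SEPARATOR
          . $id;

    // is_dir() is used rather than file_exists() so that a plain file
    // with the same name is not mistaken for the folder.
    return is_dir($path) ? $path : null;
}
```

With this approach there is one filesystem check per lookup instead of one per range folder, which is probably what matters most when each check goes across the LAN.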

1 Answer

file_exists() should be a very inexpensive operation. Note too that PHP caches the result of file_exists() in its stat cache, which helps with repeated checks of the same path; clearstatcache() resets that cache.

See: http://php.net/manual/en/function.file-exists.php

From: file_exists() is too slow in PHP. Can anyone suggest a faster alternative?
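
One way to see how much the cache matters on a given setup is to time the call with and without clearing it. This is a rough sketch, not a rigorous benchmark: `timeCalls` is a hypothetical helper, `$path` is an assumed location that should be replaced with a real folder, and the uncached figure also includes the cost of calling clearstatcache() itself. The OS or network file system may do its own caching on top of PHP's.

```php
<?php
// Sketch only: $path is a hypothetical location, not one from the thread.

/**
 * Average wall-clock seconds per call of $fn over $iterations runs.
 */
function timeCalls(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

$path = '/mnt/folder/5000-5999/5416'; // assumed path, substitute your own

// Repeated checks of the same path, leaving PHP's stat cache alone.
$cached = timeCalls(function () use ($path) {
    return file_exists($path);
});

// Clear the stat cache before every check to force a fresh lookup.
$uncached = timeCalls(function () use ($path) {
    clearstatcache();
    return file_exists($path);
});

printf("cached: %.8f s/call, uncached: %.8f s/call\n", $cached, $uncached);
```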

CJ Ratliff
  • CJ - I think the biggest computational cost will come from the number of calls across my LAN, but I should test. I'm sure you're right in the strictest sense, though: the folks who wrote the standard functions surely made them more efficient than anything I could write, hence my questions. – user294382 Nov 01 '13 at 21:29