I am creating files and setting their names to a hashed representation of time() using the md5 function:

$encoded_data = ['some_data'];
$file_name = md5(time()).'.json';
$path = base_path("../some_folder/");
file_put_contents($path.$file_name, $encoded_data); 

What I do not understand is this: if I use scandir with the sorting order parameter to get those files:

foreach(array_diff(scandir($path, 1), ['.', '..']) as $file_name) {
    $files[] = base_path('../some_folder/').$file_name;
}

will the $files array really be sorted by the date and time used to build each file name?

nikname
  • 1. Don't use `1`, use the constant `SCANDIR_SORT_DESCENDING`. 2. Have you tried? Since your `time` ends up as effectively random chars from `md5`, the list will not be sorted by time, just by the actual file name – Justinas Jun 02 '20 at 14:15
  • I have tried but you can't really know since I can't decrypt md5 hash to check – nikname Jun 02 '20 at 14:43
  • After `$file_name = md5(time()).'.json';` put code `file_put_contents(__DIR__ . '/times.txt', $file_name . ' => ' . time() . PHP_EOL, FILE_APPEND)`. That way you will keep map of hash<->time – Justinas Jun 02 '20 at 15:10
  • That could be a solution, post it as answer if you want – nikname Jun 02 '20 at 15:33
  • That is in no way a solution to your issue (`order by hash decoded value`) – Justinas Jun 02 '20 at 15:37
  • Well, that's because the way I wanted to do this is probably not possible. I'll just leave the question up then, in case someone has a similar problem and can read your comment – nikname Jun 02 '20 at 15:57

1 Answer

Since hashing functions like md5 are one-way only, the filename is useless as a sorting criterion. If you want to keep track of the very same timestamp you used for generating the md5 value, you'd have to keep a hash:timestamp table on record. If you did that, you wouldn't need to run scandir to begin with; you could simply read the file list from the reference table you've saved. (This assumes you keep the table up to date when files are deleted. Otherwise, it would list files that no longer exist.)
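
A minimal sketch of such a reference table, assuming a JSON index stored next to the data files. The function names `save_with_index` and `list_newest_first` and the index file name `index.json` are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical helper: writes the data file and records hash-name => timestamp
// in an index file (index.json is an assumed name, not from the question).
function save_with_index(string $dir, string $data, int $timestamp): string
{
    $fileName = md5((string) $timestamp) . '.json';
    file_put_contents($dir . '/' . $fileName, $data);

    $indexPath = $dir . '/index.json';
    $index = is_file($indexPath)
        ? (json_decode(file_get_contents($indexPath), true) ?? [])
        : [];
    $index[$fileName] = $timestamp;
    file_put_contents($indexPath, json_encode($index), LOCK_EX);

    return $fileName;
}

// Hypothetical helper: returns the hashed file names, newest first,
// without calling scandir at all.
function list_newest_first(string $dir): array
{
    $indexPath = $dir . '/index.json';
    if (!is_file($indexPath)) {
        return [];
    }
    $index = json_decode(file_get_contents($indexPath), true) ?? [];

    // Skip entries whose file has been deleted since the index was written.
    $index = array_filter($index, function ($name) use ($dir) {
        return is_file($dir . '/' . $name);
    }, ARRAY_FILTER_USE_KEY);

    arsort($index); // sort by timestamp, descending, keeping the name keys
    return array_keys($index);
}
```

Note that this read-modify-write of the index is not safe against concurrent writers; it is only meant to show the shape of the idea.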

Is there a particular reason you need to use a md5-hash of the timestamp? Why not simply use the timestamp (with a prefix or otherwise) as the filename? Then you could simply sort alphabetically, ascending or descending, and have the files automatically in timewise order. This would be by far the simplest and most light-weight option.
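
As a rough sketch of that approach: the naming scheme below (zero-padded timestamp, underscore, md5 of the contents) and the helper names `timestamped_name` and `list_sorted` are assumptions for illustration, not something from the question. The padding keeps string order and numeric order in agreement.

```php
<?php
// Hypothetical helper: builds a name like "1591107300_<md5 of data>.json".
function timestamped_name(int $timestamp, string $data): string
{
    // %010d zero-pads to 10 digits so alphabetical order matches time order.
    return sprintf('%010d_%s.json', $timestamp, md5($data));
}

// Hypothetical helper: lists the directory in time order using only the names.
function list_sorted(string $dir, bool $newestFirst = true): array
{
    $order = $newestFirst ? SCANDIR_SORT_DESCENDING : SCANDIR_SORT_ASCENDING;
    // array_diff keeps the order scandir produced; array_values reindexes.
    return array_values(array_diff(scandir($dir, $order), ['.', '..']));
}
```

The timestamp can be read back from a name with `(int) substr($name, 0, 10)`, so no separate lookup table is needed.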

If md5 hashes as file names are a must, and writing a reference table is not what you prefer, then you will have to loop through the files, or use usort, and check each file's modification time (filemtime). You can find solutions in the answers to sort files by date in PHP. Be aware that this will lead to plenty more disk activity (even if the results are cached).
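
One possible shape for the loop variant, as a sketch only. The function name `list_by_mtime` is hypothetical. It reads each modification time once and sorts the resulting map with `arsort`/`asort` instead of calling `filemtime` inside a `usort` callback:

```php
<?php
// Hypothetical helper: returns full paths of the files in $dir,
// ordered by modification time.
function list_by_mtime(string $dir, bool $newestFirst = true): array
{
    $mtimes = [];
    foreach (array_diff(scandir($dir), ['.', '..']) as $name) {
        $path = $dir . '/' . $name;
        if (is_file($path)) {
            // One stat call per file, done before sorting.
            $mtimes[$path] = filemtime($path);
        }
    }

    if ($newestFirst) {
        arsort($mtimes); // newest first, keys (paths) preserved
    } else {
        asort($mtimes);  // oldest first
    }

    return array_keys($mtimes);
}
```

Files written within the same second share an mtime, so their relative order is not meaningful here.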

Markus AO
  • Second paragraph is mine. I just used 0 brain, I need to use md5-hash just to keep distinct names of files, but they will still stay distinct if I use md5 to hash data which is stored inside file for example, and just add `time()` prefix to it. Many thanks to you and to Justinas as well, who gave similar solution in comments on my question. – nikname Jun 03 '20 at 05:29
  • Obviously `md5(time())` will be no more unique than just `time()`. If you generate two files at the same second, they would land with the same md5 hash. Even `microtime()` wouldn't 100% guarantee this, nor would `uniqid()` (based on microtime), especially on Windows (where microtime moves in chunks, apparently). Would probably be fine if you don't write the files in a tight loop. Yes, using `time() . md5($data)` would certainly work (assuming `$data` is scalar, not an array). – Markus AO Jun 03 '20 at 18:07