I'm working on converting a website using PHP. Part of my process is to verify that image paths don't point to nonexistent images (i.e. there are no broken images). Since many pages share certain images, I set up a cache array to record whether I've already checked for the existence of an image file at a given path.
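A minimal sketch of such a cache, keyed directly by the path string. The function name `imageExists` and the `$cache` variable are hypothetical and not taken from the actual conversion script; this only illustrates the idea under the assumption that the check is a plain `file_exists()` call.

```php
<?php
// Hypothetical helper: the names here are illustrative only.
// The cache maps a path string to the boolean result of file_exists().
function imageExists(string $path, array &$cache): bool
{
    // array_key_exists() is true for any stored key, including one whose value is false.
    if (!array_key_exists($path, $cache)) {
        $cache[$path] = file_exists($path); // touch the filesystem only once per path
    }
    return $cache[$path];
}

$cache = [];
$first  = imageExists(__FILE__, $cache); // hits the filesystem
$second = imageExists(__FILE__, $cache); // served from the cache
```

The cache is passed by reference so that every call shares the same array; a lookup on a path that was already checked never reaches the disk.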
Using the raw path string as the array index didn't work, so I used md5(), and that does the trick. However, the conversion script is taking a long time, and it seems clear that that's because of the md5 calculation (I've been running the conversion frequently over the past few days, and I noticed right away that as soon as my caching started working, the script took much longer to run).
So I'm wondering if there is a faster hash algorithm that I can use in my cache, and of course I need one that won't produce collisions. Since this is a one-off script, I don't need a super-secure unbreakable algorithm, just one that gets the job done a little faster.
This comment apparently is a list of all the hashing functions that PHP has available to it.
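One way to check whether hashing is really the bottleneck is to time a few algorithms on path-like strings. This is a rough sketch, not a rigorous benchmark: the helper `timeHash`, the sample paths, and the choice of algorithms are all assumptions made for illustration. Note that `crc32b` produces only 32 bits, so it can collide, which matters if collisions are unacceptable.

```php
<?php
// Hypothetical timing helper: hashes every path once and returns elapsed seconds.
function timeHash(string $algo, array $paths): float
{
    $start = microtime(true);
    foreach ($paths as $path) {
        hash($algo, $path);
    }
    return microtime(true) - $start;
}

// Made-up sample data standing in for real image paths.
$paths = [];
for ($i = 0; $i < 10000; $i++) {
    $paths[] = "/images/section{$i}/photo{$i}.jpg";
}

// hash_algos() lists every algorithm available to hash() on this PHP build.
$available = hash_algos();

foreach (['md5', 'sha1', 'crc32b'] as $algo) {
    if (in_array($algo, $available, true)) {
        printf("%-8s %.4f s\n", $algo, timeHash($algo, $paths));
    }
}
```

If the totals are small next to the script's overall runtime, the slowdown is probably coming from somewhere other than the hash.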
Edit: I didn't draw a lot of attention to this in my comment, but when I used the plain string of the path as the index for the cache array, it didn't work. As soon as I changed it to an md5 hash, it worked. If I had more time I would troubleshoot this, but this is a one-off project that I can't spend more time than I absolutely must on.
Post Edit: Okay, apparently I'm doing something way wrong with my caching; I must have changed something when I changed the indexes to hashes that caused the cache to start working, irrespective of the hashing. People are saying my hash should be okay with file path strings, and that md5s don't take that long anyway. So, I don't know what I'm doing wrong and I don't have time to figure it out in this project. I would delete this question but it already has answers.