I'm using md5 with a random number, the current datetime and customer_id to generate a hash code, but after running it over 200,000 records I found many duplicate hashes. How can I avoid the duplicates?

while ($row = mysql_fetch_array($result, MYSQL_NUM)) {

$curr_date = date('Y-m-d H:i:s');
$hash = md5(rand(0,100000)+strtotime(date('curr_date'))+$row[0]);

echo $query = "update customers set email_hash='$hash' where customer_id='$row[0]'";
mysql_query($query) or die(mysql_error());
}
Jien Wai
  • Hashes are not unique, so consider something else that avoids duplication issues. – Rowland Shaw Mar 03 '14 at 09:08
  • The concept of "random" without "duplicates" doesn’t make sense. While you can probably get better random distribution with fewer collisions, you still need to account for those collisions. If the distribution is good, you can probably get away with simply repeating the hash generation and UPDATE on the rare collision. (Also consider MD5(RAND()) in the SQL statement.) – danorton Mar 03 '14 at 19:15
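The retry-on-collision idea from the comment above could be sketched roughly as follows. This is a minimal in-memory illustration, not the asker's code: `uniqueHash()` and its parameters are hypothetical names, and the `$used` array stands in for what would more likely be a unique index on `email_hash` in the database.

```php
<?php
// Hypothetical helper: keeps drawing candidates from $generate until one
// is not yet present in $used, then records it and returns it.
function uniqueHash(array &$used, callable $generate, int $maxTries = 10): string
{
    for ($i = 0; $i < $maxTries; $i++) {
        $candidate = $generate();
        if (!isset($used[$candidate])) {
            $used[$candidate] = true;   // remember it so later calls can detect a repeat
            return $candidate;
        }
    }
    // Give up instead of looping forever if the generator keeps repeating itself.
    throw new RuntimeException("No unique hash after $maxTries attempts");
}

$used = [];
$hashes = [];
for ($id = 1; $id <= 1000; $id++) {
    $hashes[$id] = uniqueHash($used, function () {
        return md5((string) mt_rand());
    });
}
```

With a reasonably well-distributed generator the retry branch is rarely taken, so the extra cost is small.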

4 Answers

Instead of the current date, try using the current timestamp, i.e. time():

$hash = md5(rand(0,100000)+time()+$row[0]);

Supriya Pansare

The problem is that the time can be identical for thousands of your entries, because you only take it to a resolution of one second. You could include the milliseconds and nanoseconds as well, or sleep between two generations.

Either way, it would be safer to use a hashing algorithm with a longer digest, which makes collisions less probable.
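As a rough sketch of both suggestions combined, assuming PHP 7 or later: `makeEmailHash()` is a hypothetical name, the `random_bytes()` nonce is an extra ingredient not mentioned above, and the `email_hash` column is assumed to be wide enough for 64 characters.

```php
<?php
// Hypothetical helper, not taken from the question.
function makeEmailHash(int $customerId): string
{
    // microtime(true) returns the Unix time as a float with microsecond precision.
    $now = sprintf('%.6F', microtime(true));

    // Assumption: a 128-bit random nonce from the OS CSPRNG instead of rand(0,100000).
    $nonce = bin2hex(random_bytes(16));

    // SHA-256 produces 64 hex characters, twice the length of an MD5 digest.
    return hash('sha256', $customerId . '|' . $now . '|' . $nonce);
}

echo makeEmailHash(42), "\n";
```

The longer digest only helps if the input varies enough, which is why the sketch also widens the random part rather than relying on the timestamp alone.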

Theolodis

Append the timestamp to your md5 hash:

$hash = md5(rand(0,100000)+strtotime(date('curr_date'))+$row[0])."_".time();
Loïc
  1. You're using date() wrong: 'curr_date' is treated as a format string, not as your $curr_date variable.

    date('c')
    
  2. You're adding as numbers instead of concatenating as strings.

    rand(0,100000) . strtotime(date('c')) . $row[0]
    

As it is, you're feeding only a very small number of possible plaintexts to the hash. Fixing these two issues will reduce the number of collisions drastically.
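The effect of adding versus concatenating can be checked with a small experiment. This is only an illustration under assumed values: the fixed `$timestamp`, the seed and the 20,000 ids are made up, and `mt_rand()` is used in place of `rand()` so the run is repeatable.

```php
<?php
mt_srand(42);              // fixed seed so repeated runs behave the same
$timestamp = 1393837680;   // assumed: every row falls into the same second
$n = 20000;                // hypothetical number of customer ids

$added  = [];              // distinct hashes when the parts are summed
$joined = [];              // distinct hashes when the parts are concatenated
for ($id = 1; $id <= $n; $id++) {
    $r = mt_rand(0, 100000);
    // Sum: many (random, id) pairs collapse onto the same integer.
    $added[md5((string) ($r + $timestamp + $id))] = true;
    // Concatenation: each part keeps its own digits in the plaintext.
    $joined[md5($r . $timestamp . $id)] = true;
}

printf("distinct hashes with +: %d of %d\n", count($added), $n);
printf("distinct hashes with .: %d of %d\n", count($joined), $n);
```

Summing squeezes every input into a range of roughly 120,000 integers here, so repeats are unavoidable; concatenating keeps the customer id visible in the plaintext, so different rows hash different strings.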

Ignacio Vazquez-Abrams