0

each user uploads images to my website using php. sometimes they spam the exact same image dozens of times which drives me nuts.

i am adding a column in the SQL db to add the md5 hash of every image they upload to prevent repeat uploads. let's call this column 'md5_images'.

when's a file is uploaded, i'd like json_decode all the previously user uploaded image md5s and if they are in_array(), i alert them that the image has already been uploaded by them (the logged in user). if it is a new image, i add the md5 hash to the end of the uploaded md5 image array and update the image_md5 column for that user and then upload their image.

my issue is that I cannot figure out an elegant wait to do this without adding each md5 hash with a space between them (e.g; md5_1 md_2 md5_3) and then turning that row of md5 s by explode() the column by spaces to get the array list to compare the newly uploaded user image md5 to.

$images = mysqli_query($conn, 'SELECT md5_images FROM table WHERE user = "example"');

if(mysqli_num_rows($images) > 0) {
    $images = mysqli_fetch_assoc($images);

    $check_md5s = json_decode($results['md5_examples'];
}

if(in_array(md5($file), $check_md5s)) { 
    //don't upload;
} else {
   foreach($results['md5_examples'] as $example) {
     $arr[] = $example;
   }
     
   $arr[] = md5($file);
   $update = json_encode($arr)(

   then update user column with new      
   $update of uploaded image md5 hashes 
   for later checks.
}

if the user has no images uploaded already, the obviously allow it to be uploaded and add the md5 hash to the list for later incase they want to upload more.

is there a simpler/more professional way to do this?

each user has their own set of files they can't reupload, it's not the entire website that refuses any duplicates whatsoever.

can anyone else me with this?

mov eax
  • 41
  • 5
  • 4
    is there any specific reason you've not created a table for hashed images? a table with user_id and image_hash for example? – Debuqer Apr 04 '23 at 09:42
  • 1
    Please read [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad). (Even if you plan to use a different format, the point is that storing multiple things in the same field is, in general, a very bad design choice in a relational database.) As Debuqer says, there should be a separate secondary table for this, with a foreign key back to the main one. That at least will simplify your task in terms of querying the data to check for duplicates. – ADyson Apr 04 '23 at 09:46
  • I agree, a separate table with a row per image hash will help you a great deal. However, if you really must store the hashes in a single value. MySQL's FIND_IN_SET() function might be a more efficient way of doing it https://www.w3schools.com/sql/func_mysql_find_in_set.asp – Rob Eyre Apr 04 '23 at 09:50
  • Since md5 always has a fixed length, you can simply search it from the column. `WHERE user = "example" AND md5_examples LIKE "%md5%"`. – shingo Apr 04 '23 at 09:53
  • different users may upload the same image but i cannot have a single user upload the same image more than once. that's why i want to keep track of md5 hashes of images uploaded by the user. otherwise i would set a whole table for md5 hashes to prevent all duplicate images from being uploaded. – mov eax Apr 04 '23 at 11:19
  • `that's why i want to keep track of md5 hashes of images uploaded by the user`...that's fine, we never said you shouldn't do that. We just suggested better ways to do it, primarily by correctly normalising your database. – ADyson Apr 04 '23 at 11:20
  • 1
    e.g. with the secondary table we suggested, you could just write `select * from imageHashes where hash = ? and userID = ?`, passing in the md5 hash of the upload, and the ID of the user, and see if that returns any rows. No need then to mess about with JSON or inelegant (and error-prone) comma-/space-separated values...there'd be a separate row in that table for each user/hash combo, making it trivial to query it as I've just shown. – ADyson Apr 04 '23 at 11:23
  • is there not a better way to grab the serialize()'d array from the images_column for that user, turn it into an array, check if md5 of file is is in it or not? i tried doing it this way and it's keep allowing the same file with the same md5 to be uploaded. i'm trying to keep it all it all on one table rather than creating another table with lots of entries. – mov eax Apr 05 '23 at 16:23
  • `a better way to grab the serialize()'d array from the images_column for that user, turn it into an array, check if md5 of file is is in it or not`...yeah you _could_, but that's massively inefficient compared to a simple SQL query. – ADyson Apr 05 '23 at 17:55
  • `i tried doing it this way and it's keep allowing the same file with the same md5 to be uploaded`...you must have missed something somewhere then in the way you're checking it, or how you're saving the data. The technique definitely works. – ADyson Apr 05 '23 at 17:56
  • `i'm trying to keep it all it all on one table`...no, no no. Bad design. Bad, bad, design. Read the link I provided above, it explains all the many, many reasons why that's a bad idea and will give you headaches forever and a day. Please...take the time to learn about proper database design _before_ trying to design a database! :-) – ADyson Apr 05 '23 at 17:56

0 Answers0