4

I have made a site where the files' name will be stored as MD5 hash.

If I'm not wrong then a collision is bound to occur after 2^128 times, ignoring the repetitions. I have found a thread on this here and I'm satisfied with the probability of collision. It's almost 0.

However, I thought instead of using 32(+1) byte, declaring the column in the DB table as VARCHAR(32), I can use as 8(+1) byte for the name to save space. The probability of collision is 1/2^32, ignoring the repetitions.

To test this I altered the column to VARCHAR(8), keeping all the records same, about 260 entries, and checked the DB size but there was no change in the size of DB.

The formula I'm using is :

$temp = mysql_query("SHOW TABLE STATUS");  
$size = 0;
while($row = mysql_fetch_array($temp)) {  
     $size += $row["Data_length"] + $row["Index_length"];  
}

Got help from here.

Even I tried with FOREIGN KEY reference in other tables where the files' names are being used but the size remained same as that of a table where I copied the full file name.

Now my question is where am I going wrong? Is it in the calculation of the DB size or in determining the size of the datatype of the column?

I'm not from a computer background and this is the first time I'm making a site so some points may sound silly to many. Sorry for that.

NewBee
  • 394
  • 3
  • 15
  • Let's say the disk storage, including indexes and other overhead, is 32 bytes for a `VARCHAR(8)` and 128 bytes for a `VARCHAR(32)`. There's a difference of 96 bytes per row between those two. Let's say your system has a million rows in it. That comes out to be a difference of 96 megabytes. AWS, for example, charges US$0.31 per year per gigabyte. So, using the longer data type costs you thirty cents a decade. Are you spending your time wisely? – O. Jones Aug 24 '17 at 11:35
  • This might be caused by fragmentation. Some operations on MySQL tables (like DELETE) fill the hard drive with blank "spaces" instead of actually deleting the data. You can check that by outputing the 'Data_Free' field from TABLE STATUS. Running OPTIMIZE TABLE on your table will then remove the empty spaces from the hard drive by recreating the table. I'm not 100% sure fragmentation happens when you're changing a field type though. – alpadev Aug 24 '17 at 11:39
  • @ O. Jones I agree with what you are saying but curious to know if I'm going wrong somewhere. I'm completely new to it and moreover, my country falls in Tire 4 category of all the ad networks so it's a bit difficult to fetch even $1. Thus little concerned about hosting cost but asked this mainly out of confusion & curiosity. – NewBee Aug 24 '17 at 12:08
  • @alpadev tried with OPTIMIZE TABLE database.table but it is showing `Table does not support optimize` and Status `OK`. Is there any other way? – NewBee Aug 24 '17 at 12:28

0 Answers0