I am trying to create a site where users can upload images, videos, and other types of files. I did some research, and people seem to suggest that saving the files as BLOBs in the database is a bad idea; instead, one should save the file paths in the database.

My questions are, if I save the file paths in a database:

1. How do I generate the file names?

I thought about computing the MD5 hash of the file name, but what if two files have the same name? Should I add the username, a timestamp, etc. to the file name? Does that even make sense?

2. What is the best directory structure?

If a user uploads images on 12/17/2013 and 12/18/2013, can I just put them in user_ABC/images/ and then create time-stamped sub-directories (20131217, 20131218, etc.)? What is the best structure for all of this?

3. How does all of this come together?

It seems like maintaining this system is a real pain, because the file-system manipulation scripts are tightly coupled with the database operations. (I may also need to worry about database transactions: say that in one transaction I update the database but fail to modify the file system; do I then need to roll back the database?)

And I think this system doesn't scale. (What if my machine runs out of disk space and I need to put the files on a second machine? What if my content is on a cluster?)

I think my real question is:

4. Is there any existing framework/design pattern/database that handles this problem?
What is the standard way of handling this kind of problem?

Thanks in advance for your answers.

Community
user3110379
  • Too much hassle for a simple thing? If you are so unsure why don't you go for a framework that handles it all for you. – marekful Dec 17 '13 at 09:34

2 Answers

I actually asked this same question when I was designing a social website for chefs. I decided to store the URL of the image in a MySQL database along with the recipe. If you plan on storing multiple images for one recipe, as in my example, a comma-separated value might work. When a recipe loaded on the page, I would fetch the image associated with that recipe and display it.

  1. Since it was a hackathon project and wasn't meant for production, I didn't encode the file name into something unique. However, if I were developing for production, I would append a timestamp to the media file name when storing it on the server and in the database/backend.

  2. I believe what I've proposed is the best structure for handling this scenario. Storing the image on the server's file system is not only faster, but should also take less space. I found that when converting a standard JPG file of reasonable resolution to base64, the encoded text representation took about 30% more space (base64 emits 4 characters for every 3 bytes, so roughly a third more is expected). There is also the time spent encoding the file for storage and decoding it for retrieval when using a BLOB-style format instead of simply storing the file on the server.

  3. Using a server-side scripting language like PHP, you can do some pretty neat things with the information you have available: fetch the result from the database and load it into the page using HTML.

  4. As far as I know, there isn't a standard way of fetching media from a database yet. Perhaps there will be one day.
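
For what it's worth, the size overhead mentioned in point 2 is easy to check. Here is a minimal Python sketch (not what I used at the hackathon; `base64_overhead` is just a name made up for illustration) that encodes some random bytes standing in for an image and compares the sizes:

```python
import base64
import os


def base64_overhead(num_bytes: int) -> float:
    """Return encoded_size / raw_size for num_bytes of random data.

    Random bytes stand in for an image file here (an assumption);
    base64 output size depends only on the input length, not its content.
    """
    raw = os.urandom(num_bytes)
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw)


# A hypothetical 300 kB "image".
print(f"{base64_overhead(300_000):.3f}")
```

The ratio comes out at about 4/3, which lines up with the roughly 30% growth I saw.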

Dan Puzey
Charlie Le

There is no standard way to do this; it differs from application to application. The idea is that you need to generate a different Path + FileName for every upload. Here is one way:

HashId = sha1(microsecond + random(1,1000000));
Path = /[user_id]/[HashId{0,2}]/[HashId{-2}];
FileName = HashId
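
The pseudocode above could be sketched in Python roughly as follows. This is only one possible reading: it assumes `HashId{0,2}` means the first two characters of the hash and `HashId{-2}` the last two, and the function name `make_upload_path` and the `root` parameter are made up for illustration.

```python
import hashlib
import secrets
import time
from pathlib import PurePosixPath


def make_upload_path(user_id: str, root: str = "/") -> PurePosixPath:
    """Build a per-upload path: root/user_id/<first 2>/<last 2>/<hash>."""
    # Microsecond timestamp plus a random number in 1..1000000,
    # mirroring the seed used in the pseudocode.
    microseconds = time.time_ns() // 1000
    nonce = secrets.randbelow(1_000_000) + 1
    seed = f"{microseconds}{nonce}"

    # 40-character hex digest used as both directory key and file name.
    hash_id = hashlib.sha1(seed.encode("ascii")).hexdigest()

    # Assumed reading of HashId{0,2} and HashId{-2}: first and last two chars.
    return PurePosixPath(root) / str(user_id) / hash_id[:2] / hash_id[-2:] / hash_id


print(make_upload_path("42"))
```

The two short sub-directories spread one user's files over many folders so that no single directory grows too large. The original file name and extension are not kept in the path, so they would presumably go in the database row next to the stored path.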
ecco