5

I will have a website with a bunch of companies, and each company will be able to upload its logo. Is it a good idea to create a folder for each company that signs up, so the logos would live at companies/user1/logo.jpg and companies/user2/logo.jpg? That way I wouldn't need to store a path to reference the image.

Or should I store them all in one folder with random file names, like company_logos/gaegha724252.jpg, and store each path in the database alongside its company?

What are the advantages and disadvantages?

Thanks!

Drew

5 Answers

11

Using Folders for Organization

Advantages: They are logically clear to someone fiddling with the system on the back end. That's about it, really.

Disadvantages: 'Harder' to clean up when you delete a company, and you have to make sure none of your directory names overlap. Generally more work from the get-go.

Using Images in One Folder

Advantages: It's technically a bit easier to clean up and not all that much work.

Disadvantages: You'll have to write, at minimum, a very basic collision detection algorithm and a very basic 'random name generator'.
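As a rough idea of how small that piece can be, here is a minimal Python sketch of a random name generator with collision detection. The function name `save_with_random_name`, the 16-character name length, and the `.jpg` default are all assumptions made for illustration; nothing in the question prescribes them.

```python
import os
import secrets


def save_with_random_name(data: bytes, folder: str, ext: str = ".jpg") -> str:
    """Write data under a random, previously unused name and return that name."""
    while True:
        # 8 random bytes rendered as 16 hex characters (length is an assumption).
        name = secrets.token_hex(8) + ext
        path = os.path.join(folder, name)
        try:
            # Mode "x" fails if the file already exists, so the collision
            # check and the file creation happen in a single step.
            with open(path, "xb") as f:
                f.write(data)
        except FileExistsError:
            continue  # collision: draw another name and retry
        return name
```

The returned name is what you would keep in the database next to the company. Opening in exclusive mode avoids the race you would get from checking for the file first and creating it afterwards.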

Using the Database to Store Images

Caution: Many lives have been lost in this argument!

Advantages: Referential integrity, backing up/restoring is simpler, categorization

Disadvantages: Fraught with pitfalls, potentially slower, more advanced storage/retrieval techniques, potential performance issues and increase of network requests. Also, most cheap hosting providers' databases are way too terrible for this to be a good idea.

I highly recommend just using a hashed file name and storing it (the filename) in the database and then storing the images in a folder (or many folders) on disk. This should be much easier in the long run and perform better in general without getting too complicated.
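A minimal sketch of that recommendation in Python, under several assumptions: the name is the SHA-256 hex digest of the file contents (the answer does not say what to hash), the database is SQLite, and the `companies` table with its `logo_file` column is a hypothetical schema.

```python
import hashlib
import os
import sqlite3


def store_logo(db: sqlite3.Connection, company_id: int, data: bytes,
               folder: str, ext: str = ".jpg") -> str:
    """Save the image on disk and record only its file name in the database."""
    name = hashlib.sha256(data).hexdigest() + ext  # hashing contents is an assumption
    with open(os.path.join(folder, name), "wb") as f:
        f.write(data)
    with db:  # commits on success, rolls back on error
        db.execute("UPDATE companies SET logo_file = ? WHERE id = ?",
                   (name, company_id))
    return name


def logo_path(db: sqlite3.Connection, company_id: int, folder: str):
    """Return the on-disk path of a company's logo, or None if it has none."""
    row = db.execute("SELECT logo_file FROM companies WHERE id = ?",
                     (company_id,)).fetchone()
    if row is None or row[0] is None:
        return None
    return os.path.join(folder, row[0])
```

Only the short file name lives in the database; the web server can then serve the image straight from disk without a database round trip for the image bytes.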

ashurexm
  • To clarify your last sentence, you're recommending storing _the hashed filename_ in the database and the file itself on the file system, correct? Also, I'd add as a disadvantage of storing all the files in one folder that if you end up with a very large number of files in that folder, listing the directory contents can become inordinately slow using traditional methods; see [here](http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/). – Mitch Lindgren Aug 18 '11 at 23:46
  • Thank you for a great answer. Now I understand a lot better. – Drew Aug 19 '11 at 01:25
  • @Mitch: You're correct, I would store the hashed filename in the database, not the file itself. Storing all the files in one folder could be a disadvantage, but we'd be talking millions before it was an issue. In that case, using dma_k's strategy with mine would (theoretically) yield good results. – ashurexm Aug 19 '11 at 04:56
6

I would go even further: calculate the MD5 sum of each file before storing it on the filesystem. You can use the first two characters as the name of the 1st-level directory and the next two characters as the name of the 2nd-level directory:

vv 1st level
61f57fe906dffc16597b7e461e5fce6d.jpg
  ^^ 2nd level

As the hashing algorithm produces evenly distributed output, this will spread your files evenly among the folders (the idea comes from how Squid organizes its file cache). The server should return a URL like this (i.e. with no notion of directories):

http://server.com/images/61f57fe906dffc16597b7e461e5fce6d.jpg

and you may apply mod_rewrite to actually rewrite this url to something like this:

/storage/images/61/f5/7fe906dffc16597b7e461e5fce6d.jpg
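If you would rather compute that mapping in application code than in mod_rewrite, a minimal Python sketch could look like the following. The function names `hashed_name` and `storage_path`, the `.jpg` default, and the name validation pattern are assumptions for illustration; the `/storage/images` root is taken from the example above.

```python
import hashlib
import re

# 32 hex characters (an MD5 digest) plus a simple extension.
_NAME_RE = re.compile(r"[0-9a-f]{32}\.[a-z0-9]+")


def hashed_name(data: bytes, ext: str = ".jpg") -> str:
    """Public file name: MD5 of the file contents plus an extension."""
    return hashlib.md5(data).hexdigest() + ext


def storage_path(public_name: str, root: str = "/storage/images") -> str:
    """Map a flat public name to its two-level directory on disk."""
    if not _NAME_RE.fullmatch(public_name):
        # Reject anything that is not a plain hashed name (e.g. "../x").
        raise ValueError("not a hashed file name: %r" % public_name)
    # First two characters: 1st level; next two: 2nd level; rest: file name.
    return "/".join([root, public_name[:2], public_name[2:4], public_name[4:]])


print(storage_path("61f57fe906dffc16597b7e461e5fce6d.jpg"))
# prints /storage/images/61/f5/7fe906dffc16597b7e461e5fce6d.jpg
```

With two hex characters per level you get at most 256 directories at each level, so 65,536 leaf directories in total.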

This will also add some degree of anonymity and hide the real image name. Moreover, if several clients upload the same content, it will end up in the same file, which saves disk space. Beware when removing a file for one client: it may also be used by others!
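One way to guard against that is to delete the file only once nobody references it any more. The sketch below is a minimal illustration, assuming a plain dict `owners` (client id to file name) as a stand-in for the database table you would really use; `add_file` and `remove_file` are hypothetical names, and the directory sharding is left out to keep it short.

```python
import hashlib
import os


def remove_file(owners: dict, client: str, folder: str) -> None:
    """Drop a client's reference; delete the file only if no one else uses it."""
    name = owners.pop(client, None)
    if name is not None and name not in owners.values():
        os.remove(os.path.join(folder, name))


def add_file(owners: dict, client: str, data: bytes, folder: str) -> str:
    """Store data under its MD5 name; identical content is written only once."""
    remove_file(owners, client, folder)  # release any previous upload first
    name = hashlib.md5(data).hexdigest() + ".jpg"  # extension is an assumption
    path = os.path.join(folder, name)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    owners[client] = name
    return name
```

In a real system the "is anyone else using it" check would be a query against the table holding the file names, ideally in the same transaction as the update.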

dma_k
1

Store them as "company_logos/125.jpg", where 125 is a unique ID (the primary key in your database).

user187291
  • What if I allow them to upload jpgs and pngs, do I need to add a place in the database to store the logo extension? – Drew Aug 19 '11 at 17:18
1

Depending on how many companies you expect, creating a folder for each company could quickly get ridiculous. Also, reading a folder structure from disk will be much slower than reading from a database.

You could store the image location in the database, or you could use the ID solution. You could also store the image itself in the database if you wanted, using the "blob" type, although other questions have tackled this issue: Storing Images in DB - Yea or Nay?

I think it would be best to either store the image name in the database, or use the ID method.

Robin Winslow
  • If you store the location of the image in the DB, you aren't reading a folder structure, you're simply going straight to the image on the file system. Adding the DB into the mix adds an extra layer of mess and will almost always (barring very unlikely circumstances) be slower. – ashurexm Aug 18 '11 at 23:27
0

If it is going to be just a few hundred records or so, I wouldn't bother storing the pics outside the db.

Icarus
  • I disagree. Storing pics inside the db, regardless of how many are in there, will always be more complicated to retrieve and most likely slower (if only a tiny bit) than storing them on the file system. On top of that, you'd probably have to implement some non-trivial caching system to reduce the number of database calls (imagining that the logo gets displayed on most, if not all, pages a user is logged in to). – ashurexm Aug 18 '11 at 23:25