4

I am implementing an image upload system in PHP. The following are required:

  • Have categories
  • Allow users to comment on images
  • Allow rating of images

For that, I have 2 approaches in mind:

1. Implement the categorization by folders

Each category will have its own folder, and PHP will detect categories via those folders.

Pros

  • Structured look, easily locatable images.
  • Use of native PHP functions to manipulate and collect information about folders and files

Cons

  • Multiple categorization is a pain
  • Need to save the full path in the database

2. Implement the categorization by database

Each image in the database will have a catID (or multiple catIDs), and PHP will query the database to get the images

Pros

  • Easily implemented multi-categories
  • Only image name is saved

Cons

  • Seems more messy
  • Need to query the database a lot.

Which do you think is better? Or is there a third, completely different, approach that I'm missing?

Just a note: I don't need code, I can implement that myself. I'm looking to find out what to implement.

Would love to hear from you.

Madara's Ghost
  • possible duplicate of [Effeciently storing user uploaded images on the file system](http://stackoverflow.com/questions/7203031/effeciently-storing-user-uploaded-images-on-the-file-system) – hakre Jan 23 '12 at 10:25
  • That's not entirely a duplicate... but it's similar. I'll look at the answers there too, thanks. – Madara's Ghost Jan 23 '12 at 13:02

6 Answers

3

I believe the second option is better. A DB gives you much more flexibility and, I think, better performance than the file system, provided you set the right indexes.

With the filesystem approach you are limited to one category per image, whereas in the DB you can assign multiple categories to an image.

As for the con that the DB is messier: sorry, I can't see why the DB would be messier. Maybe you mean that the files are not organized on the file system, but you still need to organize them there and split them across multiple folders for better performance. And if you want all the images that have been uploaded, you query the DB for them, which will be much faster than running ls on every category folder.
By organizing the files on the file system under the DB approach, I mean that you need to split them across several folders. How exactly depends on how you expect the uploads to arrive:

  1. If you expect the uploads to be spread over a long time, I think it is better to put the files in directories per time range (day, week, month). For example, if I upload an image now, it will go to "/web_path/uploaded_photos/week4_2012/[some_generated_string].jpg".
  2. If you can't predict the uploads, I suggest dividing the files into folders by something generic, like the first two characters of the MD5 hash of the image name. For example, if my file name is "photo_2012.jpg", the hash will be "c02d73bb3219be105159ac8e38ebdac2", so the path on the file system will be "/web_path/uploaded_photos/c/0/[some_generated_string].jpg".
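The hash-based layout in option 2 could be sketched roughly as follows. This is only an illustration under assumptions: `shardedPath()` is a hypothetical helper, it hashes whatever name you pass in (the original or the generated one, as long as you are consistent), and the base directory is just the example path used above.

```php
<?php
// Hypothetical helper: build a storage path whose sub-folders are the first
// $levels characters of the MD5 hash of the given name.
function shardedPath(string $baseDir, string $name, int $levels = 2): string
{
    $hash = md5($name); // 32 lowercase hex characters

    $folders = [];
    for ($i = 0; $i < $levels; $i++) {
        $folders[] = $hash[$i]; // one hex character per directory level
    }

    return rtrim($baseDir, '/') . '/' . implode('/', $folders) . '/' . $name;
}

// Example: two levels give 16 * 16 = 256 possible leaf folders.
echo shardedPath('/web_path/uploaded_photos', 'a1b2c3.jpg'), PHP_EOL;
```

With two levels the files spread over at most 256 folders; if that is not enough, a third level is one way to spread them further.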

The second con, that you need to query the DB a lot, is not quite true, because you would need the same number of lookups on the file system, and those are far slower.

Good luck.

PS: Don't forget to generate a new file name for every uploaded image, so there are no collisions when different users, or the same user, upload images with the same name.
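One possible way to generate such a name, as a rough sketch: `generateStoredName()` and its extension whitelist are assumptions made for illustration, and a real upload handler would also want to check the actual file contents, not only the extension.

```php
<?php
// Hypothetical helper: replace the user-supplied name with a random one,
// keeping only a whitelisted, lower-cased extension.
function generateStoredName(string $originalName, array $allowed = ['jpg', 'jpeg', 'png', 'gif']): string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));

    if (!in_array($ext, $allowed, true)) {
        throw new InvalidArgumentException('Unsupported extension: ' . $ext);
    }

    // 16 random bytes -> 32 hex characters; collisions are practically impossible.
    return bin2hex(random_bytes(16)) . '.' . $ext;
}

echo generateStoredName('Photo_2012.JPG'), PHP_EOL;
```

The original name can still be kept in the database next to the generated one if you want to show it to users.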

Dolev
2

I'd be inclined to go with the database approach. You list the need to query the database a lot as a con, but that's what databases are built for. As you pointed out yourself, a hierarchical structure has serious limitations when it comes to items that fall into more than one category, and while you can use native PHP functions to navigate the tree, would that really be quicker or more efficient than running SQL queries?

Naturally the actual file data needs to go somewhere, and BLOBS are problematic to put it mildly, so I'd store the actual files in the filesystem, but all the data about the images (the metadata) would be better off in a database. The added flexibility the database gives you is worth the work involved.

GordonM
  • I see! But that would mean I'll have a huge folder with all of my images in it, won't it? That doesn't sound very... good... – Madara's Ghost Jan 23 '12 at 13:00
  • You could subdivide the folder in some arbitrary way. For example, if the filename (or the title assigned on upload) of the image begins with A, put it in an A folder, if it's B then put it in a B folder, etc. Or you could subdivide by uploader, each user of the system getting their own directory. The possibilities are endless. – GordonM Jan 23 '12 at 13:29
2

The second solution (database) is actually a tag/label system of categorizing data, and that is the way to go; the biggest examples are Gmail and Stack Overflow. The only thing you need to be careful about is how you model the tags. If the tags are not normalized properly, querying the database becomes expensive.
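To illustrate what a normalized model can look like: one row per image, one row per category, and a link table with one row per (image, category) pair instead of a comma-separated list. The sketch below uses an in-memory SQLite database through PDO purely so that it is self-contained; the table and column names and `imagesInCategory()` are hypothetical, and it assumes the `pdo_sqlite` extension is available.

```php
<?php
// Assumption: pdo_sqlite is installed. All names below are illustrative only.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE images (id INTEGER PRIMARY KEY, file_name TEXT NOT NULL UNIQUE)');
$db->exec('CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)');
// Link table: an image in three categories simply has three rows here.
$db->exec('CREATE TABLE image_categories (
    image_id    INTEGER NOT NULL REFERENCES images(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (image_id, category_id)
)');

$db->exec("INSERT INTO images (id, file_name) VALUES (1, 'a1.jpg'), (2, 'b2.jpg')");
$db->exec("INSERT INTO categories (id, name) VALUES (1, 'beach'), (2, 'sunset')");
$db->exec('INSERT INTO image_categories (image_id, category_id) VALUES (1, 1), (1, 2), (2, 2)');

// Fetch the file names of all images in one category with a single query.
function imagesInCategory(PDO $db, string $category): array
{
    $stmt = $db->prepare(
        'SELECT i.file_name
           FROM images i
           JOIN image_categories ic ON ic.image_id = i.id
           JOIN categories c ON c.id = ic.category_id
          WHERE c.name = :name
          ORDER BY i.file_name'
    );
    $stmt->execute([':name' => $category]);

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

print_r(imagesInCategory($db, 'sunset'));
```

Because the link table has a composite primary key, lookups by image are indexed; an extra index on `category_id` would be a reasonable addition for lookups by category.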

shikhar
  • What do you mean by tags which are normalized properly? – Madara's Ghost Jan 23 '12 at 12:58
  • Many solutions preach it, and you would also be inclined to store tagid / catid as comma-separated values, but this would not be normalized and will pose problems in advanced queries. – shikhar Jan 23 '12 at 13:28
1

Use folders only to make file storage reliable, storing a certain number of files per folder, e.g.

/b/e/beach001.jpg

As for your dilemma, it is not a question at all.
From your requirements you can tell for yourself that the database is the only solution.

Your Common Sense
  • I'd also recommend going with a folder structure that isn't connected to the categories. What happens when you need to assign an image to multiple categories? :) Use your database for storing the data, and your filesystem for storing the files. You shouldn't need to 'query the database a lot' with a decent query. – Leigh Jan 23 '12 at 11:03
  • Actually I thought I was reinforcing what you were saying. (I was going to make a similar answer - no point having two answers saying the same thing) – Leigh Jan 23 '12 at 11:14
  • Interesting! How does that improve my performance? Also, how does that make the file system more reliable and how does it help (purely for organization?) – Madara's Ghost Jan 23 '12 at 12:58
  • @Truth try to store 10000 files in one folder and let the operating system pick one or list them. Then reduce that number to 1000. Feel the performance improvement. – Your Common Sense Jan 23 '12 at 13:26
  • So by dividing it into folders I can reduce the amount of time the OS looks for the file? Very interesting. Thanks I'll try it – Madara's Ghost Jan 23 '12 at 13:37
1

Since you need a database to store comments and ratings, you should store categories in the database as well. Later on you may also want to store image captions and descriptions; a database allows you to do that. And I would not worry about querying the database a lot.

Whether to store the image itself in database or filesystem is a separate issue which is discussed here.

Note about storing images in the filesystem: do not store thousands of images in a single directory; it could cause performance issues for the OS. Instead, devise a way to organize images in subdirectories. You can group them by dates, filenames, randomly, etc. Some conventions:

upload date: month/year

/uploaded_images
    /2010/01
    /2010/02

upload date: month-year

/uploaded_images
    /2010-01
    /2010-02

md5 hash of image name: first character

/uploaded_images
    /0/
    /1/
    .
    .
    .
    /e/
    /f/

batches of thousands

/uploaded_images
    /00001000/
    /00002000/
    /00003000/
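Two of these conventions could be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, and the mapping of image IDs to batch folders (IDs 1 to 1000 go to 00001000, 1001 to 2000 go to 00002000, and so on) is one plausible reading of the listing above, not something it specifies.

```php
<?php
// Hypothetical helper: "year/month" directory, as in /2010/01.
function dateDirectory(DateTimeInterface $uploadedAt): string
{
    return $uploadedAt->format('Y/m');
}

// Hypothetical helper: batch directory named after the upper bound of the
// batch, zero-padded to 8 digits. Assumes numeric image IDs starting at 1.
function batchDirectory(int $imageId, int $batchSize = 1000): string
{
    if ($imageId < 1 || $batchSize < 1) {
        throw new InvalidArgumentException('ID and batch size must be positive');
    }

    // Round the ID up to the next multiple of the batch size.
    $upper = intdiv($imageId + $batchSize - 1, $batchSize) * $batchSize;

    return sprintf('%08d', $upper);
}

echo dateDirectory(new DateTimeImmutable('2010-01-15')), PHP_EOL; // 2010/01
echo batchDirectory(1001), PHP_EOL;                               // 00002000
```

Whichever convention you pick, the directory can be recomputed from data you already keep in the database (upload date or ID), so only the file name needs to be stored.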
Salman A
  • No, no, the image itself will definitely be stored on the filesystem, no question about that. The question is how to structure it. – Madara's Ghost Jan 23 '12 at 13:05
0

I eventually went with the best answer of this question: Effeciently storing user uploaded images on the file system.

It works like a charm. Thanks for all of the answers!

Madara's Ghost