11

Say we want to develop a photo site.

Would it be faster to upload images to and download them from MongoDB than to store and serve them from disk? I ask because MongoDB can save images and files in chunks and store metadata alongside them.

So for a photo-sharing website, would it be better (faster) to store the images in MongoDB or on a typical server hard disk?

I'm thinking of using PHP with CodeIgniter, by the way, in case that changes the performance considerations.

Sirwan Qutbi
  • 1
    store the files in the file-system and the details in the db –  Jun 29 '11 at 21:09
  • Storing images in the database is rarely a good idea, except for some minor "portability" plusses. Now you need to hit the DB every time someone wants to view an image. – Marc B Jun 29 '11 at 21:11
  • The more you do, the longer it takes. If the file is on disk, the webserver can directly serve it w/o even forking a script. If it's in the db, the webserver needs to fork a script, the script needs to connect to the db server, the db server needs to process the query, the script needs to read the db servers answer and process it to finally deliver the data. Still wondering what takes longer? – hakre Jun 29 '11 at 21:14
  • 4
    Well, "faster" isn't always "better". There is a lot more to take into account that these comments aren't addressing: managing your files long term (which can be a huge pain on a file system), backups, replication, metadata of the files (and also things like the MD5 hash MongoDB gives you, so you can avoid duplicates), etc. A lot of sites store images in a database and it works quite well. GridFS was designed precisely for this! – Justin Jenkins Jun 29 '11 at 21:36
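
To make the "chunks plus metadata plus MD5" idea concrete, here is a minimal sketch in plain Python. It does not talk to MongoDB at all: the `files` dict stands in for a collection, and `CHUNK_SIZE`, `split_into_chunks`, and `store` are hypothetical names chosen for illustration, not GridFS API.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 255 * 1024  # illustrative chunk size (an assumption, tune as needed)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut a blob into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def store(files: Dict[str, dict], name: str, data: bytes) -> str:
    """Store data keyed by its MD5 digest so identical uploads are kept once."""
    digest = hashlib.md5(data).hexdigest()
    if digest in files:
        # Duplicate content: only record the extra name, keep one copy of the bytes.
        files[digest]["names"].append(name)
    else:
        files[digest] = {
            "names": [name],
            "length": len(data),
            "chunks": split_into_chunks(data),
        }
    return digest
```

Uploading the same bytes under two names leaves a single entry in `files` with both names attached, which is roughly the duplicate-avoidance the comment refers to.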

4 Answers

8

Lightweight web servers (lighttpd, nginx) do a pretty good job of serving content from the filesystem. Since the OS acts as a caching layer, they typically serve content from memory, which is very fast.

If you want to serve images from MongoDB, the web server has to run some sort of script (Python, PHP, Ruby... via FCGI, of course, since you can't start a new process for each image), which has to fetch data from MongoDB each time the image is requested. So it's going to be slower. The benefits are automatic replication and failover if you use replica sets. If you need this and are clever enough to achieve it with the FS, then go with that option. If you need a very quick implementation that's reliable, then MongoDB might be a faster way to get there. But if your site is going to be popular, sooner or later you will have to switch to the FS implementation.

BTW: you can mix these two approaches. Store the image in MongoDB to get instant reliability, and then replicate it to the FS of a couple of servers to gain speed.
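
A rough sketch of the mixed approach, in Python rather than PHP to keep it short. The `fetch` callable stands in for whatever reads the image out of MongoDB (for example a GridFS read); it, `materialize`, and `cache_dir` are hypothetical names, and `image_id` is assumed to already be a safe file name.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def materialize(image_id: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return a local path for image_id, copying it from the store on a miss."""
    target = cache_dir / image_id
    if target.exists():
        return target  # already on disk, so the web server can serve it directly

    data = fetch(image_id)  # assumption: reads the bytes from the database
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Write to a temporary file and rename it into place, so a concurrent
    # reader never sees a half-written image.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_name, target)
    return target
```

With something like this, the database is only hit the first time each server sees an image; after that, nginx or lighttpd can serve the file from `cache_dir` without running any script.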

Some test results.

Oh, one more thing: coupling the metadata with the image seems nice until you realize that the generated HTML and the image download are two separate HTTP requests, so you have to query Mongo twice, once for the metadata and once for the image.

Karoly Horvath
  • "But if your site is going to be popular sooner or later you have to switch to the FS implementation." - [Nope](http://stackoverflow.com/a/5627933/230340). – Zippo Aug 07 '12 at 11:27
  • @Zippoxer: I guess this depends on what you mean by 'popular'. http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/ running the images through PHP for 25k visitors might be ok, bit slow, but could be acceptable.. now if you have 10M or more users, that's a different story altogether. – Karoly Horvath Aug 07 '12 at 12:41
3

The document "When to use GridFS for storing files with MongoDB" suggests you should. GridFS also sounds fast and reliable, and it is great for backups and replication. Hope that helps.

Community
Tak
3

Several benchmarks have shown MongoDB is approximately 6 times slower for file storage (via GridFS) than the regular old filesystem. (One compared Apache, nginx, and Mongo.)

However, there are strong reasons to use MongoDB for file storage despite it being slower. #1: free backup from Mongo's built-in sharding/replication. This is a HUGE time saver. #2: ease of admin, storing metadata, not having to worry about directories, permissions, etc. Also a HUGE time saver.

Our photo back-end was built years ago as a huge gob of spaghetti code that did all kinds of stuff (check or create user dir, check or create date dirs, check for name collisions, set perms), and a whole other mess did backups.
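
For a sense of what those filesystem chores look like, here is a minimal sketch in Python. This is not our actual code: `save_photo`, the `root/user/YYYY/MM/DD` layout, the `_1`, `_2` collision suffix, and the `0o644` mode are all assumptions for illustration.

```python
import os
from datetime import date
from pathlib import Path


def save_photo(root: Path, user: str, filename: str, data: bytes, day: date) -> Path:
    """Save a photo under root/user/YYYY/MM/DD, avoiding name collisions."""
    # Check or create the user dir and the date dirs in one go.
    folder = root / user / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"
    folder.mkdir(parents=True, exist_ok=True)

    # Keep only the final path component, then probe for a free name.
    name = Path(filename).name
    stem, suffix = Path(name).stem, Path(name).suffix
    target = folder / name
    counter = 1
    while target.exists():
        target = folder / f"{stem}_{counter}{suffix}"
        counter += 1

    # Note: the exists() check and the write are not atomic, so two
    # simultaneous uploads could still race; real code needs to handle that.
    target.write_bytes(data)
    os.chmod(target, 0o644)  # set perms: owner read/write, others read-only
    return target
```

Even this toy version has a race condition and says nothing about backups, which is the kind of thing that grew into the spaghetti.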

We've recently changed everything over to Mongo. In our experience, Mongo is a bit slower (it may be 6 times slower, but it doesn't feel like 6 times slower), and anyway, so what? All that spaghetti is out the window, and the new Mongo+photo code is much smaller and tighter, and the logic is simpler. Never going back to the file system.

http://www.lightcubesolutions.com/blog/?p=209

FYA
2

You definitely do not want to download images directly from MongoDB. Even going through GridFS will be (slightly) slower than from a simple file on disk. You shouldn't want to do it from disk either. Neither option is appropriate for delivering image content with high throughput. You'll always need a server-side caching layer for static content between your origin/source (be it mongo or the filesystem) and your users.

So with that in mind, you are free to pick whatever works best for you, and MongoDB's GridFS provides quite a few features for free that you'd otherwise have to build yourself when working directly with files.

Remon van Vliet
  • 2
    There are several layers of caching between the filesystem and the users. The most important are the browser's cache, proxies, and the operating system. – Karoly Horvath Jun 29 '11 at 21:53