5

I'm writing an app which will store a large number of image (and possibly video) files. After they're uploaded they will be immediately pushed out to some cloud serving CDN for actual serving to the public. The idea is to have the images stored in a reliable, back-uppable store. I would anticipate of the order of 200,000 objects of up to 10KB each and possibly fewer video files of a few MB.

By default I would go to Postgres which the documentation suggests would be ok.

  • Is this is a sensible idea?
  • Will it make backing up the database a complete nightmare. Experiences?
  • Any reliability issues?
  • Will this affect the performance for other parts of the db? Bear in mind that the db will only be hit once or twice for each image.
Joe
  • 46,419
  • 33
  • 155
  • 245
  • possible duplicate of [Storing Images in DB - Yea or Nay?](http://stackoverflow.com/questions/3748/storing-images-in-db-yea-or-nay) – Mat May 17 '11 at 11:42
  • please re-read the [FAQ](http://stackoverflow.com/faq), your question isn't really answerable objectively, you're requesting opinions. – Mat May 17 '11 at 11:43
  • Requestion experienses... that's good, right? – GolezTrol May 17 '11 at 11:45
  • @Mat what happened to CWs? And I considered my question (including the part about not serving them from the DB) to be sufficiently different. Oh well. – Joe May 17 '11 at 11:46

3 Answers3

6

I've got experience with storing images in a database this way in Oracle and MySQL. Performance and reliability are not an issue. Backing up is. Your backup will get very large. Since backing up is time consuming and expensive, it might be a good idea to save space. If that means you can shrink your database by 80% by just removing the images from the database, it might be a good idea to store them elsewhere. Backing up separate files is more efficient, because you can easily create incremental backups containing only new and modified images.

GolezTrol
  • 114,394
  • 18
  • 182
  • 210
4

I have experiences with PostgreSQL, storing images as ByteA (a BLOB-like datatype), a good experience, and storing images in "dual solution" (images at filesystem, metadata at databases like MySQL and PostgreSQL), that I not recommend.

There are 3 aspects, or architecture considerations, that can help us in our decision:

  1. Unify solution or not? Today, when we see that image volume (sizes and number of images) are growing and growing, in all applications, the "unified solutions" are the goal. Example: Wikimedia is a unified and specialized solution for Wikipedia.
  2. Direct or indirect store? Like old "dual solutions", that not store image into the SQL table, some solutions can use external database or external data pointer... On PostgreSQL BLOB datatypes have indirect store (generates a separated backup), and BYTEA datatype is direct (backup-ed with tables). The choice need technical and performance considerations.
  3. Original or processed images? We need some distinction between "original image" and "processed image", like thumbnail, that need database store (for caching!), but not need backup.

I recommend:

  • to store as blob (Binary Large OBject with indirect store) at your table: for original image store, but separated backup. See Ivan's answer, PostgreSQL additional supplied modules, How-tos etc.

  • to store as bytea (or blob), at a separated database (with DBlink): for original image store, at another (unified) database. In this case, I preffer bytea, but blob is near the same. Separating database is the best way for a "unified image webservice".

  • to store as bytea (BYTE Array with direct store) at your table: for caching processed images (typically thumbnails). Cache the little images to send it fast to the web-browser (avoiding renderization problems) and reduce server processing. Cache also the essential metadata, like width and height. Database caching is the easiest way, but check your needs and server configs (ex. Apache modules): store thumbnails at file system may be better, compare performances. Remember that it is a (unified) web-service, then can be stored at a separete database with no backups, serving many tables. See also PostgreSQL binary data types manual, tests with bytea column, etc.

Community
  • 1
  • 1
Peter Krauss
  • 13,174
  • 24
  • 167
  • 304
2

My experience is limited to SQL server, but I have several million PDF-files that are larger than 10KB in a database, which is still performing quite nicely. Of course indexes are required. Full database backup takes no longer than expected with such an amount of data. Again, this is for MS-SQL server!

Mr47
  • 2,655
  • 1
  • 19
  • 25