My question may be quite similar to others, but it is not the same. In the information system I'm developing, I face the following problem: I need to store a large amount of image data (~100k images per month, 1.2-1.5 million per year). I am considering two approaches: storing the images in the database as BLOBs, or storing only the file paths in the database.

From related questions I read that storing only a reference in the database is cheaper in terms of performance, and there is no need to write handlers (in ASP.NET) to fetch a BLOB field from the database. But is having over 1 million files a good idea? They will be very hard to back up, and if some files are accidentally deleted, the database is no longer consistent. The approach of storing everything in the database does not have these problems.

What would you recommend? Has anyone faced this problem?
-
Possible duplicates: http://stackoverflow.com/questions/10838448/how-to-store-different-document-types http://stackoverflow.com/questions/3748/storing-images-in-db-yea-or-nay http://stackoverflow.com/questions/561447/store-pictures-as-files-or-in-the-database-for-a-web-app – Aaron Bertrand Jul 05 '12 at 20:18
-
I have read those questions. If you think they are identical, please vote down. As for me, they are not identical AND there is no answer to my question. – seeker Jul 05 '12 at 20:22
-
what question do you have that wasn't addressed or answered in any of the duplicates I posted? – Aaron Bertrand Jul 05 '12 at 20:23
-
for example which approach will be better in terms of performance? – seeker Jul 05 '12 at 20:29
-
See, now that just tells me that you may have *seen* those questions but you haven't *read* any of the answers. Please go read, focusing on the answers with higher vote counts. I see several examples that address performance directly and explicitly. – Aaron Bertrand Jul 05 '12 at 20:34
-
Well, if one question says it's better to store a file reference and another says to store it in the DB, it's not obvious what to choose (I can post direct links if needed). I don't want to argue; I just want my question answered, and as of now it is being answered quite well here (not there). – seeker Jul 05 '12 at 20:46
-
Do you see, though, how the answers you're getting - since you've asked the same question as the others - are not getting you any closer to the answer you're after? *Please* go read the other answers. I still think you're ignoring some valid advice that has already been posted. – Aaron Bertrand Jul 05 '12 at 20:49
4 Answers
You're missing one critical thing: storing the files in the database WHICH STORES THEM IN THE FILE SYSTEM, not in the MDF file. Look up FILESTREAM in the documentation. You get integrated backup and full access via SQL, but the files sit on a separate share and/or even on another computer.
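As a rough sketch, setting up FILESTREAM in SQL Server looks something like the following. The database, filegroup, and table names are illustrative, not from the answer, and FILESTREAM must also be enabled at the instance level first:

```sql
-- Assumes FILESTREAM was enabled on the instance, e.g.:
-- EXEC sp_configure 'filestream access level', 2; RECONFIGURE;

-- Hypothetical names: ImageStore database, ImageFiles filegroup.
ALTER DATABASE ImageStore
ADD FILEGROUP ImageFiles CONTAINS FILESTREAM;

-- FILENAME here is a directory on disk, which can live on a
-- separate volume from the MDF file.
ALTER DATABASE ImageStore
ADD FILE (NAME = ImageData, FILENAME = 'D:\Data\ImageData')
TO FILEGROUP ImageFiles;

-- A FILESTREAM table needs a unique ROWGUIDCOL column; the image
-- bytes in the Data column are stored as files, not in the MDF.
CREATE TABLE dbo.Images (
    ImageId  UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    FileName NVARCHAR(260) NOT NULL,
    Data     VARBINARY(MAX) FILESTREAM NULL
);
```

The point is that `Data` is queried like any VARBINARY(MAX) column, but BACKUP DATABASE picks up the files automatically, which addresses the consistency worry in the question.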

-
So what would be better in terms of performance? I generated a table with 1,000,000 identical images, each with a unique ID. Only 54 rows per second are selected on my PC. Quite slow. – seeker Jul 05 '12 at 20:24
-
Well, YOUR PC is quite likely a crap machine for a database. How many hard discs do you have? I mean, seriously - database servers are custom built for IO CAPACITY, not for CPU. "My PC" is normally a synonym for "a totally crap DB server". – TomTom Jul 05 '12 at 20:26
-
As I said - TOTAL CRAP DATABASE SERVER. Start adding decent hard discs - DB servers live and die by disc I/O. My SMALL server has 2 SSDs + 8 VelociRaptors for my small database, on top of the system drives. This is SMALL compared to larger setups. You get 150 IOPS out of your disc system - I get 350 out of every drive, plus 40,000 out of the SSDs. Do the math. Database servers LIVE AND DIE BY DISC PERFORMANCE, and yours is dead slow. – TomTom Jul 05 '12 at 20:56
-
I would expect your database server to run a SAS disc subsystem. SuperMicro has nice cases - 72 discs in one computer case for exactly that, plus 8 discs per expansion unit. Use 15k RPM SAS drives, possibly with SSDs as cache. When you talk millions of files, you must talk enterprise storage. – TomTom Jul 05 '12 at 20:57
-
Is it possible to test DB performance on a normal enterprise server over the internet (as a service)? – seeker Jul 05 '12 at 21:02
-
What do you mean? DB performance does not care about the internet. Internet "as a service" - you mean the cloud? Welcome to shitty performance. NO ONE uses a public cloud for high-end database systems, simple as that. – TomTom Jul 06 '12 at 06:31
With that many files, backup is going to be a pain regardless. You're better off taking the cheaper approach for now.

I would go with the database. If you have ever seen 1 million files in the same folder, you know it is very hard on the file system and tough to navigate.

You can attack this in many ways. First of all, storing images in the database is a bad idea.
A better way is to store them in a distributed file system:
have a scalable naming schema, and store that location info in the database or in a distributed hash table. That's how all the big companies do it.
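A "scalable naming schema" like the one mentioned above could be sketched as follows. This is a hypothetical content-addressed layout (the function name, root directory, and fan-out are my own, not from the answer) that shards files across nested directories so no single folder ever holds millions of entries:

```python
import hashlib


def image_path(image_id: str, root: str = "/images") -> str:
    """Map an image ID to a sharded file path.

    Hypothetical scheme: hash the ID and use the first two hex-byte
    pairs as directory levels, giving a 256 x 256 fan-out. At
    1.5 million files per year, each leaf directory stays small.
    """
    digest = hashlib.sha1(image_id.encode("utf-8")).hexdigest()
    return f"{root}/{digest[:2]}/{digest[2:4]}/{digest}.jpg"


# The database would then store only image_id -> path, while the
# bytes live on the (distributed) file system.
print(image_path("invoice-2012-000001"))
```

Because the path is derived deterministically from the ID, any web server in the cluster can compute where an image lives without an extra lookup, and a periodic consistency check can compare DB rows against the file tree.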
