I used to save my files on my server's file system, and now I want to store them in MongoDB (for easier backups and so on). The files are 4-5 MB at most, and I tried saving them with mongoose using the Buffer type. Saving and retrieving works, but I noticed significantly slower performance when I save and retrieve files of 4 or 5 MB.

My schema:

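const mongoose = require('mongoose');
const Schema = mongoose.Schema;

// Each document holds the whole file in `data`, so every read or write moves the full 4-5 MB.
let fileSchema = new Schema({
    name: {type: String, required: true},
    _announcement: {type: Schema.Types.ObjectId, ref: 'Announcements'},
    data: Buffer,
    contentType: String
});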

How I retrieve them from the expressjs server:

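 // `file` is the document loaded from MongoDB; `res` is the Express response.
 let name = encodeURIComponent(file.name);
 res.writeHead(200, {
     'Content-Type': file.contentType,
     'Content-Disposition': 'attachment;filename*=UTF-8\'\'' + name
 });
 // Mongoose returns Buffer fields as Buffers (without .lean()), so no copy via the
 // deprecated `new Buffer()` is needed; res.end() also finishes the response.
 res.end(file.data);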

My question is: should I use a zlib compression function like 'deflate' to compress the buffer before saving it in MongoDB, and then decompress the binary before sending it to the client? Would this make the whole process faster? Am I missing something?

Alexis Pavlidis
  • There is no definite answer to this. It depends on what kind of data you are storing. If it is JPEG/PNG, it may already be compressed, and additional compression won't help. If the files are small, compression may not help either. Also, a DB is not a good option if the files are big. – Tarun Lalwani Jul 28 '19 at 16:38
  • @TarunLalwani, correct me if I am wrong. Is the standard way of storing images storing the image in s3 or cloudinary, (or I guess even imgur works) then in DB store the URL of the image stored? If that is the case, I am having trouble understanding why retrieving data from an external website's database is faster than my own. – Acy Jul 29 '19 at 01:35
  • Storing files in a DB doesn't make sense because databases are not optimized for storing such things, while S3 and CloudFront are optimized for serving such files: caching, nearby nodes for lower latency, and so on. That is why it makes sense to use an external service. You can still go with MongoDB if you want to reduce cost, but that cost will then go into development of your own code. – Tarun Lalwani Jul 29 '19 at 07:57
  • Sounds like you should be using [GridFS](https://docs.mongodb.com/manual/core/gridfs/) – Wyck Aug 01 '19 at 18:33
  • @TarunLalwani there is a definite answer - which is exactly this: do not use database to store files... Store files outside of database, the only thing that database should store is information required to access real files, for example storing it in AWS S3 bucket... And storing file name and bucket name in DB field so whoever accesses it, gets information about where the real file is stored, and can retrieve it at a later date. –  Aug 04 '19 at 09:40

3 Answers

It seems that you are trying to store a large amount of data in MongoDB.

I can think of 3 different options for your case:

Cloud Services

  • As other people have already commented here, if the file you are saving is already compressed, compressing it again won't help, even if it is a small file. In that case, my recommendation is to use a cloud service that is already optimized for the kind of data you are trying to save and retrieve. If it is an image, you could use Cloudinary, which also has a free tier so you can test it.
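To check the compression point on your own data, a rough sketch like the following can help. It uses Node's built-in `zlib` and `crypto` modules; the helper name `deflateRatio` and the two sample buffers are made up for illustration, with random bytes standing in for an already-compressed format such as JPEG or PNG.

```javascript
const zlib = require('zlib');
const crypto = require('crypto');

// Returns compressed size divided by original size (lower means better compression).
function deflateRatio(buffer) {
  return zlib.deflateSync(buffer).length / buffer.length;
}

// Repetitive text stands in for compressible data (plain text, JSON, logs).
const textLike = Buffer.from('announcement body text '.repeat(50000));
// Random bytes stand in for already-compressed data; deflate cannot shrink them.
const alreadyCompressedLike = crypto.randomBytes(1024 * 1024);

console.log('text-like ratio:', deflateRatio(textLike).toFixed(3));
console.log('random ratio:   ', deflateRatio(alreadyCompressedLike).toFixed(3));
```

If the ratio for your real files is close to 1, deflating them before saving only adds CPU time on both the write and the read path.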

Local Storage and saving routes in DB

  • Another solution is to store the data in a file, either in the cloud or on your file system, and save only its path in the database. This way you do not depend on MongoDB's speed for retrieving the data, but you still have a good way to know where the files are located.
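A minimal sketch of that idea, assuming local disk storage: the helpers `saveFileToDisk` and `openStoredFile`, the upload directory, and the shape of the returned record are all hypothetical. The returned object is what you would save in MongoDB in place of the `data` Buffer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Writes the bytes to disk and returns the small record to store in the database.
function saveFileToDisk(uploadDir, originalName, contentType, data) {
  fs.mkdirSync(uploadDir, { recursive: true });
  // A random on-disk name avoids collisions between uploads with the same name.
  const storedName = crypto.randomBytes(16).toString('hex') + path.extname(originalName);
  const filePath = path.join(uploadDir, storedName);
  fs.writeFileSync(filePath, data);
  return { name: originalName, contentType, path: filePath, size: data.length };
}

// Opens the stored file as a stream, so it can be piped to a response
// without loading the whole file into memory.
function openStoredFile(record) {
  return fs.createReadStream(record.path);
}

const record = saveFileToDisk(
  path.join(os.tmpdir(), 'uploads-demo'), // hypothetical upload directory
  'report.pdf',
  'application/pdf',
  Buffer.from('example bytes')
);
console.log(record);
```

In an Express handler, `openStoredFile(record).pipe(res)` would then send the file after the headers are written.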

Using MongoDB and GridFS

  • GridFS is a dedicated way of storing files in MongoDB, recommended when files exceed the 16 MB BSON document size limit. As the official documentation says:

Instead of storing a file in a single document, GridFS divides the file into parts, or chunks [1], and stores each chunk as a separate document. By default, GridFS uses a default chunk size of 255 kB; that is, GridFS divides a file into chunks of 255 kB with the exception of the last chunk.
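To make the chunking concrete, here is a small arithmetic sketch. It assumes the quoted 255 kB means 255 * 1024 bytes; the function name `chunkLayout` is made up for illustration.

```javascript
// Default GridFS chunk size from the quote above, assumed to be 255 * 1024 bytes.
const DEFAULT_CHUNK_SIZE = 255 * 1024;

// Returns how many chunk documents a file needs and how large the last one is.
function chunkLayout(fileSize, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (fileSize === 0) {
    return { chunks: 0, lastChunkSize: 0 };
  }
  const chunks = Math.ceil(fileSize / chunkSize);
  const lastChunkSize = fileSize - (chunks - 1) * chunkSize;
  return { chunks, lastChunkSize };
}

console.log(chunkLayout(5 * 1024 * 1024)); // { chunks: 21, lastChunkSize: 20480 }
```

So a 5 MB file like the ones in the question would be stored as 21 chunk documents plus one metadata document.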

They then describe the situations in which you may want to store data this way:

In some situations, storing large files may be more efficient in a MongoDB database than on a system-level filesystem.

  • If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
  • When you want to access information from portions of large files without having to load whole files into memory, you can use GridFS to recall sections of files without reading the entire file into memory.
  • When you want to keep your files and metadata automatically synced and deployed across a number of systems and facilities, you can use GridFS. When using geographically distributed replica sets, MongoDB can distribute files and their metadata automatically to a number of mongod instances and facilities.
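For the setup in the question, a GridFS version could look roughly like this. It is a sketch that assumes the official `mongodb` Node.js driver and its `GridFSBucket` class; the helper names `uploadBuffer` and `sendFile` and the bucket name are hypothetical, and error handling (for example, a missing file) is left out.

```javascript
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// `bucket` is assumed to be a GridFSBucket from the official `mongodb` driver, e.g.
//   const { GridFSBucket } = require('mongodb');
//   const bucket = new GridFSBucket(db, { bucketName: 'files' });

// Stores a Buffer as a GridFS file and resolves with the new file's id.
async function uploadBuffer(bucket, name, contentType, data) {
  const upload = bucket.openUploadStream(name, { metadata: { contentType } });
  await pipeline(Readable.from([data]), upload);
  return upload.id;
}

// Streams a stored file to an HTTP response chunk by chunk,
// so the whole file never has to sit in memory at once.
async function sendFile(bucket, id, name, contentType, res) {
  res.writeHead(200, {
    'Content-Type': contentType,
    'Content-Disposition': "attachment;filename*=UTF-8''" + encodeURIComponent(name),
  });
  await pipeline(bucket.openDownloadStream(id), res);
}
```

The main difference from the Buffer field in the question is that both directions are streamed, which matters more as files grow. Whether it is actually faster for 4-5 MB files is something to measure on your own deployment.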

Hope it was useful :)

Lautaro Jayat

I suggest using GridFS; it's faster and very easy to use.

For more info, please check this URL: https://docs.mongodb.com/manual/core/gridfs/.

If you have any questions about GridFS, let me know.

andranikasl

If you absolutely feel that you must store the images in your database and not on the filesystem or in a cloud service, I won't comment on that.

With respect to your specific question, GridFS is a respectable option that people use in production, and it has served its purpose quite well. I personally used it a couple of years back, but my use case changed, so I moved to another medium. (Please check the SO link where people are discussing its performance.)

What concerns me is that you have 4 MB images. Unless you are serving images that depend heavily on quality and high resolution, that should not happen. Please compress your images before storing them, on the frontend or the backend (your choice). If you compress them on the frontend, the upload will also take less time to transmit.

Discussion on scale of GridFS

Module for node.js side compression

GridFS

Gandalf the White