I'm wondering if there's a reason why the lmdb files used in caffe are so much larger than the files containing the original images. Could you give me an explanation, please?
It is hard to give a concrete answer to such an abstract question, but I'll give it a try:
Image files are usually compressed: a `.png` or `.jpg` of size `h` by `w` by `3` takes far less disk space than `h*w*3` bytes due to compression. On the other hand, to process the image in a neural network (or any other ML software, for that matter) you need to work with the uncompressed representation of the image. Therefore, the `lmdb`, `leveldb`, and `hdf5` datasets used by caffe store the input images uncompressed, in some cases using 32-bit floats for each pixel value (instead of `uint8`), hence the drastic increase in file size.
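
As a rough illustration of that arithmetic, here is a minimal sketch that computes the uncompressed size of a single image at one byte per value and at four bytes per value. The 256 by 256 image size and the helper name `raw_image_bytes` are assumptions made for the example, and a real database file may add some per-record overhead on top of these numbers.

```python
def raw_image_bytes(height, width, channels=3, bytes_per_value=1):
    """Size in bytes of one uncompressed image, ignoring any container overhead."""
    return height * width * channels * bytes_per_value


# Hypothetical image size, chosen only for illustration.
h, w = 256, 256

as_uint8 = raw_image_bytes(h, w)                       # one byte per value
as_float32 = raw_image_bytes(h, w, bytes_per_value=4)  # four bytes per value

print(f"uint8:   {as_uint8 / 1024:.0f} KiB per image")
print(f"float32: {as_float32 / 1024:.0f} KiB per image")
```

Comparing these figures with the average size of your original `.jpg` or `.png` files gives a quick sanity check on whether the size of your database is in the expected range.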

Shai
- Thank you for the answer. I just wanted to verify that this situation is usual and that I didn't make any mistakes. Thank you Shai – Skonitsa Dec 18 '15 at 15:07
- @user5640428: Just in addition to this answer: **caffe** can work with compressed images stored in `lmdb`. If you use the *convert_imageset* tool from **caffe**, you can pass the `-encode_type=png` or `-encode_type=jpg` parameter to save encoded images to `lmdb`. It will significantly reduce your database size, but will take additional time in the training/testing phase for decoding images. – avtomaton Dec 18 '15 at 15:31
- @avtomaton, I would be interested in using this technique, since I am using a developer board with almost no memory and need to use a USB stick to feed it the data. Have you tried this before successfully? Also, would I have to add additional commands when running my net, or would caffe decode it by default? Thanks – jerpint Jun 03 '16 at 15:37
- @jerpint This is an interesting question. Why bury it in a comment? Please consider posting it as a question – Shai Jun 03 '16 at 16:01
- @Shai I have posted a question [here](http://stackoverflow.com/questions/37661134/compressing-lmdb-files), thanks – jerpint Jun 06 '16 at 15:23