3

I'm wondering why the lmdb files I am using are so much larger than the files containing the original images. Could you give me an explanation, please?

Shai
Skonitsa

1 Answer

5

It is hard to give a concrete answer to such an abstract question, but I'll give it a try:
Image files are usually compressed: a .png or .jpg of size h by w by 3 takes far less disk space than h*w*3 bytes. On the other hand, to process the image in a neural network (or any other ML software, for that matter) you need to work with the uncompressed representation of the image. Therefore, the lmdb, leveldb, and hdf5 datasets used by caffe store the input images uncompressed, and in some cases as 32-bit floats for each pixel value (instead of uint8), hence the drastic increase in file size.
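
To put rough numbers on this, here is a minimal sketch of the arithmetic. The 256 by 256 image size, the helper name `raw_size_bytes`, and the synthetic gradient are all assumptions made up for illustration. `zlib` is used only as a stand-in for a lossless compressor; it is not what caffe or any image format does internally, and the ratio it reaches on such regular data says nothing about real photographs.

```python
import zlib


def raw_size_bytes(height, width, channels=3, bytes_per_value=1):
    """Size in bytes of an uncompressed image buffer (hypothetical helper)."""
    return height * width * channels * bytes_per_value


# Hypothetical 256x256 RGB image, chosen only for illustration.
h, w = 256, 256
uint8_size = raw_size_bytes(h, w)                       # 1 byte per value
float32_size = raw_size_bytes(h, w, bytes_per_value=4)  # 4 bytes per value

# Synthetic, highly regular pixel data (a horizontal gradient),
# laid out as h rows of w pixels with 3 identical channel values each.
pixels = bytes(x % 256 for _ in range(h) for x in range(w) for _ in range(3))
compressed_size = len(zlib.compress(pixels))

print("uint8 raw:  ", uint8_size)    # 196608
print("float32 raw:", float32_size)  # 786432
print("zlib:       ", compressed_size)
```

The float32 buffer is always exactly four times the uint8 one, and both are fixed by the image dimensions alone, whereas the compressed size depends on the image content.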

Shai
  • 1
    Thank you for the answer. I just want to verify that this situation is usual and that I didn't make any mistakes. Thank you, Shai – Skonitsa Dec 18 '15 at 15:07
  • 1
    @user5640428: Just in addition to this answer: **caffe** can work with compressed images stored in `lmdb`. If you use *convert_imageset* tool from **caffe**, you can pass `-encode_type=png` or `-encode_type=jpg` parameter to save encoded images to `lmdb`. It will significantly reduce your database size, but will take additional time in training/testing phase for decoding images. – avtomaton Dec 18 '15 at 15:31
  • @avtomaton, I would be interested in using this technique, since I am using a developer board with almost no memory and need to use a USB stick to feed it the data. Have you tried this successfully? Also, would I have to add additional commands when running my net, or would caffe decode it by default? Thanks – jerpint Jun 03 '16 at 15:37
  • @jerpint This is an interesting question. Why bury it in a comment? Please consider posting it as a question – Shai Jun 03 '16 at 16:01
  • @Shai I have posted a question [here](http://stackoverflow.com/questions/37661134/compressing-lmdb-files), thanks – jerpint Jun 06 '16 at 15:23