I'm trying to insert a bunch of images into the LMDB format using the following snippet:

with env.begin(write=True) as txn:
    for i in range(N):
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels = X.shape[1]
        datum.height = X.shape[2]
        datum.width = X.shape[3]
        datum.data = X[i].tobytes()  # or .tostring() if numpy < 1.9
        datum.label = int(y[i])
        str_id = '{:08}'.format(i)

        # The encode is only essential in Python 3
        txn.put(str_id.encode('ascii'), datum.SerializeToString())

However, because the uncompressed bytes are written to disk, the resulting file is huge. How can I set the Datum's encoding to JPEG in Python? I am aware that this option is already available in the C++ API.
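
A minimal sketch of one possible direction, not a confirmed solution: compress each image with OpenCV's Python binding `cv2.imencode` and store the resulting bytes in the Datum with its `encoded` flag set. The helper names `jpeg_encode`, `fill_encoded_datum`, and `write_encoded_lmdb` are hypothetical, and the sketch assumes that `X` is a `uint8` array of shape (N, C, H, W) in the channel order OpenCV expects (BGR), and that the reading side decodes encoded Datums. JPEG is lossy, so the stored pixels will differ slightly from the originals.

```python
def jpeg_encode(chw_image, quality=90):
    """JPEG-compress one uint8 image of shape (C, H, W) and return the bytes."""
    # Imported lazily so the other helpers do not require NumPy or OpenCV.
    import numpy as np
    import cv2

    # OpenCV expects (H, W, C) in contiguous memory.
    hwc = np.ascontiguousarray(chw_image.transpose(1, 2, 0))
    ok, buf = cv2.imencode('.jpg', hwc, [int(cv2.IMWRITE_JPEG_QUALITY), quality])
    if not ok:
        raise ValueError('JPEG encoding failed')
    return buf.tobytes()


def fill_encoded_datum(datum, jpeg_bytes, label):
    """Store compressed bytes in a Datum-like object and mark it as encoded."""
    datum.data = jpeg_bytes
    datum.encoded = True  # tells the reader that data holds an encoded image
    datum.label = int(label)
    return datum


def write_encoded_lmdb(env, X, y, quality=90):
    """Write every image in X (with label y[i]) to the open LMDB env as JPEG."""
    import caffe  # assumption: pycaffe is installed

    with env.begin(write=True) as txn:
        for i in range(len(X)):
            datum = caffe.proto.caffe_pb2.Datum()
            fill_encoded_datum(datum, jpeg_encode(X[i], quality), y[i])
            str_id = '{:08}'.format(i)
            txn.put(str_id.encode('ascii'), datum.SerializeToString())
```

With `env = lmdb.open(...)` as in the snippet above, this would be called as `write_encoded_lmdb(env, X, y)`. The `quality` parameter trades file size against fidelity.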

Thanks in advance

Saeed
  • why aren't you using the [`convert_imageset`](http://stackoverflow.com/a/31431716/1714410) utility to write the lmdb? – Shai Jan 10 '17 at 06:15
  • I am modifying and preparing my dataset in Python (dealing with multilabel files), so I need to write the modified images and the labels directly to LMDB using Python. Actually, I am looking for the Python alternative to OpenCV's 'imencode' – Saeed Jan 10 '17 at 17:48
  • caffe lmdb interface does not support multi label – Shai Jan 10 '17 at 18:39
  • I am already aware of this issue, and I'm creating a new LMDB dataset for the labels only. I'm going to have two LMDB files: one for data and one for labels. – Saeed Jan 10 '17 at 18:53
