
I want to load my image and label data from an LMDB database I created. I assign a unique key to each corresponding image-label pair and add both to the LMDB (e.g. image-000000001, label-000000001). When saving the images, I convert the image's numpy array to a string using image.tostring(). When loading from the LMDB, I can get the labels simply by passing the keys I generated, but the image data comes back in an encoded form. Calling numpy.fromstring(lmdb_cursor.get('image-000000001')) doesn't work.
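To make the key scheme concrete, here is a minimal sketch. The helper name make_keys and the nine-digit zero padding are assumptions inferred from the example keys above, not code from the original database writer. In Python 3 the lmdb bindings expect keys as bytes, so the strings are encoded before use.

```python
def make_keys(index):
    """Return the (image_key, label_key) pair for one sample, as bytes.

    Hypothetical helper: the 'image-'/'label-' prefixes and the 9-digit
    padding mirror the example keys image-000000001 / label-000000001.
    """
    suffix = f"{index:09d}"
    return f"image-{suffix}".encode("ascii"), f"label-{suffix}".encode("ascii")


image_key, label_key = make_keys(1)
```

Both keys can then be passed to the transaction's or cursor's get() to fetch the matching image and label values.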

I see here (specifically the second answer, by @Ghilas BELHADJ) that one has to use Caffe Datum objects to first load the data and then fetch the image using datum.data. But my database has no such structure, where the image and label are organised under the 'data' and 'label' tags. How does one correctly read the data back as a numpy image from such an LMDB in Python?

In Lua, this can be achieved as follows:

    local imgBin -- this is the object returned from cursor:get(image-id)
    local imageByteLen = string.len(imgBin)
    local imageBytes = torch.ByteTensor(imageByteLen):fill(0)
    imageBytes:storage():string(imgBin)
    local img = Image.decompress(imageBytes, 3, 'byte')
    img = Image.rgb2y(img)
    img = Image.scale(img, imgW, imgH)

I don't know how to do this in Python.

NightFury13

1 Answer

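    import cv2
    import lmdb
    import numpy as np

    env = lmdb.open(lmdb_dir, readonly=True)  # lmdb_dir: path to your LMDB directory
    with env.begin(write=False) as txn:
        for key, val in txn.cursor():
            # val must hold encoded image bytes; imdecode returns None otherwise (e.g. for label values)
            img = cv2.imdecode(np.frombuffer(val, dtype=np.uint8), cv2.IMREAD_COLOR)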
  • Just a small update: use np.frombuffer(), i.e. img = cv2.imdecode(np.frombuffer(val, dtype=np.uint8), 1), as np.fromstring doesn't work well if the input is unicode. – Vinay Verma May 11 '23 at 09:10
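The imdecode call above expects each value to be an encoded image file (JPEG, PNG and so on). If the values were instead written as raw pixel bytes with image.tostring(), as the question describes, the buffer carries no dtype or shape, and both np.fromstring and np.frombuffer default to float64 unless told otherwise. A minimal sketch of that case, assuming a uint8 image whose shape is known in advance; the helper name decode_raw_image and the 4x6x3 array are illustrative only:

```python
import numpy as np


def decode_raw_image(buf, shape, dtype=np.uint8):
    """Rebuild an array from raw bytes; shape and dtype must be known in advance."""
    arr = np.frombuffer(buf, dtype=dtype)  # 1-D, read-only view of the bytes
    return arr.reshape(shape)


# Illustrative stand-in for a stored image (not data from the question).
original = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
raw = original.tobytes()  # tobytes() is the current name for tostring()

restored = decode_raw_image(raw, original.shape)
```

The array returned by np.frombuffer is read-only, so call .copy() on it before modifying pixels in place. If the image shapes vary, they would have to be stored alongside the pixels, for example under a separate key.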