I have two LMDB files, with the first one my network trains fine while with the other one doesn't really work (loss starts and stays at 0). So I figured that maybe there's something wrong with the second LMDB. I tried writing some python code (mostly taken from here) to fetch the data from my LMDBs and inspect it but so far no luck with any of the 2 databases. The LMDBs contain images as data and bounding box information as labels.
Doing this:
for key, value in lmdb_cursor:
datum.ParseFromString(value)
label = datum.label
data = caffe.io.datum_to_array(datum)
on either one of the LMDBs gives me a key which is correctly the name of the image, but that datum.ParseFromString
function is not able to retrieve anything from value
. label
is always 0, while data is an empty ndarray. Nonetheless, the data is there, value is a binary string of around 140 KB which correctly accounts for the size of the image plus the bounding box information I guess.
I tried browsing several answers and discussions dealing with reading data from LMDBs in python, but I couldn't find any clue on how to read structured information such as bounding box labels. My guess is that the parsing function expects a digit label and interprets the first bytes as such, with the remaining data being then lost due to the binary string not making any sense anymore?
I know for a fact that at least the first LMDB is correct since my network performs correctly in both training and testing using it.
Any inputs will be greatly appreciated!