
I am relatively new to using Caffe and am trying to create minimal working examples that I can (later) tweak. I had no difficulty using Caffe's examples with MNIST data. I downloaded ImageNet data (ILSVRC12) and used Caffe's tool to convert it to an LMDB database using:

$CAFFE_ROOT/build/install/bin/convert_imageset -shuffle -encoded=true top_level_data_dir/ fileNames.txt lmdb_name

to create an LMDB containing encoded (JPEG) image data. The reason for encoding is size: the encoded LMDB is about 64 GB, versus about 240 GB unencoded.
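For reference, `convert_imageset` expects the listing file (`fileNames.txt` above) to contain one `relative/path label` pair per line, with paths relative to the top-level data directory. A small Python sketch that generates such a listing (the function name and the labeling callback are my own, purely illustrative):

```python
import os

def write_listing(top_level_dir, out_path, label_of):
    """Write a convert_imageset-style listing file: one
    'relative/path label' pair per line, paths relative to
    the top-level data directory."""
    with open(out_path, "w") as out:
        for root, _dirs, files in os.walk(top_level_dir):
            for name in sorted(files):
                if not name.lower().endswith((".jpg", ".jpeg")):
                    continue
                rel = os.path.relpath(os.path.join(root, name), top_level_dir)
                out.write(f"{rel} {label_of(rel)}\n")

# Hypothetical usage: derive each label from the synset directory name,
# e.g. label_of = lambda rel: synset_ids.index(rel.split(os.sep)[0])
```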

My .prototxt file describing the net is minimal (a pair of inner product layers, mostly borrowed from the MNIST example; I am not going for accuracy here, I just want something that works).

name: "example"
layer {
  name: "imagenet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "train-lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "imagenet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "test-lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

When train-lmdb is unencoded, this .prototxt file works fine (accuracy is abysmal, but Caffe does not crash). However, if train-lmdb is encoded then I get the following error:

data_transformer.cpp:239] Check failed: channels == img_channels (3 vs. 1)

Question: Is there some "flag" I must set in the .prototxt file to indicate that train-lmdb contains encoded images? (The same flag would presumably also have to be set on the testing data layer, test-lmdb.)

A little research:

Poking around with Google, I found a resolved issue that seemed promising. However, setting 'force_encoded_color' to true did not resolve my problem.

I also found this answer very helpful for creating the LMDB (specifically, the directions for enabling the encoding); however, it made no mention of what should be done so that Caffe is aware that the images are encoded.


1 Answer


The error message you got:

data_transformer.cpp:239] Check failed: channels == img_channels (3 vs. 1)

means that Caffe's data transformer is expecting input with 3 channels (i.e., a color image) but is getting an image with only 1 channel (i.e., a grayscale image).
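In other words, the failing check behaves roughly like this (an illustrative Python rendering of the CHECK in data_transformer.cpp, not Caffe's actual code):

```python
def check_channel_match(expected_channels, img_channels):
    # Roughly what CHECK_EQ(channels, img_channels) enforces:
    # the channel count the net was configured for (3 for color)
    # must equal the channel count of the decoded image.
    if expected_channels != img_channels:
        raise ValueError("Check failed: channels == img_channels "
                         f"({expected_channels} vs. {img_channels})")
```

So a single grayscale JPEG in an otherwise all-color LMDB is enough to trigger the crash.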

Looking at caffe.proto, it seems you should set the force_color parameter in the transform_param:

layer {
  name: "imagenet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
    force_color: true  ##  try this
  }
  data_param {
    source: "train-lmdb"
    batch_size: 100
    backend: LMDB
    force_encoded_color: true  ## cannot hurt...
  }
}
– Shai
  • This seems to work! Does this imply that there are some grayscale images mixed into the data? I thought/assumed they were all color (the handful I looked at were certainly in color). Perhaps I would have caught this if I'd used the --check_size option when creating the lmdb? I had previously resized all images to 256x256 so I didn't bother with the check. – TravisJ Aug 11 '16 at 14:00
  • @TravisJ I'm not certain `--check_size` checks the number of channels, AFAIK it only checks width and height, but I might be wrong about it. – Shai Aug 11 '16 at 14:17
  • I assumed the size would be width x height x depth (i.e. number of channels)... but I could be wrong... very new to caffe. – TravisJ Aug 11 '16 at 14:53
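Following up on the comments above: one way to confirm whether grayscale images are mixed into the dataset is to read the component count directly from each JPEG's Start-of-Frame header, independent of Caffe and of `--check_size`. A stdlib-only sketch (the function name is my own; this is illustrative, not a Caffe utility):

```python
def jpeg_num_channels(data: bytes) -> int:
    """Number of color components (1 = grayscale, 3 = YCbCr/RGB) read
    from the first Start-of-Frame segment of a JPEG byte stream."""
    # Markers 0xC0-0xCF are SOFn, except 0xC4 (DHT), 0xC8 (JPG), 0xCC (DAC).
    sof_markers = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
                   0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("corrupt segment marker")
        marker = data[i + 1]
        if marker in sof_markers:
            # SOF payload: precision(1) height(2) width(2) components(1)
            return data[i + 9]
        segment_length = int.from_bytes(data[i + 2:i + 4], "big")
        i += 2 + segment_length
    raise ValueError("no Start-of-Frame marker found")
```

Running this over every file listed in fileNames.txt would flag any offending grayscale images before they ever reach the LMDB conversion.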