I wonder if Caffe can take an optical flow image as input instead of RGB. I am aware that there are projects such as FlowNet that learn optical flow, but that is not what I am aiming at.

Any pointers would be appreciated.

Shai
alfa_80
  • What do you mean by an "optical flow image"? How many channels are there? What is the value range of each pixel? As far as I know, Caffe can take an image with any number of channels (3 for RGB, 1 for grayscale, etc.). – Jayant Agrawal Jun 10 '17 at 18:08
  • Caffe can use whatever input you feed it. It's up to you to make sure it makes sense. – Shai Jun 11 '17 at 07:17
  • Thanks Jayant & Shai. – alfa_80 Jun 12 '17 at 05:11
  • @Shai: Could you please post your comment as an answer so that I can mark it as correct? Thanks. – alfa_80 Jun 12 '17 at 05:13

1 Answer

Caffe is a very flexible framework: it can process input data of almost any shape.
A very common way to feed images to Caffe is via LMDB/LevelDB datasets created with the convert_imageset tool.
For more complex input shapes, you can store the data in binary HDF5 files and read them with an "HDF5Data" layer.

As for optical flow, you can input it either as an image via LMDB or as a two-channel tensor via HDF5. Caffe can handle both; it is up to you to make sure the net can make sense of the input data.
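
To illustrate the HDF5 route, here is a minimal sketch of packing flow fields into the N x 2 x H x W layout before writing them to a file. It assumes each flow field is an H x W x 2 NumPy array holding (x-displacement, y-displacement); the helper names pack_flow and write_hdf5, the file name, and the random input are hypothetical and only for illustration.

```python
import numpy as np


def pack_flow(flows):
    """Stack flow fields into an N x 2 x H x W float32 array.

    Each element of `flows` is assumed to be an H x W x 2 array whose last
    axis holds (x-displacement, y-displacement).
    """
    stacked = np.stack(flows).astype(np.float32)  # N x H x W x 2
    # Move the channel axis forward: N x 2 x H x W, stored contiguously.
    return np.ascontiguousarray(stacked.transpose(0, 3, 1, 2))


def write_hdf5(path, data, labels):
    """Write data and labels as HDF5 datasets named "data" and "label"."""
    import h5py  # imported here so pack_flow works without h5py installed

    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=data)
        f.create_dataset("label", data=labels.astype(np.float32))


# Hypothetical input: four random 6 x 8 flow fields standing in for real flow.
rng = np.random.default_rng(0)
flows = [rng.standard_normal((6, 8, 2)) for _ in range(4)]
data = pack_flow(flows)
labels = np.zeros(4)
print(data.shape)  # (4, 2, 6, 8)

# Uncomment to write the file (requires h5py):
# write_hdf5("flow_train.h5", data, labels)
```

The dataset names in the HDF5 file have to match the top names of the "HDF5Data" layer, and the layer's source parameter points to a text file that lists the HDF5 file paths, not to the HDF5 file itself.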

Graham
Shai
  • Just curious, what do you mean by two-channel tensor in this context, by the way? – alfa_80 Jun 12 '17 at 09:42
  • @alfa_80 In many cases optical flow is given as an x-direction displacement and a y-direction displacement per location. Thus you have a 3D array (aka "tensor") of shape 2-by-height-by-width, where the two channels are the x-displacement and the y-displacement. – Shai Jun 12 '17 at 09:46
  • I see. Thanks a lot for the explanation, Shai! – alfa_80 Jun 12 '17 at 09:53