5

I hope this question is not off-topic.

I'm implementing a VLAD encoder using the VLFeat implementation, with SIFT descriptors from different implementations (OpenCV, VLFeat, OpenSIFT) in order to compare them.

This is supposed to be a high-performance application in C++ (I know that SIFT is very inefficient; I'm implementing a parallel version of it).

Now, VLAD wants as input a pointer to a set of contiguous descriptors (math vectors). The point is that these SIFT descriptors are usually represented as a matrix, since that makes them easier to manage.

So suppose we have a matrix of 3 descriptors in 3 dimensions (I'm using these numbers for the sake of simplicity; in reality it's thousands of descriptors in 128 dimensions):

1 2 3
4 5 6
7 8 9

I need to feed vl_vlad_encode with a pointer to:

1 2 3 4 5 6 7 8 9

A straightforward solution is to save the descriptors in a cv::Mat m object and then pass m.data to vl_vlad_encode.
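Just to make the idea concrete, this is roughly what I have in mind (an untested sketch; the helper name is mine, and I only check for the layout I need):

#include <opencv2/core.hpp>

// Sketch: keep one descriptor per row of a CV_32F cv::Mat, then expose the
// whole thing as a single contiguous float array for vl_vlad_encode.
const float* descriptorsAsFlatArray(const cv::Mat& descriptors)
{
    // A Mat allocated in one go (rows x cols) stores its rows back to back,
    // so for the 3x3 example above this pointer sees 1 2 3 4 5 6 7 8 9.
    CV_Assert(descriptors.isContinuous() && descriptors.type() == CV_32F);
    return descriptors.ptr<float>(0); // same bytes that m.data points at
}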

However, I don't know whether cv::Mat is an efficient matrix representation. For example, Eigen::Matrix is an alternative (I think it's easy to obtain the representation above with it), but I don't know which implementation is faster/more efficient, or whether there is any other reason to prefer one over the other.
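If I went with Eigen, I believe I would have to request row-major storage explicitly, since Eigen's default is column-major and would give me a column-by-column layout instead (again an untested sketch; the alias and helper names are mine):

#include <Eigen/Dense>

// Sketch: with RowMajor storage, data() points at the rows laid out one
// after the other, i.e. 1 2 3 4 5 6 7 8 9 for the example above.
using DescriptorMatrix =
    Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;

const float* descriptorData(const DescriptorMatrix& descriptors)
{
    return descriptors.data();
}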

Another possible alternative is std::vector<std::vector<float>> v, but I don't know whether v.data() would give me the representation above, or instead: 1 2 3 *something* 4 5 6 *something* 7 8 9

Obviously *something* would mess up vl_vlad_encode.
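As far as I understand, v.data() on a std::vector<std::vector<float>> points at the inner vector objects, not at the floats, so each row lives in its own heap allocation and the data is not contiguous at all. A flat vector with manual indexing is the alternative I would consider (sketch; numDescriptors and dimension are only known at runtime in my case):

#include <cstddef>
#include <vector>

// Sketch: one flat vector holding all descriptors back to back.
// Element d of descriptor i lives at flat[i * dimension + d], so
// flat.data() has exactly the 1 2 3 4 5 6 7 8 9 layout I need.
std::vector<float> makeFlatStorage(std::size_t numDescriptors, std::size_t dimension)
{
    return std::vector<float>(numDescriptors * dimension, 0.0f);
}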

Any other suggestion is more than welcome!

justHelloWorld
  • 6,478
  • 8
  • 58
  • 138
  • 2
    `float [9]`? Agree on a column- or row-major convention and then you can layout everything contiguously one column or one row after the other. – Andon M. Coleman Dec 14 '16 at 11:15
  • @AndonM.Coleman care to explain the difference between float[9] and float[3][3]? They're both contiguous and the column/row convention is changeable for both. – UKMonkey Dec 14 '16 at 11:19
  • I forgot to say that the matrix dimension is decided at runtime, so using `std::vector v` and then `v.resize(dim)` (or `v.reserve(dim)`) could be a better solution, where `dim=9` in this case. – justHelloWorld Dec 14 '16 at 11:20
  • 1
    Unless you do some weird stuff (see [here](http://stackoverflow.com/a/33674655/5008845) for details), data in a `Mat` are continuous. You can think of a `Mat` as a lightweight wrapper over a `float*` (or other types) that allows easier access to the data. So it's as efficient as a pointer, but with a few nice-to-have abstractions. – Miki Dec 14 '16 at 11:23
  • This is what I wanted to hear @Miki, thanks so much. It would be much easier to use `Mat` for SIFT and to write/read it to file (using xml and yaml files). So you're saying there is no loss in time/memory performance, and they're easier to manage, right? – justHelloWorld Dec 14 '16 at 11:25
  • 1
    Correct. You can also improve write/read-to-file performance using [this](http://stackoverflow.com/a/32357875/5008845). Using xml or yaml can be too slow; if you don't need human-readable files, you can save in binary form with the functions in the link. – Miki Dec 14 '16 at 11:28
  • @Miki absolutely better, I was looking for a binary representation of `Mat`. If you post it as an answer and no one comes up with a better solution, I will choose it! – justHelloWorld Dec 14 '16 at 11:36
  • Test whether the mat is continuous, and if it isn't, use mat.clone() to create a continuous copy. How big are your matrices? Will it have any real influence on processing speed if you just copy them into a vector of floats (not a vector of vectors of float!)? – Micka Dec 14 '16 at 13:04

2 Answers

5

Unless you do some weird stuff (see here for details), data in a Mat are guaranteed to be continuous. You can think of a Mat as a lightweight wrapper over a float* (or other types) that allows easier access to the data. So it's as efficient as a pointer, but with a few nice-to-have abstractions.
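Something like this (untested sketch) is all you need; the clone() fallback only kicks in for the rare non-continuous cases, e.g. a Mat obtained as a ROI of a bigger one:

#include <opencv2/core.hpp>

// Make sure the descriptor matrix is continuous before handing its data
// pointer to vl_vlad_encode.
const float* continuousData(cv::Mat& descriptors)
{
    if (!descriptors.isContinuous())
        descriptors = descriptors.clone(); // clone() always yields continuous data
    return descriptors.ptr<float>(0);      // same bytes that descriptors.data points at
}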

If you need to efficiently load/save from/to file, you can save the Mat in binary format using matread and matwrite.
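The idea behind those two functions boils down to dumping rows, cols, type and then the raw bytes; a stripped-down sketch (the version in the link is more complete):

#include <fstream>
#include <string>
#include <opencv2/core.hpp>

// Write rows, cols, type, then the raw data of a continuous Mat.
void matwrite(const std::string& filename, const cv::Mat& mat)
{
    std::ofstream fs(filename, std::fstream::binary);
    int rows = mat.rows, cols = mat.cols, type = mat.type();
    fs.write(reinterpret_cast<const char*>(&rows), sizeof(rows));
    fs.write(reinterpret_cast<const char*>(&cols), sizeof(cols));
    fs.write(reinterpret_cast<const char*>(&type), sizeof(type));
    fs.write(reinterpret_cast<const char*>(mat.data), mat.total() * mat.elemSize());
}

// Read the header back, then the raw data into a freshly allocated Mat.
cv::Mat matread(const std::string& filename)
{
    std::ifstream fs(filename, std::fstream::binary);
    int rows, cols, type;
    fs.read(reinterpret_cast<char*>(&rows), sizeof(rows));
    fs.read(reinterpret_cast<char*>(&cols), sizeof(cols));
    fs.read(reinterpret_cast<char*>(&type), sizeof(type));
    cv::Mat mat(rows, cols, type);
    fs.read(reinterpret_cast<char*>(mat.data), mat.total() * mat.elemSize());
    return mat;
}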

Miki
  • 40,887
  • 13
  • 123
  • 202
2

std::vector<std::vector<float>> v is not going to perform very well without some effort, since the memory will not be contiguous.

Once your memory is contiguous, be it float[], float[][] or std::array/std::vector, how well it performs depends on how you iterate over your matrix. If access is random, it makes little difference; if your inner loop walks down the columns, it's better to have the data grouped by column rather than by row.
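As a rough sketch of what that means with a flat buffer (row-major shown here; swap the index arithmetic for column-major):

#include <cstddef>
#include <vector>

// Row-major: element (r, c) sits at r * cols + c, so walking along a row
// touches adjacent memory, while walking down a column jumps by 'cols'
// floats each step. Pick the layout that matches your inner loop.
float& at(std::vector<float>& buf, std::size_t cols, std::size_t r, std::size_t c)
{
    return buf[r * cols + c];
}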

UKMonkey
  • 6,941
  • 3
  • 21
  • 30