
I have a class holding an Eigen::Array of data and a method that adds new data (number of rows may vary) by appending to the array along the first axis. I solved the accumulation by creating a new Array of the right size and initializing it with the old and the new data.

#include <Eigen/Dense>

typedef Eigen::Array<double, Eigen::Dynamic, 3> DataArray;

class Accumulator {
public:
    void add(DataArray &new_data) {
        DataArray accu(accumulated_data_.rows() + new_data.rows(), 3);
        accu << accumulated_data_, new_data;
        accumulated_data_ = accu;
    }

    DataArray accumulated_data_;
};

Is there anything wrong with doing it like this? Or is it preferred to resize the accumulated data array instead (both alternatives are sketched below):

  • .resize() and copy in both the old and the new data
  • or .conservativeResize() and copy in only the new data (this requires block operations if the new data is longer than one row)
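
Roughly, the two alternatives would look like this as member functions of the class above (untested sketch):

// Alternative 1: resize() discards the old contents, so both the old and the new data are copied in.
void add_resize(DataArray &new_data) {
    DataArray old = accumulated_data_;
    accumulated_data_.resize(old.rows() + new_data.rows(), 3);
    accumulated_data_ << old, new_data;
}

// Alternative 2: conservativeResize() keeps the old contents; only the new rows are copied,
// via a block operation since new_data may have more than one row.
void add_conservative(DataArray &new_data) {
    Eigen::Index old_rows = accumulated_data_.rows();
    accumulated_data_.conservativeResize(old_rows + new_data.rows(), 3);
    accumulated_data_.bottomRows(new_data.rows()) = new_data;
}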
Johannes
  • The ideal solution really depends on many factors, e.g., how big will your accumulator typically grow? Do you know (an estimate of) the size beforehand? How often/how big are the `new_data` blocks? – chtz Apr 17 '18 at 22:23
  • Typical sizes will be: 7-10 columns (known at compile time, 3 was just an example), new_data: 5-100 rows (unknown), new_data blocks will arrive 1-100 times/second, accumulator will grow up to 10k-50k rows. This isn't much, but it should be fast, even on mobile devices. – Johannes Apr 18 '18 at 08:44

1 Answer


First of all, two easy-to-fix flaws with your current implementation:

  • Eigen stores Arrays (and Matrices) in column-major order by default, so if you are appending rows, you should prefer the RowMajor storage order:

    Eigen::Array<double, Eigen::Dynamic, 3, Eigen::RowMajor>

  • Since accu will not be used anymore, you should move it to the accumulator: accumulated_data_ = std::move(accu);

    If you are pre-C++11, you can also swap the data:

    accumulated_data_.swap(accu);
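
Putting those two fixes together, the add() from your question might look roughly like this (just a sketch, keeping your original concatenate-into-a-temporary logic):

#include <Eigen/Dense>
#include <utility> // std::move

typedef Eigen::Array<double, Eigen::Dynamic, 3, Eigen::RowMajor> DataArray;

class Accumulator {
public:
    void add(DataArray &new_data) {
        // Temporary with room for the old and the new rows, filled via the comma initializer.
        DataArray accu(accumulated_data_.rows() + new_data.rows(), 3);
        accu << accumulated_data_, new_data;
        accumulated_data_ = std::move(accu); // avoid the extra copy
    }

    DataArray accumulated_data_;
};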

Then your approach is nearly equivalent to

accumulated_data_.conservativeResize(accumulated_data_.rows() + new_data.rows(), 3);
accumulated_data_.bottomRows(new_data.rows()) = new_data;

You will still have memory (re-)allocations and memory-copies at every call.

A more efficient approach would be to resize the accumulated_data_ only occasionally (ideally just once at the beginning), and keep track of how much of it is currently actually valid:

typedef Eigen::Array<double, Eigen::Dynamic, 3, Eigen::RowMajor> DataArray;

class Accumulator {
public:
    Accumulator(Eigen::Index initialCapacity=10000) : accumulated_data_(initialCapacity, 3), actual_rows_(0) {}
    void add(DataArray &new_data) {
        if(actual_rows_+new_data.rows() > accumulated_data_.rows())
        { // TODO adapt memory-growing to your use case
             accumulated_data_.conservativeResize(2*actual_rows_+new_data.rows(), 3);
        }
        accumulated_data_.middleRows(actual_rows_, new_data.rows()) = new_data;
        actual_rows_+=new_data.rows();
    }

    DataArray accumulated_data_;
    Eigen::Index actual_rows_;
};
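
As a small usage sketch (not part of the answer's class, and assuming the typedef and class above plus #include <Eigen/Dense>), remember that only the first actual_rows_ rows of accumulated_data_ are valid when reading the data back:

#include <iostream>

int main() {
    Accumulator acc;
    DataArray chunk = DataArray::Random(50, 3); // 50 rows of example data
    acc.add(chunk);
    // Read back only the valid portion:
    std::cout << acc.accumulated_data_.topRows(acc.actual_rows_).rows() << " valid rows\n";
    return 0;
}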
chtz
  • Thanks a lot! What about storing incoming Arrays in an std::vector and accumulating them into a new Eigen::Array only when the data is required? As far as I understand this would be similar to your implementation. I should probably mention that new data is added far more often than the accumulated data is read (~10x-100x as often). – Johannes Apr 18 '18 at 12:28
  • Yes, you can also use a `std::vector` for storage. See this related question: https://stackoverflow.com/questions/49813340/stdvectoreigenvector3d-to-eigenmatrixxd-eigen – chtz Apr 18 '18 at 14:36
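
A rough sketch of the std::vector idea from the comments (hypothetical names, not taken from the answer or the linked question): keep the incoming chunks as-is and only concatenate them when the accumulated data is actually read, which keeps add() cheap when writes greatly outnumber reads:

#include <Eigen/Dense>
#include <vector>

typedef Eigen::Array<double, Eigen::Dynamic, 3, Eigen::RowMajor> DataArray;

// Hypothetical lazy accumulator: add() only stores the chunk, concatenation is deferred to get().
class LazyAccumulator {
public:
    void add(const DataArray &new_data) {
        total_rows_ += new_data.rows();
        chunks_.push_back(new_data); // one copy of the chunk, old data is never touched
    }

    DataArray get() const {
        DataArray result(total_rows_, 3);
        Eigen::Index row = 0;
        for (const DataArray &chunk : chunks_) {
            result.middleRows(row, chunk.rows()) = chunk;
            row += chunk.rows();
        }
        return result;
    }

private:
    std::vector<DataArray> chunks_;
    Eigen::Index total_rows_ = 0;
};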