6

I am writing a MATLAB extension using the C++ uBLAS library, and I would like to initialize my uBLAS vectors from the C arrays passed by the MATLAB interpreter. How can I initialize a uBLAS vector from a C array without (for the sake of efficiency) explicitly copying the data? I am looking for something along the following lines:

using namespace boost::numeric::ublas;

int pv[10] = { 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 };
vector<int> v (pv);

In general, is it possible to initialize a C++ std::vector from an array? Something like this:

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    int pv[4] = { 4, 4, 4, 4};
    vector<int> v (pv, pv+4);

    pv[0] = 0;
    cout << "v[0]=" << v[0] << " " << "pv[0]=" << pv[0] << endl;

    return 0;
}

but where the initialization would not copy the data. In this case the output is

v[0]=4 pv[0]=0

but I want both values to be the same, so that updating the C array also changes the data seen through the C++ vector:

v[0]=0 pv[0]=0
D R

6 Answers

8

I'm not sure how your question relates to MATLAB/MEX, but as a side note, you might want to know that MATLAB implements a copy-on-write strategy.

This means that when you copy an array, for example, only some headers are actually copied, while the data itself is shared between the two arrays. Once one of them is modified, a copy of the data is actually made.

The following is a simulation of what might be happening under the hood (borrowed from this old post):

-----------------------------------------
>> a = [35.7 100.2 1.2e7];

 mxArray a
    pdata -----> 35.7 100.2 1.2e7
  crosslink=0

-----------------------------------------
>> b = a;

 mxArray a
    pdata -----> 35.7 100.2 1.2e7
  crosslink     / \
    |  / \       |
    |   |        |
    |   |        |
   \ /  |        |
   crosslink     |
 mxArray b       |
    pdata --------

-----------------------------------------
>> a(1) = 1;

mxArray a
    pdata -----> (1) 100.2 1.2e7
  crosslink=0


   crosslink=0
 mxArray b
    pdata ------> 35.7 100.2 1.2e7 ...

I know this doesn't really answer your question; I just thought you might find the concept helpful.
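For readers who think in C++, the copy-on-write behaviour described above can be imitated with a reference-counted buffer. This is only a minimal sketch: `CowArray` and its members are hypothetical names invented here, and it is not how MATLAB implements `mxArray` internally.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical CowArray: a toy copy-on-write array, not MATLAB's mxArray.
class CowArray {
public:
    explicit CowArray(std::vector<double> values)
        : data_(std::make_shared<std::vector<double> >(std::move(values))) {}

    // Copying a CowArray (compiler-generated) copies only the shared_ptr,
    // so both arrays share one buffer, like "b = a" in the diagram.

    // Reading never copies.
    double get(std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    // Writing first detaches this array from any other array sharing the
    // buffer: the array being written to receives the fresh copy.
    void set(std::size_t i, double value) {
        assert(i < data_->size());
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<double> >(*data_);
        }
        (*data_)[i] = value;
    }

    // True while both arrays still point at the same buffer.
    bool shares_with(const CowArray& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<double> > data_;
};
```

After `CowArray b = a;` both objects share one buffer; the first `a.set(...)` gives `a` a private copy and leaves `b` untouched.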

Amro
  • 11
    You can see this metadata in the MATLAB Command Window by setting the display format with `format debug` – Mikhail Poda Aug 07 '10 at 09:25
  • A minor point about your diagram - you make it look as though MATLAB creates a new copy of the data, reassigns `b` to point to it, and then mutates the data that `a` points to. What actually happens is that a new copy of the data is created and *`a`* is reassigned to point to it, and then the new data is mutated. – Chris Taylor Mar 03 '15 at 09:09
  • This has little to no relevance to the question. The question is about C++. If you use MATLAB classes, they might have some compile-time optimizations against redundant copies. Once you take a raw pointer from them, MATLAB cannot prevent other libraries from making useless copies. In fact, the very operation of requesting a raw pointer will trigger MATLAB to copy the input parameters. – Dimitry Aug 25 '18 at 11:15
  • @Dimitry downvote is fair enough as it doesn't answer the question, still I'll keep this 9 yo answer even if it's only slightly relevant... As to your last statement, I should correct you and say that at the MEX-API level, MATLAB does not create a copy when you request the raw data of a numeric array (i.e. `mxGetData` and the like). Of course that doesn't prevent you from making copies by wrapping the raw pointer inside a `std::vector`. – Amro Aug 26 '18 at 02:53
  • Well, to be honest, my experience comes from Octave, which Google tends to redirect to MATLAB answers anyway. By providing a raw memory pointer, MATLAB/Octave array containers give up any control they had over memory access, and in order to guarantee that using that pointer causes no side effects, they have to make sure that the memory is not shared with other objects. Octave does this on the pointer request unless you pepper everything with `const` modifiers. The MATLAB MEX compiler might be more elaborate and trace the pointer usage, or it might not. – Dimitry Aug 26 '18 at 13:35
  • @Dimitry Nope, the MEX-API simply hands you the pointer and it's up to you to not shoot yourself in the foot :) – Amro Aug 26 '18 at 16:28
6

Both std::vector and ublas::vector are containers. The whole point of containers is to manage the storage and lifetimes of their contained objects. This is why when you initialize them they must copy values into storage that they own.

C arrays are areas of memory fixed in size and location, so by their nature you can only get their values into a container by copying.

You can use C arrays as the input to many algorithm functions so perhaps you can do that to avoid the initial copy?
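As a minimal sketch of that idea, standard algorithms accept raw pointers as iterators, so a buffer can be processed where it lives without ever entering a container. The helper names `sum_in_place` and `scale_in_place` are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>

// Sum the elements of a raw buffer; no container is created, nothing is copied.
double sum_in_place(const double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    return std::accumulate(data, data + n, 0.0);
}

// Multiply every element of a raw buffer by factor, writing back into the
// same memory.
void scale_in_place(double* data, std::size_t n, double factor) {
    assert(data != nullptr || n == 0);
    std::transform(data, data + n, data,
                   [factor](double x) { return x * factor; });
}
```

Both helpers take a pointer and a length, which is the form in which a C API typically hands over its data.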

CB Bailey
  • 2
    Except that *in theory* you could create a subclass of ublas::vector that did this. Your subclass could behave as a const ublas::vector that could never be resized, or you'd have to override all of the methods involved in resizing the container to ensure that they don't free memory that doesn't belong to it. Only a complete masochist would attempt this. – Die in Sente Nov 16 '09 at 17:55
4

You can initialize a std::vector from a C array easily:

vector<int> v(pv, pv+10);
ebo
  • Thanks for your answer, but this would copy the data. I want `v` and `pv` to point to the same block of data. – D R Nov 14 '09 at 22:50
  • 1
    You can't have that. std::vector always owns its memory. You can write your own vector class though... – shoosh Nov 14 '09 at 22:56
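The "write your own vector class" suggestion in the comment above could look roughly like the following non-owning view. `ArrayView` is a hypothetical name used only for this sketch; it stores a pointer and a length, never allocates or frees, and so cannot be resized. C++20's `std::span` offers the same idea in the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ArrayView: refers to memory it does not own and never frees it.
// The caller must keep the underlying array alive while the view is in use.
template <typename T>
class ArrayView {
public:
    ArrayView(T* data, std::size_t size) : data_(data), size_(size) {}

    std::size_t size() const { return size_; }

    // Element access goes straight to the external array.
    T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Pointer-based iteration, so standard algorithms work on the view.
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }

private:
    T* data_;
    std::size_t size_;
};
```

With `int pv[4] = {4, 4, 4, 4}; ArrayView<int> v(pv, 4);`, a later `pv[0] = 0;` is visible through `v[0]`, which is the behaviour the question asks for.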
4

There are two undocumented classes in uBLAS storage.hpp. You can replace the default storage class (unbounded_array) in ublas::vector with one of these.

  • The first class, array_adaptor, makes a copy of your data when ublas::vector calls its copy constructor, so it is not a very useful class at all. I would rather simply use the appropriate constructor of the unbounded_array or bounded_array classes to do this.
  • The second, shallow_array_adaptor, only holds a reference to your data, so you can use the vector to modify your C array directly. Unfortunately, it has some bugs: when you assign an expression, it loses the original data pointer. But you can create a derived class that fixes this problem.

Here is the patch and an example:

// BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR must be defined before including vector.hpp
#define BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR

#include <boost/numeric/ublas/vector.hpp>
#include <algorithm>
#include <iostream>

// Derived class that fixes the base class bug. Same name, different namespace.
template<typename T>
class shallow_array_adaptor
: public boost::numeric::ublas::shallow_array_adaptor<T>
{
public:
   typedef boost::numeric::ublas::shallow_array_adaptor<T> base_type;
   typedef typename base_type::size_type                   size_type;
   typedef typename base_type::pointer                     pointer;

   shallow_array_adaptor(size_type n) : base_type(n) {}
   shallow_array_adaptor(size_type n, pointer data) : base_type(n,data) {}
   shallow_array_adaptor(const shallow_array_adaptor& c) : base_type(c) {}

   // This function must swap the values of the items, not the data pointers.
   void swap(shallow_array_adaptor& a) {
      if (base_type::begin() != a.begin())
         std::swap_ranges(base_type::begin(), base_type::end(), a.begin());
   }
};

void test() {
    using namespace boost::numeric;
    typedef ublas::vector<double,shallow_array_adaptor<double> > vector_adaptor;

    struct point {
        double x;
        double y;
        double z;
    };

    point p = { 1, 2, 3 };
    vector_adaptor v(shallow_array_adaptor<double>(3, &p.x));

    std::cout << p.x << ' ' << p.y << ' ' << p.z << std::endl;
    v += v*2.0;
    std::cout << p.x << ' ' << p.y << ' ' << p.z << std::endl;
}

Output:

1 2 3
3 6 9
Guillermo Ruiz
3

The usual suggestion to use the shallow array adaptor seems kind of sarcastic to me: to simply access an array through a pointer, you are supposed to put it into a shared_array with all the reference-counting machinery (which comes to nothing, since you don't own the array) and, what's more, a nightmare of data aliasing. Actually, uBLAS has a fully fledged storage implementation (array_adaptor) that allows vectors to use external C arrays. The only catch is the vector constructor, which makes a copy. Why this nice feature is not used in the library is quite beyond me, but anyway, we can use a little extension (it's actually 2 lines of code surrounded by the usual C++ bloat):

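// Assumes: using namespace boost::numeric::ublas;
template<class T>
class extarray_vector :
    public vector<T, array_adaptor<T> >
{
    typedef vector<T, array_adaptor<T> > vector_type;
public:
    // Names from a dependent base class are not found by unqualified lookup,
    // so declare the types here and reach data() through this->.
    typedef typename vector_type::size_type size_type;
    typedef T* pointer;

    BOOST_UBLAS_INLINE
    extarray_vector(size_type size, pointer p)
    { this->data().resize(size, p); }

    template <size_type N>
    BOOST_UBLAS_INLINE
    extarray_vector(T (&a)[N])
    { this->data().resize(N, a); }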

    template<class V>
    BOOST_UBLAS_INLINE
    extarray_vector& operator = (const vector<T, V>& v)
    {
        vector_type::operator = (v);
        return *this;
    }

    template<class VC>
    BOOST_UBLAS_INLINE
    extarray_vector& operator = (const vector_container<VC>& v)
    {
        vector_type::operator = (v);
        return *this;
    }

    template<class VE>
    BOOST_UBLAS_INLINE
    extarray_vector& operator = (const vector_expression<VE>& ae)
    {
        vector_type::operator = (ae);
        return *this;
    }
};

You can use it like this:

int i[] = {1, 4, 9, 16, 25, 36, 49};
extarray_vector<int> iv(i);
BOOST_ASSERT_MSG(i == &iv[0], "Vector should attach to external array\n");
iv[3] = 100;
BOOST_ASSERT(i[3] == 100);
iv.resize(iv.size() + 1, true);
BOOST_ASSERT_MSG(i != &iv[0], "And detach from the array on resize\n");
iv[3] = 200;
BOOST_ASSERT(i[3] == 100);
iv.data().resize(7, i, 0);
BOOST_ASSERT_MSG(i == &iv[0], "And attach back to the array\n");
BOOST_ASSERT(i[3] == 200);

You can dynamically attach and detach the vector to and from external storage via array_adaptor's resize method (keeping or discarding data). On resize, it detaches from the storage automatically and becomes a regular vector. Assignment from containers goes directly into the storage, but assignment from an expression is done via a temporary, and the vector is detached from the storage; use noalias() to prevent that. There's a small overhead in the constructor, since data_ is a private member and we have to default-initialize it with new T[0], then reassign it to the external array. You may change it to protected and assign to the storage directly in the constructor.

panda-34
2

Here are a couple of functions for syntactically convenient assignment (admittedly not initialization):

vector<int> v;
setVector(v, 3, 
          1, 2, 3);

matrix<int> m;
setMatrix(m, 3, 4,
            1,   2,   3,   4,
           11,  22,  33,  44,
          111, 222, 333, 444);

The functions:

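#include <cstdarg>  // va_list, va_start, va_arg, va_end

/**
 * Resize a ublas vector and set its elements.
 * T must be a type that survives default argument promotion (e.g. int or
 * double, not float or char), and every argument must have exactly type T.
 */
template <class T> void setVector(vector<T> &v, int n, ...)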
{
    va_list ap;
    va_start(ap, n);
    v.resize(n);
    for (int i = 0; i < n; i++) {
        v[i] = va_arg(ap, T);
    }
    va_end(ap);
}

/**
 * Resize a ublas matrix and set its elements (row by row)
 */
template <class T> void setMatrix(matrix<T> &m, int rows, int cols, ...)
{
    va_list ap;
    va_start(ap, cols);
    m.resize(rows, cols);
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            m(i, j) = va_arg(ap, T);
        }
    }
    va_end(ap);
}
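Because C-style variadic arguments are not type-checked, a call such as `setVector(v, 3, 1, 2, 3)` on a `vector<double>` reads `int` arguments as `double`, which is undefined behaviour. One possible type-checked alternative, sketched here under the assumption that the target container only needs `resize(n)` and `operator[]` (which both `std::vector` and `ublas::vector` provide), deduces the length from a C array. The name `set_from_array` is invented for this sketch, and `std::vector` stands in for the uBLAS type to keep the example free of Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: works with any container providing resize(n) and
// operator[]. The element count N is deduced from the array type, so the
// caller cannot pass a wrong length.
template <class Vec, class T, std::size_t N>
void set_from_array(Vec& v, const T (&values)[N]) {
    v.resize(N);
    for (std::size_t i = 0; i < N; ++i) {
        v[i] = values[i];  // ordinary, type-checked conversion per element
    }
}

// Small usage helper: fills a vector of double from an array of int.
inline std::vector<double> make_example() {
    const int values[] = {1, 2, 3};
    std::vector<double> v;
    set_from_array(v, values);
    assert(v.size() == 3);
    return v;
}
```

Like the functions above, this still copies the values; it only makes the assignment safer, and it does not address the no-copy requirement of the question.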
Nasorenga