how to allocate memory for arrays of structure of arrays

Question

I have the struct shown below. I would like to create an array of that structure and allocate memory for it (using malloc).

typedef struct {
    float *Dxx;
    float *Dxy;
    float *Dyy;
} Hessian;

My first instinct was to allocate memory for the whole array of structures, but then, I believe, the internal arrays (Dxx, Dxy, Dyy) won't be allocated. If I allocate the internal arrays one by one first, the array of structures that should hold them doesn't exist yet. Now I think I should allocate memory for the internal arrays and then for the structure array, but that seems wrong to me. How should I solve this issue?

I require a logic for using malloc in this situation instead of new / delete because I have to do this in cuda and memory allocation in cuda is done using cudaMalloc, which is somewhat similar to malloc.

You solve the issue by learning how to use `std::vector`, and letting the C++ library do the work for you. See your C++ book for more information. — Sam Varshavchik, Aug 29 '17 at 13:21
Don't. Make those members `std::vector`s and then have a `std::vector` of those structs. — NathanOliver, Aug 29 '17 at 13:21
First of all never use `malloc` in C++. Secondly, why not use [`std::vector`](http://en.cppreference.com/w/cpp/container/vector)? Or maybe even [`std::array`](http://en.cppreference.com/w/cpp/container/array) if the size is fixed and known at compile-time? — Some programmer dude, Aug 29 '17 at 13:21
Suppose you had only *one* `Hessian`. How would you construct its member arrays? Now what if you had *two* of them? You seem to be making at least one false assumption about memory management. — Beta, Aug 29 '17 at 13:27
Of course you can do this in a number of ways, but using an array of these structures in CUDA will be somewhat complicated and may not be the way to achieve best performance. Anyway [here](https://stackoverflow.com/questions/16024087/copy-an-object-to-device) is a worked example in CUDA of allocating and using an array of structures, where the structure has an embedded pointer. — Robert Crovella, Aug 29 '17 at 15:58

muXXmit2X · Answer 1 · 2017-08-29T13:49:18.547

In C++ you should not use malloc at all; use new and delete instead, and only if actually necessary. From the information you've provided it is not necessary, because in C++ you would generally prefer std::vector (or std::array) over C-style arrays. The typedef is not needed either.

So I'd suggest rewriting your struct to use vectors and then generate a vector of this struct, i.e.:

#include <vector>

struct Hessian {
  std::vector<float> Dxx;
  std::vector<float> Dxy;
  std::vector<float> Dyy;
};

std::vector<Hessian> hessianArray(2); // vector containing two instances of your struct
hessianArray[0].Dxx.push_back(1.0f); // example accessing the members

Using vectors you do not have to worry about allocation most of the time, since the class handles that for you. Every Hessian contained in hessianArray is automatically allocated for you, stored on the heap and destroyed when hessianArray goes out of scope.
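If the array lengths are known up front, the inner vectors can be sized when they are created rather than grown with push_back. Here is a minimal sketch of that idea. The helper name `make_hessians` is hypothetical (it is not from the question), and it assumes all three arrays share one length:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Hessian {
    std::vector<float> Dxx;
    std::vector<float> Dxy;
    std::vector<float> Dyy;
};

// Hypothetical helper (not from the question): builds `count` Hessians whose
// three arrays each hold `length` zero-initialized floats.
std::vector<Hessian> make_hessians(std::size_t count, std::size_t length) {
    Hessian proto{std::vector<float>(length, 0.0f),
                  std::vector<float>(length, 0.0f),
                  std::vector<float>(length, 0.0f)};
    // Each element of the outer vector gets its own copy of the three arrays.
    return std::vector<Hessian>(count, proto);
}

void example() {
    std::vector<Hessian> h = make_hessians(2, 4);
    assert(h.size() == 2);
    h[0].Dxx[3] = 1.0f;            // write through the pre-sized array
    float* raw = h[0].Dxx.data();  // contiguous storage, usable as a plain float*
    assert(raw[3] == 1.0f);
}
```

The `data()` pointer is the piece that matters when the values later need to be handed to an API that expects a plain `float*`.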

kocica · Answer 2 · 2017-08-29T14:03:52.760

This seems like a problem that could be solved with an STL container. Since you won't know the sizes of the arrays in advance, you can use std::vector.

It's less error-prone and easier to maintain and work with, and standard containers free their resources themselves (RAII). @muXXmit2X has already shown how to use them.

But if you have to (or want to) use dynamic allocation, you first have to allocate space for an array of X structures

Hessian *h = new Hessian[X];

Then allocate space for all arrays in all structures

for (int i = 0; i < X; i++)
{
    h[i].Dxx = new float[Y];
    // Same for Dxy & Dyy
}

Now you can access and modify them. Also, don't forget to free the resources

for (int i = 0; i < X; i++)
{
    delete[] h[i].Dxx;
    // Same for Dxy & Dyy
}
delete[] h;

You should never use malloc in C++.

Why?

new ensures that the constructor of your type is called, whereas malloc does not call any constructor. The new keyword is also more type-safe: it returns a correctly typed pointer, while malloc returns a void* that has to be cast.
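The constructor difference can be observed directly. The sketch below uses a hypothetical type `Tracked`, invented only for this illustration, that counts how often its constructor runs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical type used only for this illustration.
struct Tracked {
    static int constructed;   // counts constructor calls
    Tracked() { ++constructed; }
};
int Tracked::constructed = 0;

// Number of constructor calls triggered by malloc'ing room for n objects.
int constructed_by_malloc(std::size_t n) {
    int before = Tracked::constructed;
    void* raw = std::malloc(n * sizeof(Tracked));  // raw bytes, no constructor runs
    std::free(raw);
    return Tracked::constructed - before;
}

// Number of constructor calls triggered by new[] of n objects.
int constructed_by_new(std::size_t n) {
    int before = Tracked::constructed;
    Tracked* objs = new Tracked[n];                // runs the constructor n times
    delete[] objs;
    return Tracked::constructed - before;
}

void example() {
    assert(constructed_by_malloc(3) == 0);
    assert(constructed_by_new(3) == 3);
}
```

For the Hessian struct in the question, which holds only raw pointers, no constructor has any work to do, so the difference matters mostly once the members become class types such as std::vector.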

_It will ensure that your type will have their constructor called._ You mean `new`, right? The way you wrote it, it sounds like `malloc` would call the constructor, which it doesn't. — muXXmit2X, Aug 29 '17 at 13:48

Akira · Answer 3 · 2017-08-30T08:18:18.587

As other answers point out, the use of malloc (or even new) should be avoided in C++. Anyway, as you requested:

I require a logic for using malloc in this situation instead of new / delete because I have to do this in cuda...

In this case you have to allocate memory for the Hessian instances first, then iterate through them and allocate memory for each Dxx, Dxy and Dyy. I would create a function for this, as follows:

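#include <cstdlib>

// Allocates `length` Hessians, each with three arrays of `count` floats.
Hessian* create(size_t length, size_t count) {
    Hessian* obj = (Hessian*)malloc(length * sizeof(Hessian));
    if (obj == NULL) return NULL;

    for(size_t i = 0; i < length; ++i) {
        // Size each array for `count` elements rather than a single float.
        obj[i].Dxx = (float*)malloc(count * sizeof(float));
        obj[i].Dxy = (float*)malloc(count * sizeof(float));
        obj[i].Dyy = (float*)malloc(count * sizeof(float));
    }

    return obj;
}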

To deallocate the memory you allocated with create function above, you have to iterate through Hessian instances and deallocate each Dxx, Dxy and Dyy first, then deallocate the block which stores the Hessian instances:

void destroy(Hessian* obj, size_t length) {
    for(size_t i = 0; i < length; ++i) {
        free(obj[i].Dxx);
        free(obj[i].Dxy);
        free(obj[i].Dyy);
    }

    free(obj);
}

Note: using the presented method will pass the responsibility of preventing memory leaks to you.
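One way to hand that responsibility back to the language while still allocating with malloc-style calls is a std::unique_ptr with a custom deleter. This is a sketch, not part of the approach above. `HessianArrayDeleter`, `HessianArray` and `make_hessian_array` are hypothetical names, and it uses calloc so that partially built arrays can be cleaned up safely:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

typedef struct {
    float *Dxx;
    float *Dxy;
    float *Dyy;
} Hessian;

// Hypothetical deleter: frees the inner arrays, then the block of structs.
struct HessianArrayDeleter {
    std::size_t length;
    void operator()(Hessian* obj) const {
        for (std::size_t i = 0; i < length; ++i) {
            std::free(obj[i].Dxx);
            std::free(obj[i].Dxy);
            std::free(obj[i].Dyy);
        }
        std::free(obj);
    }
};

using HessianArray = std::unique_ptr<Hessian, HessianArrayDeleter>;

// Hypothetical factory: `length` structs, each with three arrays of `count`
// floats. Returns an empty pointer if any allocation fails.
HessianArray make_hessian_array(std::size_t length, std::size_t count) {
    // calloc zero-fills the block; on mainstream platforms that makes every
    // member a null pointer, which is safe to pass to free during cleanup.
    HessianArray arr(static_cast<Hessian*>(std::calloc(length, sizeof(Hessian))),
                     HessianArrayDeleter{length});
    if (!arr) return arr;

    for (std::size_t i = 0; i < length; ++i) {
        Hessian& h = arr.get()[i];
        h.Dxx = static_cast<float*>(std::calloc(count, sizeof(float)));
        h.Dxy = static_cast<float*>(std::calloc(count, sizeof(float)));
        h.Dyy = static_cast<float*>(std::calloc(count, sizeof(float)));
        if (!h.Dxx || !h.Dxy || !h.Dyy) {
            // `arr` is destroyed on return and frees whatever was allocated.
            return HessianArray(nullptr, HessianArrayDeleter{0});
        }
    }
    return arr;
}

void example() {
    HessianArray h = make_hessian_array(2, 8);
    assert(h);
    h.get()[1].Dyy[7] = 3.0f;
}   // the deleter runs here and releases all seven allocations
```

The same shape should carry over to other allocate/free pairs, since only the calls inside the factory and the deleter would change.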

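If you wish to use std::vector instead of manual allocation and deallocation (which is highly recommended), you can write a custom allocator for it that uses cudaMalloc and cudaFree, as follows. Keep in mind that cudaMalloc returns device memory, which host code cannot dereference, so a std::vector that constructs or reads its elements on the host needs host-accessible memory; cudaMallocManaged is one way to get that.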

#include <cstddef>
#include <limits>
#include <new>
#include <cuda_runtime.h>

template<typename T> struct cuda_allocator {
    using value_type = T;

    cuda_allocator() = default;
    template<typename U> cuda_allocator(const cuda_allocator<U>&) {
    }

    T* allocate(std::size_t count) {
        if(count <= max_size()) {
            void* raw_ptr = nullptr;

            if(cudaMalloc(&raw_ptr, count * sizeof(T)) == cudaSuccess)
                return static_cast<T*>(raw_ptr);
        }
        throw std::bad_alloc();
    }
    void deallocate(T* raw_ptr, std::size_t) {
        cudaFree(raw_ptr);
    }
    static std::size_t max_size() {
        return std::numeric_limits<std::size_t>::max() / sizeof(T);
    }
};

template<typename T, typename U>
inline bool operator==(const cuda_allocator<T>&, const cuda_allocator<U>&) {
    return true;
}
template<typename T, typename U>
inline bool operator!=(const cuda_allocator<T>& a, const cuda_allocator<U>& b) {
    return !(a == b);
}

Using a custom allocator is very simple: you just have to specify it as the second template parameter of std::vector:

struct Hessian {
    std::vector<float, cuda_allocator<float>> Dxx;
    std::vector<float, cuda_allocator<float>> Dxy;
    std::vector<float, cuda_allocator<float>> Dyy;
};

/* ... */

std::vector<Hessian, cuda_allocator<Hessian>> hessian;
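To try the allocator pattern without a GPU, the same shape can be backed by std::malloc and std::free. The `host_allocator` below is a hypothetical stand-in written for this illustration, not part of CUDA. It only shows how std::vector drives `allocate` and `deallocate`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

// Host-only stand-in with the same shape as cuda_allocator, but backed by
// std::malloc / std::free so it runs without a GPU.
template<typename T> struct host_allocator {
    using value_type = T;

    host_allocator() = default;
    template<typename U> host_allocator(const host_allocator<U>&) noexcept {}

    T* allocate(std::size_t count) {
        // Guard against overflow in count * sizeof(T) before allocating.
        if (count <= std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            if (void* raw = std::malloc(count * sizeof(T)))
                return static_cast<T*>(raw);
        }
        throw std::bad_alloc();
    }
    void deallocate(T* ptr, std::size_t) noexcept { std::free(ptr); }
};

// All instances are interchangeable, so they always compare equal.
template<typename T, typename U>
bool operator==(const host_allocator<T>&, const host_allocator<U>&) { return true; }
template<typename T, typename U>
bool operator!=(const host_allocator<T>&, const host_allocator<U>&) { return false; }

struct Hessian {
    std::vector<float, host_allocator<float>> Dxx;
    std::vector<float, host_allocator<float>> Dxy;
    std::vector<float, host_allocator<float>> Dyy;
};

void example() {
    std::vector<Hessian, host_allocator<Hessian>> hessian(2);
    hessian[0].Dxx.assign(4, 1.0f);   // storage comes from host_allocator<float>
    assert(hessian[0].Dxx.size() == 4);
    assert(hessian[1].Dxx.empty());
}
```

Because the vector itself constructs and reads elements through the returned pointer, whatever the allocator hands back has to be memory the host can access.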
