-3

I did not really pay attention to this question before, but it was quite suprising for me to understand that the most straightforward way to allocate memory for a 2D array as first creating array of pointers and then for each pointer assign a separate block of memory for actual array contents. Array can be then spread all over the memory and accessing to elements of several rows (like A[i][N-1] -> A[i+1][0]) can lead to performance losses.

Somewhere on stackoverflow I found a perfect C-like solution of allocating space in one malloc call. But using malloc and unsafe pointer casts is not a good C++ practice so I decided to investigate what can I google/code to recreate this C method and construct normal 2D arrays (A[m][n]) that will be stored in a single block of memory (rowise).

I want to know if there are any problems with such solutions. I here present a working code snippet that was compiled under Windows 10 in VS 2015 with /Wall giving no warnings regarding my code. Two functions allocate memory, C-like and C++-like. I also compiled it for x86 and x64 and sample logs can be found after code part.

Source:

#include<stdio.h>
#include<stdlib.h>


void allocate(int m, int n, int** & arr)
{
// Allocates memory that is sufficient for storing m pointers to rows plus m*n continuos array
arr = (int**)malloc(m * sizeof(int*) + m*n * sizeof(int));

// First block of memory (m*sizeof(int*)) should be assigned with pointers to rows from second block of memory
for (int i = 0; i < m; i++)
{
    // arr is int**, arr + 1 shift pointer for sizeof(int*) bytes
    // arr + m points to the beginning of second (array) memory block, but it is of int** type;
    // consider it as a pointer to a single block of (int*), thus cast
    *(arr + i) = (int*)(arr + m) + n*i;

    //(int*)(arr + m) is the beginning of array, + n*i shifts to i-th row, j - iterates along that row
    // finally, * allows to zero memory at that address
    for (int j = 0; j < n; j++)
        *((int*)(arr + m) + n*i + j) = 0;
}

}

void allocatecpp(int m, int n, int** & p)
{
// Uses reinterpret_cast<T> as it performs casts from int** to int*, which are not allowed with normal casts
// Allocates memory that is sufficient for storing m pointers to rows plus m*n continuos array; function ::operator new(size)
p = reinterpret_cast<int**> (::operator new(m*n * sizeof(int) + m*sizeof(int*)));

// Iterates through first block of memory, assigning addresses of respective rows
for (int i = 0; i < m; i++)
{
    // Logic is the same as in C-style, but with some C++ stuff
    *(p + i) = reinterpret_cast<int*>(p + m) + n*i;
    for (int j = 0; j < n; j++)
        // Zeros memory
        *(reinterpret_cast<int*>(p + m) + n*i + j) = 0;
}

}

int main()
{
// Define size of an array; array of m rows x n columns
int m = 5, n = 4;

// array pointer, not sure how to use smart pointer here

int** arr;

// Allocates memory
allocatecpp(m, n, arr);

// Normally iterates through a 2D array, assigning values that correspond to the element position; e.g. element [3][2] gets value 43
for (int i = 0; i < m; i++)
    for (int j = 0; j < n; j++)
        arr[i][j] = 10 * (i+1) + (j+1);

// Address of the beginning; value of arr treated as a pointer
printf("Address of the beginning: 0x%p\r\n", arr);

// Moves along row address space and asks for addresses of a specific row
for (int i = 0; i < m; i++)
    printf("Row %i at address 0x%p pointing to 0x%p\r\n", i, arr+i, *(arr + i));

printf("\r\n");

// A pointer to an actual block of memory that represents array as continuous memory piece; casted to int*
int* p = (int*)(arr + m);

// Iterates through the whole array and prints addresses of elements and values (which correspond to position in matrix)
for (int i = 0; i < m * n; i++)
    printf("0x%p\t%02i\r\n", p + i, *(p + i));

delete (arr);

return 0;
}

x86 output:

Address of the beginning: 0x00DADD18

Row 0 at address 0x00DADD18 pointing to 0x00DADD2C

Row 1 at address 0x00DADD1C pointing to 0x00DADD3C

Row 2 at address 0x00DADD20 pointing to 0x00DADD4C

Row 3 at address 0x00DADD24 pointing to 0x00DADD5C

Row 4 at address 0x00DADD28 pointing to 0x00DADD6C



0x00DADD2C  11

0x00DADD30  12

0x00DADD34  13

0x00DADD38  14

0x00DADD3C  21

0x00DADD40  22

0x00DADD44  23

0x00DADD48  24

0x00DADD4C  31

0x00DADD50  32

0x00DADD54  33

0x00DADD58  34

0x00DADD5C  41

0x00DADD60  42

0x00DADD64  43

0x00DADD68  44

0x00DADD6C  51

0x00DADD70  52

0x00DADD74  53

0x00DADD78  54

x64 output

Address of the beginning: 0x000001D9E0FB4DE0

Row 0 at address 0x000001D9E0FB4DE0 pointing to 0x000001D9E0FB4E08

Row 1 at address 0x000001D9E0FB4DE8 pointing to 0x000001D9E0FB4E18

Row 2 at address 0x000001D9E0FB4DF0 pointing to 0x000001D9E0FB4E28

Row 3 at address 0x000001D9E0FB4DF8 pointing to 0x000001D9E0FB4E38

Row 4 at address 0x000001D9E0FB4E00 pointing to 0x000001D9E0FB4E48



0x000001D9E0FB4E08  11

0x000001D9E0FB4E0C  12

0x000001D9E0FB4E10  13

0x000001D9E0FB4E14  14

0x000001D9E0FB4E18  21

0x000001D9E0FB4E1C  22

0x000001D9E0FB4E20  23

0x000001D9E0FB4E24  24

0x000001D9E0FB4E28  31

0x000001D9E0FB4E2C  32

0x000001D9E0FB4E30  33

0x000001D9E0FB4E34  34

0x000001D9E0FB4E38  41

0x000001D9E0FB4E3C  42

0x000001D9E0FB4E40  43

0x000001D9E0FB4E44  44

0x000001D9E0FB4E48  51

0x000001D9E0FB4E4C  52

0x000001D9E0FB4E50  53

0x000001D9E0FB4E54  54
Community
  • 1
  • 1
Ilia
  • 331
  • 5
  • 12
  • 3
    Is there a good reason to do all this stuff when you could implement a class that represents a 2D array? – juanchopanza Jul 01 '16 at 10:41
  • 1
    Also, if you want your code to be reviewed, then the best place is: codereview.stackexchange.com – abhishek_naik Jul 01 '16 at 10:55
  • @juanchopanza, Yes, you could implement a class which will store array in continuous block of memory and have some operator overloads to access elements, but here I try to wisely create the simplest array possible. It is not necessarily an appliable solution, more like a theoretical one - who one should do it if he wants to. – Ilia Jul 01 '16 at 10:56
  • "the most straightforward way to allocate memory for a 2D array as first creating array of pointers" No, that's wide-spread but incorrect and bad practice. Thus this whole program doesn't make much sense, you are creating slow code with increased complexity for nothing gained. As far as C is concerned, your question is a duplicate of [How do I correctly set up, access, and free a multidimensional array in C?](http://stackoverflow.com/questions/12462615/how-do-i-correctly-set-up-access-and-free-a-multidimensional-array-in-c). In C++ it is similar: `new type[x][y]`. – Lundin Jul 01 '16 at 10:59
  • @Ilia *But using malloc and unsafe pointer casts is not a good C++ practice* -- Which is why [templates](http://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048) are used. – PaulMcKenzie Jul 01 '16 at 11:03

1 Answers1

1

You've done a lot of work for nothing. You started with an incorrect statement:

the most straightforward way to allocate memory for a 2D array [is] first creating array of pointers and then for each pointer assign a separate block of memory for actual array contents.

This would be true in two situations:

  1. If the array was somehow different sizes for different rows (yours is not);
  2. If you had a very fragmented heap, so trying to allocate one (say) 1MB block was less likely to succeed than (say) 1,000 1kB blocks.

But you're malloc()ing one block of memory, so both 1. and 2. above don't apply.

There's a third reason that some people think that you need to do it:

  1. Passing a two-dimensional array requires that the number of columns is known at compile-time:

    void Legal(int a[][1000]);
    void Illegal(int a[][]);
    void PointerVersion(int *a[], int numCols);
    

    With the above:

    • Legal() works because the compiler can work out how to access a[1][0];
    • Illegal() doesn't work because the compiler can't figure out how to access a[1][0];
    • PointerVersion() is obvious - the number of columns is passed in at run-time.

The reason that you don't get warnings is that you're typecasting all over the place - especially with int **& (shudder!). There are definitely times when references-to-pointers-to-pointers are needed - this is not one of them!

Another way to look at your solution is with the following:

typedef int Array1D[1000];      // 1,000 ints in a 1-D array
typedef Array1D *Array2D[1000]; // 1,000 pointers to 1-D arrays - a 2-D array!(?)

The above shows that the "2-D array" is actually a 1-D array of pointers to 1-D arrays of ints. These need to be passed as int ** (a pointer to a pointer-to-int).

Don't. Just... don't. If you need a two-dimensional array of integers when you know the height and width, just bite the bullet and allocate the block of memory and assign it to a single int *:

int *array = (int *)malloc(m*n * sizeof(int));

array is now a pointer to an integer. But, it's also the pointer to the beginning of an array; or a 2-D array; or a 3-D array; or... And this is why #3 above is also unnecessary. Pass the int * around everywhere (and if necessary, as an int *&), and pass around the number of columns as well.

  • You don't lose the time taken to do the double dereference;
  • You don't lose any clarity in what you're doing.

Or, as others have mentioned, barrel the whole thing up in a class and hide everything.

John Burger
  • 3,662
  • 1
  • 13
  • 23