Saving multidimensional arrays in C

Question

I have saved multidimensional arrays in Matlab before (e.g. an array A that has size 100x100x100) using a .mat file and that worked out very nicely.

What is the best way to save such multidimensional arrays in C? The only way I can think of is to store it as a 2D array (e.g. convert a KxNxM array to a KNxM array) and be careful in remembering how it was saved.

What is also desired is to save it in a way that can be opened later in Matlab for post-processing/plotting.

First I believe you need to ponder why you need to save it as a multi-array to begin with. Is this a raw memory image? Is it binary data? A bunch of integers? Floats? — Lundin, Nov 05 '12 at 15:43
For example, it would be nice if I could load the data in another program, written in either C or Matlab. I'd like to make the process of opening the data in another program as easy as possible. – db1234, Nov 05 '12 at 15:55

John Bode · Accepted Answer · 2012-11-05T19:24:51.210

C does 3D arrays just fine:

double data[D0][D1][D2];
...
data[i][j][k] = ...;

although for very large arrays such as in your example, you would want to allocate the arrays dynamically instead of declaring them as auto variables such as above, since the space for auto variables (usually the stack, but not always) may be very limited.

Assuming all your dimensions are known at compile time, you could do something like:

#include <stdlib.h>
...
#define D0 100
#define D1 100
#define D2 100
...
double (*data)[D1][D2] = malloc(sizeof *data * D0);
if (data)
{
  ...
  data[i][j][k] = ...;
  ...
  free(data);
}

This will allocate a D0xD1xD2 array from the heap, and you can access it like any regular 3D array.

If your dimensions are not known until run time, but you're working with a C99 compiler or a C2011 compiler that supports variable-length arrays, you can do something like this:

#include <stdlib.h>
...
size_t d0, d1, d2;
d0 = ...;
d1 = ...;
d2 = ...;
...
double (*data)[d1][d2] = malloc(sizeof *data * d0);
if (data)
{
  // same as above
}

If your dimensions are not known until runtime and you're working with a compiler that does not support variable-length arrays (C89 or earlier, or a C2011 compiler without VLA support), you'll need to take a different approach.

If the memory needs to be allocated contiguously, then you'll need to do something like the following:

size_t d0, d1, d2;
d0 = ...;
d1 = ...;
d2 = ...;
...
double *data = malloc(sizeof *data * d0 * d1 * d2);
if (data)
{
  ...
  data[i * d1 * d2 + j * d2 + k] = ...;
  ...
  free(data);
}

Note that you have to map your i, j, and k indices to a single index value.
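One way to keep that mapping in a single place is a small helper. This is only a sketch: `idx3` is a hypothetical name, not something from the standard library, and it assumes the usual row-major layout in which the last index varies fastest.

```c
#include <stddef.h>

/* Hypothetical helper (not a library function): row-major offset of
 * element (i, j, k) in a d0 x d1 x d2 array stored as one flat block.
 * d0 is not needed to compute the offset, only to bound i. */
size_t idx3(size_t i, size_t j, size_t k, size_t d1, size_t d2)
{
    return (i * d1 + j) * d2 + k;   /* same as i*d1*d2 + j*d2 + k */
}
```

With this, an access reads `data[idx3(i, j, k, d1, d2)] = ...;`, and the layout rule lives in one function instead of being repeated at every use.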

If the memory doesn't need to be contiguous, you can do a piecemeal allocation like so:

double ***data;
...
data = malloc(d0 * sizeof *data);
if (data)
{
  size_t i;
  for (i = 0; i < d0; i++)
  {
    data[i] = malloc(d1 * sizeof *data[i]);
    if (data[i])
    {
      size_t j;
      for (j = 0; j < d1; j++)
      {
        data[i][j] = malloc(d2 * sizeof *data[i][j]);
        if (data[i][j])
        {
          size_t k;
          for (k = 0; k < d2; k++)
          {
            data[i][j][k] = initial_value();
          }
        }
      }
    }
  }
}

and deallocate it as

for (i = 0; i < d0; i++)
{
  for (j = 0; j < d1; j++)
  {
    free(data[i][j]);
  }
  free(data[i]);
}
free(data);

This is not recommended practice, btw; even though it allows you to index data as though it were a 3D array, the tradeoff is more complicated code, especially if malloc fails midway through the allocation loop (then you have to back out all the allocations you've made so far). It may also incur a performance penalty since the memory isn't guaranteed to be well-localized.
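To give an idea of what "backing out" involves, here is a minimal sketch. The names `alloc3d` and `free3d` are hypothetical, and it assumes that memory zeroed by `calloc` reads back as null pointers and as 0.0, which holds on mainstream platforms but is not strictly guaranteed by the C standard.

```c
#include <stdlib.h>

/* Hypothetical helper: free a fully or partially built pointer table.
 * Relies on unfilled slots being NULL (see alloc3d below). */
void free3d(double ***data, size_t d0, size_t d1)
{
    if (!data)
        return;
    for (size_t i = 0; i < d0; i++) {
        if (data[i]) {
            for (size_t j = 0; j < d1; j++)
                free(data[i][j]);       /* free(NULL) is a no-op */
            free(data[i]);
        }
    }
    free(data);
}

/* Hypothetical helper: piecemeal d0 x d1 x d2 allocation that releases
 * everything allocated so far if any step fails. */
double ***alloc3d(size_t d0, size_t d1, size_t d2)
{
    /* calloc zeroes the tables, so slots not yet filled are NULL
     * (assumption: all-bits-zero is a null pointer on this platform) */
    double ***data = calloc(d0, sizeof *data);
    if (!data)
        return NULL;
    for (size_t i = 0; i < d0; i++) {
        data[i] = calloc(d1, sizeof *data[i]);
        if (!data[i]) {
            free3d(data, d0, d1);
            return NULL;
        }
        for (size_t j = 0; j < d1; j++) {
            data[i][j] = calloc(d2, sizeof *data[i][j]);
            if (!data[i][j]) {
                free3d(data, d0, d1);
                return NULL;
            }
        }
    }
    return data;
}
```

Even in this compact form there are three failure points to handle, compared with one for the single contiguous `malloc`.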

EDIT

As for saving this data in a file, it kind of depends on what you need to do.

The most portable is to save the data as formatted text, such as:

#include <stdio.h>
FILE *dat = fopen("myfile.dat", "w"); // opens new file for writing
if (dat)
{
  for (i = 0; i < D0; i++)
  {
    for (j = 0; j < D1; j++)
    {
      for (k = 0; k < D2; k++)
      {
        fprintf(dat, "%f ", data[i][j][k]);
      }
      fprintf(dat, "\n");
    }
    fprintf(dat, "\n");
  }
  fclose(dat);
}

This writes the data out as a sequence of floating-point numbers, with a newline at the end of each row, and two newlines at the end of each "page". Reading the data back in is essentially the reverse:

FILE *dat = fopen("myfile.dat", "r"); // opens file for reading
if (dat)
{
  for (i = 0; i < D0; i++)
    for (j = 0; j < D1; j++)
      for (k = 0; k < D2; k++)
        fscanf(dat, "%lf", &data[i][j][k]); // %lf, not %f, when reading into a double
  fclose(dat);
}

Note that both of these snippets assume that the array has a known, fixed size that does not change from run to run. If that is not the case, you will obviously have to store additional data in the file to determine how big the array needs to be. There's also nothing resembling error handling.
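As a rough sketch of what storing the sizes could look like, the variant below writes the three dimensions on the first line and then one value per line. The function names `save_text` and `load_text` and the file layout are assumptions made for illustration, not an established format. It works on a flat (contiguous) array and prints with `%.17g`, which is enough digits for an IEEE 754 double to survive the round trip.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: header line "d0 d1 d2", then one value per line.
 * Returns 0 on success, -1 on failure. */
int save_text(const char *path, const double *data,
              size_t d0, size_t d1, size_t d2)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fprintf(f, "%zu %zu %zu\n", d0, d1, d2) > 0;
    size_t n = d0 * d1 * d2;
    for (size_t i = 0; ok && i < n; i++)
        ok = fprintf(f, "%.17g\n", data[i]) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: reads the header, allocates a flat array of the
 * right size and fills it. Returns NULL on failure; caller frees. */
double *load_text(const char *path, size_t *d0, size_t *d1, size_t *d2)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return NULL;
    double *data = NULL;
    if (fscanf(f, "%zu %zu %zu", d0, d1, d2) == 3) {
        size_t n = *d0 * *d1 * *d2;
        data = malloc(n * sizeof *data);
        for (size_t i = 0; data && i < n; i++) {
            if (fscanf(f, "%lf", &data[i]) != 1) {
                free(data);     /* short or malformed file */
                data = NULL;    /* also ends the loop */
            }
        }
    }
    fclose(f);
    return data;
}
```

A true 3D array can be passed as `&data[0][0][0]`, since its elements are contiguous. The sketch does not guard against overflow in `d0 * d1 * d2`.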

I'm leaving a lot of stuff out, since I'm not sure what your goal is.

Good to know all of this. My main interest* was to save the data to a file in a way that makes it easiest to use in another program. This is all useful for knowing how to program better when the array sizes are large. *I apologize that I may have confused people by originally using "store" when I should have been more precise and said "save". – db1234, Nov 05 '12 at 17:06

score 2 · Answer 2 · answered Nov 05 '12 at 15:42

Well, you can of course store it as a 3D array in C, too. Not sure why you feel you must convert to 2D:

double data[100][100][100];

This will of course require quite a bit of memory (around 7.6 MB assuming a 64-bit double), but that should be fine on a PC, for instance.

You might want to avoid putting such a variable on the stack, though.

answered Nov 05 '12 at 15:42 by unwind

As you correctly point out, this should not be allocated on the stack, so this answer isn't really productive. On most platforms you would have to allocate such huge amounts of data on the heap, [see this](http://stackoverflow.com/questions/12462615/how-do-i-correctly-set-up-access-and-free-a-multidimensional-array-in-c). – Lundin Nov 05 '12 at 15:46
Memory is not a big issue, I have about 64 GB of RAM. The issue is that I want to save the array to a file with the intent of using it in another C or Matlab program – db1234 Nov 05 '12 at 15:57
@dblazevski: stack memory is still an issue regardless of how much RAM you have. – user7116 Nov 05 '12 at 16:23
OK, I was only vaguely familiar with stack memory (I do numerical simulations, not exactly a computer scientist per se) and thought a lot of RAM takes care of stack issues. So, then suppose I want to save the values of a scalar function defined on a 100x100x100 grid in 3D, what is a good way to do that? – db1234 Nov 05 '12 at 16:29
@dblazevski: RAM != stack != heap != virtual memory != the list goes on. Each thread is given a finite amount of stack space unless you request more when you compile/link your program. The more you request, the more memory each thread takes up (regardless of if it is used). In FORTRAN this is best seen with huge automatic arrays. Heaps typically can expand to fill all available virtual memory, and thus for larger data types it is better to keep them on the heap (even if only locally scoped). – user7116 Nov 05 '12 at 16:32
Again, everyone chooses whether to allocate on the stack or the heap depending on whether the matrix needs to be resized. Consider that the stack is statistically faster than the heap, and that every application has its needs. You can allocate it only on the heap, only on the stack, or partially on the stack and partially on the heap; there isn't a universal correct way. – Ramy Al Zuhouri Nov 05 '12 at 17:56

Ben Voigt · Answer 3 · score 2 · 2012-11-05T20:06:19.527

C handles multidimensional arrays (double array[K][M][N];) just fine, and they are stored contiguously in memory the same as a 1-D array. In fact, it's legal to write double* onedim = &array[0][0][0]; and then use the same exact memory area as both a 3-D and 1-D array.

To get it from C into Matlab, you can just use fwrite(array, sizeof array[0][0][0], K*M*N, fptr) in C and array = fread(fileID, inf, 'real*8') in Matlab. You may find that the reshape function is helpful.
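A minimal sketch of the C side, with a matching reader for use from another C program. The helper names `save_raw` and `load_raw` are hypothetical; the file is a bare dump with no header, so the reader must already know the element count, and both programs are assumed to run on machines with the same byte order and an 8-byte `double`.

```c
#include <stdio.h>

/* Hypothetical helper: dump `count` doubles as raw bytes, no header.
 * Returns 0 on success, -1 on failure. */
int save_raw(const char *path, const double *data, size_t count)
{
    FILE *f = fopen(path, "wb");    /* "b" matters on Windows */
    if (!f)
        return -1;
    size_t written = fwrite(data, sizeof *data, count, f);
    int close_failed = (fclose(f) != 0);
    return (written == count && !close_failed) ? 0 : -1;
}

/* Hypothetical helper: read exactly `count` doubles back into `data`. */
int load_raw(const char *path, double *data, size_t count)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t got = fread(data, sizeof *data, count, f);
    fclose(f);
    return (got == count) ? 0 : -1;
}
```

For `double array[K][M][N]`, the call would be `save_raw("array.bin", &array[0][0][0], (size_t)K * M * N)`, where the file name is just an example. One thing to keep in mind on the Matlab side: C stores the last index fastest while Matlab stores the first index fastest, so the vector returned by fread would need something like reshape(v, [N M K]) followed by permute(..., [3 2 1]) to get back K-by-M-by-N indexing. It is worth checking this on a small array first.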

edited Nov 05 '12 at 20:06

answered Nov 05 '12 at 15:42

Right, I have thought about that, and one has to be careful about the indices and all; I was wondering whether there was an easier, less error-prone way out... especially if, say, I want to load the data into another C program, in which case writing my own version of `reshape` may also help. – db1234 Nov 05 '12 at 16:00
@dblazevski Indeed you don't need your own version of `reshape`. In C (multi dim) arrays data are stored sequentially, thus, what you practically need to read/write is to offset a pointer (provided you don't want any transposition, of course). – Acorbe Nov 05 '12 at 16:17
@Ben Voigt what you suggested in the edited post seems potentially promising, but I still am confused at parts. For example, when you write `double* onedim = &array[0][0][0]`, I'm not sure what that exactly defines, especially the `[0][0][0]` part since I'm not sure what size array you're defining, if a size is even defined at all. Nor am I sure how to modify the entries of `onedim` as you've defined it. – db1234 Nov 05 '12 at 18:51
@dblazevski: Added the declaration. You know that the `[]` subscript brackets work just as well with a pointer as with an array, right? (In fact, the array is converted to a pointer when you use it with `[]`) – Ben Voigt Nov 05 '12 at 19:13
Oh, I see, you were assuming that `array` is the name of a variable that has been defined before.. Didn't catch that. Cool, I will try that tomorrow (it's almost 9pm where I live...). It seems all I have to do is write the contents of `onedim` into a file and open it in Matlab and as you say hopefully after using reshape I get what I want. This seems promising and perhaps less prone to errors than my original plan. – db1234 Nov 05 '12 at 19:39

score 2 · Answer 4 · answered Nov 05 '12 at 15:52

C can handle 3-dimensional arrays, so why not use that?

Writing it on a .mat file is a little bit of work, but it doesn't seem too difficult.

The .mat format is described here.

answered Nov 05 '12 at 15:52 by Klas Lindbäck

score -1 · Answer 5 · answered Nov 05 '12 at 15:45

Triple pointer:

double*** X;
X= (double***)malloc(k*sizeof(double**));
for(int i=0; i<k;i++)
{
   X[i]=(double**)malloc(n*sizeof(double*));
   for(int j=0; j<n;j++)
   {
       X[i][j]=(double*)malloc(m*sizeof(double));
   }
}

This way, accessing each value is quite intuitive: X[i][j][l] (using l for the third index, since k is already the first dimension).
If you prefer, you can use a single array instead:

double* X;
X=(double*)malloc(n*m*k*sizeof(double));

And you access each element this way (with i < k, j < n, l < m):

X[i*n*m + j*m + l] = 0.0;

If you use a triple pointer, don't forget to free the memory.

answered Nov 05 '12 at 15:45 by Ramy Al Zuhouri

For the love of obfuscation, no! Don't do this. Learn about array pointers and the [correct ways](http://stackoverflow.com/questions/12462615/how-do-i-correctly-set-up-access-and-free-a-multidimensional-array-in-c) to allocate multi-dimensional arrays dynamically. – Lundin Nov 05 '12 at 15:48
@Lundin I think you're badly wrong: there isn't a reason because this method is "correct". The only thing is that the blocks of dimension 2 and 3 are static, and statistically you access the stack faster, but the user may want to reshape the matrix like in MATLAB, and also, the method that I mentioned is correct as well. – Ramy Al Zuhouri Nov 05 '12 at 17:41
You aren't even allocating a matrix! You are allocating an array of pointers to pointers to pointers. The memory isn't allocated adjacently, it will be all over the heap. You _cannot_ use a would-be matrix like this with functions like memcpy, bsearch, qsort and other such functions suitable for matrix handling. Apart from the link I already gave you, the [C FAQ](http://c-faq.com/aryptr/pass2dary.html) is also excellent reading. Now what you _should_ have done was to declare an array of array pointers, pointing to arrays of array pointers, pointing at arrays. With typedefs, for readability. – Lundin Nov 06 '12 at 20:22
Why not double X[k][n][m] then? Because you cannot resize the matrix. So the dynamic solution makes it resizable, like in MATLAB. Great way however, but not totally dynamic. – Ramy Al Zuhouri Nov 06 '12 at 23:26
