
I want to take in three arbitrary-length buffers of doubles. Below is a short example:

#include <stdio.h>
#include <stdlib.h>

struct Data
{
  double *foo[3];
};

int main(void)
{
  double bar1[] = {1.0, 2.0, 3.0};
  double bar2[] = {1.0, 2.0, 3.0, 4.0};
  double bar3[] = {1.0, 2.0, 3.0, 4.0, 5.0};

  struct Data *data = (struct Data*)malloc(sizeof(struct Data));

  data->foo[0] = bar1;
  data->foo[1] = bar2;
  data->foo[2] = bar3;

  printf("%lf %lf %lf\n", data->foo[0][0], data->foo[0][1], data->foo[0][2]);
  printf("%lf %lf %lf %lf\n", data->foo[1][0], data->foo[1][1], 
  data->foo[1][2], data->foo[1][3]);
  printf("%lf %lf %lf %lf %lf\n", data->foo[2][0], data->foo[2][1], 
  data->foo[2][2], data->foo[2][3], data->foo[2][4]);

  return 0;
}

My concern is that if I malloc Data in the manner above I run the risk of corrupt data. If I allocate memory on the heap for an array of pointers to double buffers (or essentially an arbitrarily sized two-dimensional array of doubles) without knowing the size, is the data protected in any way? I feel like it runs the possibility of overwritten data. Am I correct in this thinking? This compiles and prints, but I'm not sure I trust it in a much larger scale implementation.

trincot
Talaria
  • When using a pointer in C, there is always the possibility of corruption. Even NUL-terminated strings are just a convention: when you have a `char *`, you can go anywhere you want, forward or backwards in memory, until the OS tells you that you messed up, usually via a segfault. – Mad Physicist Nov 19 '15 at 16:44
  • You should not cast the result of `malloc()`. I don't have the link here but there are good reasons - it can hide warnings. – Tom Zych Nov 19 '15 at 16:44
  • Well, don't overwrite it! ;-). One idea would be to make the members of `Data` not plain arrays but instances of another struct whose first element is an int holding the length. If that resembles a C++ vector then, well.. – Peter - Reinstate Monica Nov 19 '15 at 16:45
  • @TomZych I'm so glad somebody mentioned this. SO wouldn't be SO without it. Btw, is unwind on vacation? Did he name you deputy? – Peter - Reinstate Monica Nov 19 '15 at 16:46
  • @TomZych: http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc, http://stackoverflow.com/questions/20094394/why-do-we-cast-return-value-of-malloc, http://stackoverflow.com/questions/1565496/specifically-whats-dangerous-about-casting-the-result-of-malloc, http://stackoverflow.com/questions/953112/should-i-explicitly-cast-mallocs-return-value, http://stackoverflow.com/questions/1634153/why-do-we-need-to-cast-what-malloc-returns. That being said, you do cast in C++: http://stackoverflow.com/questions/3477741/why-does-c-require-a-cast-for-malloc-but-c-doesnt – Mad Physicist Nov 19 '15 at 16:47
  • @MadPhysicist There is just as much potential for data corruption without any pointers; it would just manifest itself later and possibly more subtly. – Peter - Reinstate Monica Nov 19 '15 at 16:47
  • @PeterSchneider. Agreed. I just used strings as an example because they appear to be the most harmless. Other pointers usually don't have a termination convention. – Mad Physicist Nov 19 '15 at 16:48
  • Hell, there is another Peter Schneider? wtf? – Peter - Reinstate Monica Nov 19 '15 at 16:48
  • @PeterSchneider ... I know that feeling – Peter Schneider Nov 19 '15 at 16:50
  • @PeterSchneider: what, is this considered some kind of broken-record recording? Sorry, didn't know :) – Tom Zych Nov 19 '15 at 16:52
  • @TomZych excellent point on casting. – Talaria Nov 19 '15 at 16:52
  • @TomZych .. look at the profile pages: we _are_ two different persons. Casting allows compiling the code as C and as C++, btw... – Peter Schneider Nov 19 '15 at 16:55
  • @MadPhysicist I'm not sure I follow... even if you allocate the appropriate amount of memory? – Talaria Nov 19 '15 at 16:56
  • @PeterSchneider: hard to control who it goes to when you have the same name :) – Tom Zych Nov 19 '15 at 16:57
  • @Talaria in general, the form of your code is not that useful. You have allocated a `struct Data` from dynamic memory, but you set the elements of member array `foo[]` to point to variables on the stack. If you wanted to return a pointer to the `struct Data` initialized in this way from some function, it would be no good as the pointers in `data->foo` would now be pointing to invalid memory. In general, you'd need to initialize each element of `foo[]` to point to an allocated chunk of dynamic memory, and then initialize those chunks of memory. – Ian Abbott Nov 19 '15 at 17:02
  • Cast issues aside... you are allocating space for 3 pointers only. By allocating these 3 pointers you are not dealing with the actual "arbitrary length arrays" from a memory management standpoint. That data must have its own storage space. In your example this is managed for you: the arrays are placed on the stack when main() is entered. I think you are confused as to the purpose of allocating these pointers. [Here's](http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html) an easy read that might help your understanding... – pedwards Nov 19 '15 at 17:06
  • @user1311571 Did this notify you? I didn't know this was possible. It's awful. – Peter - Reinstate Monica Nov 19 '15 at 17:10
  • @Talaria When you use a pointer, you generally have no real knowledge of how much was allocated. For example, if you call `strlen()` on a buffer with no terminating NUL, you are liable (but not guaranteed) to get a seg-fault because the search will just keep going out of bounds until it finds a NUL. – Mad Physicist Nov 19 '15 at 17:54
  • @MadPhysicist I follow what you're saying now. Thanks for the help – Talaria Nov 19 '15 at 18:04
  • Thanks to all. I learned a lot here. Much more than what I intended to learn. – Talaria Nov 19 '15 at 18:05
  • Sign of a good question. – Mad Physicist Nov 19 '15 at 18:18

2 Answers


As long as you do not assign wrong values, there is no data corruption. You have to be aware of where the data lives and how long it is valid. For example:

/* !!!! broken code ahead !!!! */
#include <stdio.h>
#include <stdlib.h>

struct Data
{
  double *foo[3];
};

void initData(struct Data* data) {
  double bar1[] = {1.0, 2.0, 3.0};
  double bar2[] = {1.0, 2.0, 3.0, 4.0};
  double bar3[] = {1.0, 2.0, 3.0, 4.0, 5.0};
  data->foo[0] = bar1;
  data->foo[1] = bar2;
  data->foo[2] = bar3;
}

int main(void)
{
  struct Data *data = (struct Data*)malloc(sizeof(struct Data));
  initData(data);

  printf("%lf %lf %lf\n", data->foo[0][0], data->foo[0][1], data->foo[0][2]);
  printf("%lf %lf %lf %lf\n", data->foo[1][0], data->foo[1][1], 
  data->foo[1][2], data->foo[1][3]);
  printf("%lf %lf %lf %lf %lf\n", data->foo[2][0], data->foo[2][1], 
  data->foo[2][2], data->foo[2][3], data->foo[2][4]);

  return 0;
}

This would be a bad idea:

  • data is heap-allocated and "lives" until you call free
  • bar1..3 are stack-allocated and only live inside initData()
  • data->foo[0..2] point to bar1..3 and are only valid inside initData()
  • the printf calls might appear to work (haven't tested), but they read through dangling pointers, which is undefined behavior, so the code is broken

Getting this right is the hardest task in C. If you develop on Linux, you should take a look at valgrind to catch this type of bug (the one in my example is obvious, but it can get really hard).
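For comparison, here is a minimal sketch of one way to repair the broken example, assuming the goal is for the buffers to outlive initData(): each array is copied into its own heap allocation. The helpers copyBuffer() and freeData() and the int return value of initData() are hypothetical additions for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Data
{
  double *foo[3];
};

/* Hypothetical helper: copy n doubles into a fresh heap buffer.
   Returns NULL if the allocation fails. */
static double *copyBuffer(const double *src, size_t n)
{
  double *dst = malloc(n * sizeof *dst);
  if (dst != NULL)
    memcpy(dst, src, n * sizeof *dst);
  return dst;
}

/* Hypothetical helper: release the three buffers and clear the pointers.
   free(NULL) is a no-op, so partially initialized structs are fine. */
void freeData(struct Data *data)
{
  assert(data != NULL);
  for (size_t i = 0; i < 3; i++) {
    free(data->foo[i]);
    data->foo[i] = NULL;
  }
}

/* Returns 0 on success, -1 if any allocation failed. */
int initData(struct Data *data)
{
  const double bar1[] = {1.0, 2.0, 3.0};
  const double bar2[] = {1.0, 2.0, 3.0, 4.0};
  const double bar3[] = {1.0, 2.0, 3.0, 4.0, 5.0};

  assert(data != NULL);

  /* The heap copies stay valid after this function returns;
     bar1..3 themselves do not. */
  data->foo[0] = copyBuffer(bar1, sizeof bar1 / sizeof bar1[0]);
  data->foo[1] = copyBuffer(bar2, sizeof bar2 / sizeof bar2[0]);
  data->foo[2] = copyBuffer(bar3, sizeof bar3 / sizeof bar3[0]);

  if (data->foo[0] == NULL || data->foo[1] == NULL || data->foo[2] == NULL) {
    freeData(data);
    return -1;
  }
  return 0;
}
```

With this version the caller owns the buffers: call freeData(data) before free(data). Note that the struct still does not record how long each buffer is, so the caller has to know that separately.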

Peter Schneider
  • Interesting. Say data had a member `int my_variable`. Now, in `initData()`, `int x = 5` followed by `data->my_variable = x` would be fine, right? It's a local variable on the stack, but would it be protected because malloc would have allocated for it? – Talaria Nov 19 '15 at 17:37
  • @Talaria : It would be fine, yes. But I'm not sure I understand your explanation correctly.. `data->my_variable` belongs to the struct and is allocated with malloc. With `data->my_variable = x`, you copy the value 5 of `x` into `my_variable` and everything is okay. The value of `bar1` and `data->foo[0]` is the __address__ of `{1.0, 2.0, 3.0}`. This address is invalid as soon as `initData` returns. – Peter Schneider Nov 19 '15 at 17:48
  • That's what I thought. Thanks for all the help. – Talaria Nov 19 '15 at 17:51

Certainly the malloc() itself does not contribute to the risk of data corruption. Whatever risk there is would be at least as great if the struct in question were an automatic variable allocated on the stack.

What you seem really to be asking about is the data structure itself, and basically pointers in general. Yes, if you have a pointer then it is possible to attempt an invalid memory access via that pointer, outside the bounds of the object to which the pointer points. C provides no protection against such attempts; it addresses the issue by declaring that the behavior of a program that attempts such an action is undefined.

It is incumbent on the programmer to ensure that his program does not attempt such an action. For pointers to arrays, one generally approaches that problem either by keeping track separately of the length of the pointed-to array, or by marking the end of the array with a sentinel value that cannot appear as normal data.
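The first of those approaches, tracking the length separately, might be sketched as follows. The names struct Buffer, bufferGet() and bufferSum() are hypothetical and chosen only for illustration; the sketch assumes the caller sets length to the true number of elements behind values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: a pointer paired with its element count. */
struct Buffer
{
  double *values;
  size_t length;
};

/* Bounds-checked read: stores element i in *out and returns 1,
   or returns 0 without touching *out if i is out of range. */
int bufferGet(const struct Buffer *buf, size_t i, double *out)
{
  assert(buf != NULL && out != NULL);
  if (i >= buf->length)
    return 0;
  *out = buf->values[i];
  return 1;
}

/* Sums exactly buf->length elements, never reading past the end. */
double bufferSum(const struct Buffer *buf)
{
  double total = 0.0;
  assert(buf != NULL);
  for (size_t i = 0; i < buf->length; i++)
    total += buf->values[i];
  return total;
}
```

The struct in the question could then hold `struct Buffer foo[3]` instead of `double *foo[3]`. The check only helps if the stored length is correct; C still will not verify it for you.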

John Bollinger