6

I have an array named A that contains random floats, with its size anywhere in the range from [10000, 1000000]. This array is a randomly generated argument to my function, and in order to operate on it I am trying to preemptively append/pad it with 10000 zeros in an efficient manner. I am worried that appending A's allocated memory will corrupt the heap, so instead I allocate new memory, memcopy A, and memset 10000 trailing floats in the new Array to 0.0.

void *ArrayPadder(float *A){
    int pad = 10000;
    float *Apadded = (float*)malloc((sizeof(A)+pad)*sizeof(float));
    memcpy(Apadded, A, sizeof(A));
    memset(Apadded+sizeof(A), 0.0, pad);
    return Apadded;
}

Can anyone suggest a more efficient way of accomplishing this?

EDIT: Apologies for the delay but I added some clarification. The reason I can't just pre-allocate the correct memory space (510000 floats) is because the array is actually of random size, containing random floats. I chose 500000 in an attempt to simplify the question, which is now fixed.

4 Answers4

6

Not really a more efficient way, but a more accurate way:

memset(Apadded+sizeof(A), 0, pad * sizeof(float));

as if the size of float is 4, your code only initializes the first pad / sizeof(float) = 10000 / 4 = 2500 elements

Note that I used 0 and not 0.0 as the second parameter, as memset takes an int and sets its (low byte) value to all of the bytes.

MByD
  • 135,866
  • 28
  • 264
  • 277
3

I am trying to efficiently append/pad it with 10000 zeros

Appending and padding are not necessarily the same thing. I assume you want to grow an existing chunk of memory by 10,000 elements.

Why don't you just allocate the correct size up front and be done with it?

#define BASE_SIZE 500000
#define PAD_SIZE  10000

/* ... */

float *data = malloc((BASE_SIZE + PAD_SIZE) * sizeof(float));
if(!data) {
    /* do something */
}

memset(data + BASE_SIZE, 0, PAD_SIZE * sizeof(float));
/* last PAD_SIZE elements are now 0 */

Also note that memset takes the number of bytes to set, not elements, so this:

memset(Apadded+sizeof(A), 0.0, pad);

Is wrong. Should be pad * sizeof(float). pad is the number of elements, sizeof(float) gives the number of bytes per element.

Ed S.
  • 122,712
  • 22
  • 185
  • 265
  • @KingsIndian: Well, It's certainly far more efficient than reallocating the entire thing, but fair enough, initializing all of it is unnecessary. I changed the example do a `memset` after allocation. – Ed S. Nov 01 '12 at 21:02
  • For basically the same reason that `memset(.., .., pad);` is wrong, `memset(.., 0.0, ..);` is wrong. It will work for accidentally for `0.0` but would not work for other floating-point values. – Pascal Cuoq Nov 01 '12 at 21:25
  • 1
    @PascalCuoq : It will work for other floating-point values because the argument type is `int` – there will be an implicit conversion, and possibly a warning about data truncation, but it will work. That said, using `0.0` is certainly misleading IMO. – ildjarn Nov 01 '12 at 21:26
  • @ildjarn `memset(.., f, ..);` with `f` of type `double` will **type-check**. “Type-check” is a far cry from “work”, especially in C. It will not initialize the additional elements to the `float` value most closely approximating `f`, which would be the only possible meaning of “work” in this context. – Pascal Cuoq Nov 01 '12 at 21:30
  • 1
    @PascalCuoq : How is that different from what I said? The floating-point value will be converted to an `int`, then the low-byte of that `int` will be used to initialize the memory. This is neither implementation-defined nor undefined behavior. – ildjarn Nov 01 '12 at 21:33
  • @ildjarn You said “it will work for other floating-point values”, which could easily be confused with “it will do what is intended if used with other floating-point values”, whereas what the OP intended with 0.0 and would have intended had he written 1.0 was that the additional elements be initialized to 0.0 (resp. 1.0), which they won't be in the later case. – Pascal Cuoq Nov 01 '12 at 21:43
  • @PascalCuoq : On a little-endian system `1.0` will also have the desired behavior, since `static_cast(1.0)` == `1`. ;-] I see what you mean, though – my wording could have been better; I merely meant it has well-defined behavior. :-] – ildjarn Nov 01 '12 at 21:50
  • @ildjarn On a little-endian, classic platform, with `1.0` you get each byte set to `1`, meaning that each float is initialized to `0x1.020202p-125`. http://ideone.com/5idWyG – Pascal Cuoq Nov 01 '12 at 21:57
  • @PascalCuoq : *sigh* I wasn't implying that the values would be valid floats, I meant that each byte would be initialized with the value `1`. If you're using `memset` to initialize anything other than bytes, you're doing it wrong (I thought that was understood at this point :-). – ildjarn Nov 01 '12 at 21:58
  • @ildjarn “I meant that each byte would be initialized with the value 1” If that is your definition of “work”, then you do not need a little-endian system for `memset(Apadded+sizeof(A), 1.0, pad);` to “work”. It will “work” on a big-endian system, no problem. – Pascal Cuoq Nov 01 '12 at 22:16
2

This depends how A is defined. If it's allocated on the heap, just use realloc():

Apadded = realloc(A, new_size);
Olaf Dietsche
  • 72,253
  • 8
  • 102
  • 198
  • Yes, this is true for [tag:c], cut since this question is also tagged [tag:c++], it is better to cast the return value. – Olaf Dietsche Nov 01 '12 at 21:13
  • 3
    @OlafDietsche: Not anymore, it was removed. People just like to tag random C questions with C++ for no good reason. – Ed S. Nov 01 '12 at 21:14
0

Jordan, there are a few issues with your code. You don't show how A is defined but your use of sizeof(A) in the malloc call must be wrong. To allocate the expanded memory, you need either malloc, calloc or realloc. realloc can only be used if A is pointer to memory area that was allocated (eg by malloc) on the heap.

float *Apadded = realloc(A, (number_of_entries_in_A + pad) * sizeof(float));

realloc will copy the contents of A to Apadded but will leave the padding area uninitialised. Note that realloc can fail, in which case A still exists and must later be freed.

If A was statically allocated, then you need malloc or calloc. calloc will initialise the memory to zero for you but since you are going to copy A to Apapped, it is wasteful. So malloc would be best.

float *Apadded = malloc((number_of_entries_in_A + pad) * sizeof(float));

Note that number_of_entries_in_A is not the same as sizeof(A). Note also, no cast in C.

To initialise the padding area, you can use memset with sizeof(A). But note that if A is a pointer this will fail as sizeof(A) will be just the size of the pointer, not the array.

memcpy(Apadded, A, sizeof(A));
memset(Apadded+sizeof(A), 0, pad);

Note also that, although it is probably so, there may be no guarantee that floating point 0.0 is represented by all zeros.

William Morris
  • 3,554
  • 2
  • 23
  • 24