4

Suppose I have a structure:

typedef struct {
double re;
double im;
}ComplexStruct;

and an array type:

typedef double ComplexArr[2];// [0] -> real, [1] -> imag

Today I copy from ComplexStruct to ComplexArr and vice versa using a simple for loop:

//ComplexArr to  ComplexStruct
for (i = 0 ; i < NumEl; i++)
{
     ComplexStructVec[i].re = ComplexArrVec[i][0];
     ComplexStructVec[i].im = ComplexArrVec[i][1];
}

//ComplexStruct to ComplexArr
for (i = 0 ; i < NumEl; i++)
{
     ComplexArrVec[i][0] = ComplexStructVec[i].re;
     ComplexArrVec[i][1] = ComplexStructVec[i].im;
}

Is there a way to do this safely using memcpy for at least one of the directions? Is there another way that will be faster than a for loop?

alk
Benny K
  • Doesn't answer your question, but see the suggestion [here](https://stackoverflow.com/a/2963928/133203). – Federico klez Culloca Sep 04 '19 at 11:58
  • While it's highly unlikely that there's padding in the structure, it's not guaranteed. That means you really shouldn't use `memcpy` to copy between the two arrays. But in the (likely) event that `sizeof(ComplexStruct) == sizeof(ComplexArr)` then it will work with `memcpy`. – Some programmer dude Sep 04 '19 at 11:59
  • And if you still want to be safe and use an explicit loop, then consider *loop unrolling* as one way to speed it up, or find a way to do the copying in parallel. But whatever optimization you do, first make sure that it's truly a major bottleneck in your program (measured with a compiler-optimized build) and that you document your own optimizations thoroughly. – Some programmer dude Sep 04 '19 at 12:03

3 Answers

3

The optimizer in your compiler should do a great job with that code, you don't need to change much to make it optimal. However, if you are passing ComplexStructVec and ComplexArrVec into a function, you should mark them as restrict so the compiler knows there is no aliasing going on. Like this:

void copy(ComplexStruct* restrict ComplexStructVec, const ComplexArr* ComplexArrVec)
{
    unsigned NumEl = 1000;
    for (unsigned i = 0 ; i < NumEl; i++)
    {
        ComplexStructVec[i].re = ComplexArrVec[i][0];
        ComplexStructVec[i].im = ComplexArrVec[i][1];
    }
}

By doing that you eliminate a whole bunch of generated code because it doesn't need to handle the possibility that the two arguments overlap.

Demo: https://godbolt.org/z/F3DUaq (just delete "restrict" there to see the difference). If NumEl is less than 18 it will unroll the whole thing into one load and one store per iteration.
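
For reference, a minimal sketch covering both directions with the element count passed in might look like the following. The function names arr_to_struct and struct_to_arr and the size_t parameter n are illustrative assumptions, not part of the original code. Both pointers are restrict-qualified here, which the comments on this answer discuss as optional.

```c
#include <stddef.h>

typedef struct {
    double re;
    double im;
} ComplexStruct;

typedef double ComplexArr[2]; /* [0] -> real, [1] -> imag */

/* Hypothetical helper: copy n elements from array form to struct form.
   restrict promises the compiler that dst and src do not overlap. */
void arr_to_struct(ComplexStruct *restrict dst,
                   const ComplexArr *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i].re = src[i][0];
        dst[i].im = src[i][1];
    }
}

/* Hypothetical helper: the opposite direction, struct form to array form. */
void struct_to_arr(ComplexArr *restrict dst,
                   const ComplexStruct *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i][0] = src[i].re;
        dst[i][1] = src[i].im;
    }
}
```

Because the loops assign member by member, this version does not depend on the struct having no padding.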

John Zwinck
  • Pedantically, you should `restrict` qualify both pointers. Probably won't matter in this specific case. – Lundin Sep 04 '19 at 12:40
  • @Lundin: You can `restrict` both pointers if you want but it doesn't change anything vs using it on one as I did. Here are some details on that: https://stackoverflow.com/a/43671121/4323 . A layman's explanation (not directed at you): If there are two people in a room and one of them has a name which is unique, is there anyone in the room whose name is not unique? – John Zwinck Sep 05 '19 at 02:42
  • But in case there are other pointers present in the function, doing global variable access, things turn intricate. Because then the compiler cannot assume that the parameter you left without `restrict` doesn't alias with the globals. – Lundin Sep 05 '19 at 06:51
2

Yes, you can use memcpy, with a few caveats:

  1. The layouts of the array and the structure are identical, meaning that the compiler inserts no padding between or after the members of the structure (array elements are always contiguous).
  2. The struct and the array occupy the same number of bytes.
  3. You are not concerned with portability to other architectures (which may change the answers to #1 and/or #2).
  4. This is not an ideal programming technique, as it has some potential pitfalls (as noted above).

If after the above you still want to do this, the following code should do the trick:

/* NOTE: requires sizeof(ComplexStruct) == sizeof(ComplexArr) */
memcpy((void *) ComplexStructVec,
       (void *) ComplexArrVec,
       sizeof(ComplexStruct) * NumEl); /* element size, not sizeof the vector or pointer */

Since both operands are vectors (arrays), their names alone yield their addresses. memcpy declares its destination and source parameters as void *, so I cast the arguments. The number of bytes to copy is the size, in bytes, of one structure or one array element (see NOTE) times the number of entries in the vector. The (void *) casts are not required in C, where any object pointer converts to void * implicitly; they only make the conversion explicit.

Also note that I intentionally discard the return value, which is a pointer to the destination. If you want it, be careful: assigning it back to ComplexStructVec may cause compile-time (or, in the worst case, run-time) issues, depending on how ComplexStructVec was allocated (by the compiler or at run time).

A more complete example:

void copy(ComplexStruct* ComplexStructVec, ComplexArr* ComplexArrVec)
{
    unsigned NumEl = 1000;
    memcpy(ComplexStructVec, ComplexArrVec, sizeof(ComplexStruct)*NumEl);
}
JonBelanger
  • OP's loop with `restrict` added as in my answer results in the same generated code as `memcpy` in many cases. There's no speedup from using `memcpy` if you compile with optimization enabled. – John Zwinck Sep 05 '19 at 02:46
  • Not entirely true. In this particular case, I'd agree. But, if you have an irregular structure, where there is embedded padding, then the optimizer needs to honor these things. That is why I believe it is important to organize your structures to minimize this padding. – JonBelanger Sep 14 '19 at 10:14
1

The most portable way is the loops as in your example. This is sometimes referred to as serializing/de-serializing of the struct.

The problem with structs is that, unlike arrays, they aren't guaranteed to have a gap-free memory layout. To satisfy alignment requirements, the compiler is free to add padding bytes between members or after the last one (though never before the first). For a struct consisting of nothing but 8-byte doubles, padding is highly unlikely. Formally, though, the layout isn't portable.

You can however fairly safely do the following:

_Static_assert(sizeof(ComplexStruct) == sizeof(double[2]), 
                 "Weird systems not supported");

ComplexStruct cs;
double arr[2];
memcpy(&cs, arr, sizeof arr);
memcpy(arr, &cs, sizeof arr);

This is "reasonably portable" to all real-world systems.
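
Applied to whole vectors, the same check-then-copy idea might be sketched as follows. The helper names and the size_t parameter n are assumptions for illustration, and the offsetof checks are an extra precaution on top of the size check.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memcpy */

typedef struct {
    double re;
    double im;
} ComplexStruct;

typedef double ComplexArr[2]; /* [0] -> real, [1] -> imag */

/* Refuse to compile unless the struct is laid out exactly like double[2]. */
_Static_assert(sizeof(ComplexStruct) == sizeof(ComplexArr),
               "unexpected padding in ComplexStruct");
_Static_assert(offsetof(ComplexStruct, re) == 0,
               "re must be the first member");
_Static_assert(offsetof(ComplexStruct, im) == sizeof(double),
               "im must directly follow re");

/* Hypothetical helper: bulk copy, array form -> struct form. */
void arr_vec_to_struct_vec(ComplexStruct *dst, const ComplexArr *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Hypothetical helper: bulk copy, struct form -> array form. */
void struct_vec_to_arr_vec(ComplexArr *dst, const ComplexStruct *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}
```

As with any memcpy, the source and destination buffers must not overlap.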

Another option is to give the struct two different variable representations by adding a union, like this:

typedef union {
  struct // C11 anonymous struct
  {
    double re;
    double im;
  };
  double arr[2];
}ComplexStruct;

The inner struct may still have padding, so you should still add a static assert. But this gives you the flexibility to use the data contents either as individual members or as an array.
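
A short sketch of how such a union could be used is shown below. The type name ComplexUnion and the helper norm_sq are hypothetical, chosen only to keep the example separate from the original ComplexStruct.

```c
typedef union {
    struct {              /* C11 anonymous struct */
        double re;
        double im;
    };
    double arr[2];        /* [0] -> real, [1] -> imag */
} ComplexUnion;           /* hypothetical name for this sketch */

/* Guard against padding inside the anonymous struct. */
_Static_assert(sizeof(ComplexUnion) == sizeof(double[2]),
               "unexpected padding in ComplexUnion");

/* Hypothetical helper: squared magnitude, computed through the array view. */
double norm_sq(const ComplexUnion *z)
{
    return z->arr[0] * z->arr[0] + z->arr[1] * z->arr[1];
}
```

With this layout, a value written through re and im can be read back through arr[0] and arr[1], so no copy between the two representations is needed.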

And finally, C actually has language support for complex numbers. double _Complex is standard C, and complex.h is a standardized complex library. See How to work with complex numbers in C?
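
As a rough illustration of that last option: the C standard specifies that each complex type has the same representation and alignment as an array of two elements of the corresponding real type, real part first, so a double complex vector can be copied to and from ComplexArr with memcpy. The helper names below are assumptions made for this sketch.

```c
#include <complex.h>  /* the complex macro for _Complex */
#include <string.h>   /* memcpy, size_t */

typedef double ComplexArr[2]; /* [0] -> real, [1] -> imag */

/* Hypothetical helper: bulk copy, array form -> double complex. */
void arr_to_complex(double complex *dst, const ComplexArr *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Hypothetical helper: bulk copy, double complex -> array form. */
void complex_to_arr(ComplexArr *dst, const double complex *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}
```

After the copy, the standard creal and cimag functions from complex.h return the real and imaginary parts of each element.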

Lundin