
I found something strange about the memcpy() and memset() functions in MSVC2017 and I cannot explain it. Specifically, the 'destination' gets indexed not by single bytes but by the whole size of a structure ('size' argument).

So I have a struct:

typedef struct _S
{
    int x;
    int y;
} S;

And the code is as follows:

S* array = (S*)malloc(sizeof(S) * 10); /* Ok. Allocates enough space for 10 structures. */

S s; /* some new structure instance */

/* !!! here is the problem. 
 * sizeof(S) will return 8
 * 8*1 = 8
 * now the starting address will be: array+8
 * so I'm expecting my structure 's' to be copied to
 * the second ''element'' of 'array' (index 1)
 * BUT in reality it will be copied to index 8 (the 9th element)!
 */
memcpy(array + (sizeof(S) * 1), &s, sizeof(S));  

/* After some tests I found out how to access 'properly' the 
 * 'array': 
 */
memcpy(array + 1, &s, sizeof(S)); /* this will leave the first struct
    in the 'array' unchanged and copy the contents of 's' to the second
    element */

The same happens with memset(). Until now I thought the offset had to be computed manually in bytes, by multiplying the index by the size of the copied object, but apparently not?

memcpy(destination + (size * offset), source + (size * offset), size)

Am I doing something wrong?

chux - Reinstate Monica
Nikolay P.
  • Read about pointer arithmetic. If `destination` is `S*` you don't have to multiply by size – Jean-François Fabre Jan 07 '19 at 15:04
  • `array + (sizeof(S) * 1)` is not doing what you think it is. Read about pointer arithmetic. You want `array + 1`. – Eugene Sh. Jan 07 '19 at 15:04
  • If we leave `memcpy` and `memset` out of the picture, are you familiar with how pointer arithmetic works? What do you expect `S* a2 = array + 2;` to do? – Angew is no longer proud of SO Jan 07 '19 at 15:05
  • `array + 8` does not add 8 to `array`. It adds the size of 8 `S`. In this case, that is like `(char*)array + 64` – chux - Reinstate Monica Jan 07 '19 at 15:06
  • "expecting my structure 's' to be copied to the second" --> Consider then instead `&array[1]` or `array + 1`. – chux - Reinstate Monica Jan 07 '19 at 15:09
  • From array=(sizeof(S) * 10) I'm expecting to get a pointer to 80 bytes of memory, let's say (int){0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...}. If I copy, say, {5, 5} to array + 0, I expect the memory to become (int){5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...}. If I copy to array+8, I expect the memory to be copied in the 'middle' as {0, 5, 5, 0, 0, 0, 0, 0, 0, ...}. Lastly, to answer your question, if I copy to array+1 I expect the bytes of the structure to be copied over the bytes of the array, so the result becomes {45645, 46546, 0, 0, 0, 0, 0, 0, 0, ...} or some random numbers – Nikolay P. Jan 07 '19 at 15:13
  • @chux Yes, &array[index] will do, but my question was why pointer arithmetic such as array+index takes the size of each element into account. – Nikolay P. Jan 07 '19 at 15:14
  • Most likely not an issue but the behaviour on declaring `_S` is undefined. – Bathsheba Jan 07 '19 at 15:18
  • Because these are the rules of pointer arithmetic as defined by the language. `&array[x]` is the very same as `array+x` (for `array` being pointer or array type). – Eugene Sh. Jan 07 '19 at 15:18
  • [Do I cast the result of malloc?](https://stackoverflow.com/q/605845/2173917) – Sourav Ghosh Jan 07 '19 at 15:20
  • @SouravGhosh MSVC gives me annoying warnings if I don't cast void* to my type – Nikolay P. Jan 07 '19 at 15:22
  • @EugeneSh. Related question: if I want to copy an int in the middle of 2 other ints, how can I do it, since memcpy(firstInt + 4, &myNewInt, 4) won't do the job? Do I have to cast my array to char* instead? – Nikolay P. Jan 07 '19 at 15:24
  • Then you are not using a C compiler, but a C++ one. You might find other issues with that. – Eugene Sh. Jan 07 '19 at 15:24
  • I don't know what *copy an int in the middle of 2 other ints* means. You can't "push" elements aside and copy something in between. – Eugene Sh. Jan 07 '19 at 15:26
  • Best to use a C compiler for C code rather than a C++ one. Nikolay P., do you have access to a C compiler? – chux - Reinstate Monica Jan 07 '19 at 15:26
  • @chux Yeah, it wasn't some serious project. I never write in C actually. – Nikolay P. Jan 07 '19 at 15:27
  • @EugeneSh. {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} <- 2 ints. By "in the middle" I mean something like: {0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00} where {0xFF, 0xFF, 0xFF, 0xFF} is the 3rd int I'm copying from. – Nikolay P. Jan 07 '19 at 15:29
  • You will need to alias the array as an array of `char` (through a `char*`) to do that. But note that we are entering the area where C and C++ might be *very* different. – Eugene Sh. Jan 07 '19 at 15:30
  • @EugeneSh. Yeah, this "workaround" you proposed will do the job. Can you elaborate on "the area where C and C++ might be very different" if you don't mind? – Nikolay P. Jan 07 '19 at 15:33
  • Regarding `typedef struct _S`: in C, identifiers that begin with an underscore followed by a capital letter are reserved for the implementation. Suggest using `typedef struct S_` instead. – user3629249 Jan 07 '19 at 16:10
  • Do not edit your question with an answer. Instead post your own answer below. Post rolled back. – chux - Reinstate Monica Jan 07 '19 at 18:27

1 Answer


memcpy and memset are not the culprits in this case. Your problem comes from a misunderstanding of pointer arithmetic.

When you add a number to a pointer, the pointer advances by that many elements, not by that many bytes. So if sizeof(S) == 8, then a pointer to an S is advanced by 8 bytes when adding 1 to it, by 16 bytes when adding 2, and so on. The point is that you can abstract the bytes (and therefore the element's size) out of the picture.
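As a minimal sketch of this rule, the following C fragment measures how far `array + 1` moves in bytes and copies a struct into a chosen element. The helper names `byte_distance` and `store_at` are hypothetical and were introduced only for this illustration. The struct tag is written `S_` here because identifiers that begin with an underscore and a capital letter are reserved.

```c
#include <stddef.h>
#include <string.h>

typedef struct S_ {
    int x;
    int y;
} S;

/* Hypothetical helper: number of bytes between two pointers that
 * point into the same array or allocation. Casting to char * makes
 * the subtraction count bytes rather than elements. */
static ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* Hypothetical helper: copy *src into element `index` of `array`.
 * `array + index` already skips `index` whole elements, so the index
 * must not be multiplied by sizeof(S). The caller must ensure that
 * `index` is within the array. */
static void store_at(S *array, size_t index, const S *src)
{
    memcpy(array + index, src, sizeof *src);
}
```

With these helpers, `store_at(array, 1, &s)` writes to the second element, and `byte_distance(array, array + 1)` equals `sizeof(S)`.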

So if you allocated an array of three S elements, your memory might be laid out like so:

  • 3 x S
  • 6 x int
  • 24 x char (bytes)

You want to be able to ignore the bytes, and only access the x and y fields through an S, so that leaves S-sized blocks of memory.

|0              |1              |2              | array[N]               : S
|x       y      |x       y      |x       y      | array[N].x, array[N].y : int
|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7| individual bytes       : char
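If byte-level addressing really is wanted, one option is to view the same memory through an `unsigned char *`, where adding 1 moves exactly one byte. The sketch below assumes that approach. `copy_at_byte_offset` and `element_offset` are hypothetical names used only for illustration, and the explicit cast is there so the fragment also compiles when the file is built as C++.

```c
#include <stddef.h>
#include <string.h>

typedef struct S_ {
    int x;
    int y;
} S;

/* Hypothetical helper: copy `size` bytes from `src` to the position
 * `offset` bytes past `dst`. Arithmetic on unsigned char * is done in
 * single bytes, which matches the bottom row of the diagram. The
 * caller must ensure that offset + size stays inside the object. */
static void copy_at_byte_offset(void *dst, size_t offset,
                                const void *src, size_t size)
{
    unsigned char *bytes = (unsigned char *)dst;
    memcpy(bytes + offset, src, size);
}

/* Hypothetical helper: byte offset of element `index` in an array of S.
 * This is the multiplication that `array + index` performs implicitly. */
static size_t element_offset(size_t index)
{
    return index * sizeof(S);
}
```

Here `copy_at_byte_offset(array, element_offset(2), &s, sizeof s)` has the same effect as `memcpy(array + 2, &s, sizeof s)`, and passing `offsetof(S, y)` as the offset overwrites only the `y` field of the first element.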
Billy Brown
  • Yes, you are right. The confusion came from my assumption that a pointer is just a variable whose value is a number representing an address. So when I say "pointer+1" I expected this value to be taken and incremented by one, and not by 4 (in the case of int*, and so on). – Nikolay P. Jan 07 '19 at 19:19
  • @NikolayP. A pointer is just an address in memory, but its type is known at compile time, so it is taken into account. During pointer arithmetic the address advances by the size of the pointed-to type. The size of the pointer itself is a separate matter and depends on the architecture (32bit, 64bit, etc.). – Billy Brown Jan 09 '19 at 09:37