
Could you explain what the next two lines do?

// Line 1
*(long *)add1 = *(long *)add2;
// Line 2
*(int *)add1 = *(int *)add2;

Edit 1: I have added the complete code of the function I am testing. PS: It is only part of the script. [original script]

It is a custom memcpy function that copies a block of memory from A to B.


    (2)
    while(word_left > 0){
        *(char *)temp++ = *(char *)src++;
        word_left--;
    }

    return (void *)dest;
}

Vlad was kind enough to explain what the lines in part (1) mean.

My question is about part (2). This part is supposed to handle the case where the memory regions overlap.

Why do we use char here? Why can we not use long or int?

  • The syntax of `*` is `*X`, where in this case `X` is `(long *)add1` . The `*` is applied to `(long *)add1`, there is no `*(long *)` syntax element – M.M Sep 06 '21 at 08:18
  • I think you need to share a complete block of code. this is not informative. Have add1 and add2 been declared before? – Raha Moosavi Sep 06 '21 at 08:35
  • Please note that code like this contains undefined behavior and therefore cannot be compiled with an ordinary standard C compiler. Library code implementing memcpy doesn't (have to) follow the C standard. – Lundin Sep 06 '21 at 09:06
  • Also `int cpu_size = sizeof(char *);` is plain wrong... the address bus width has nothing to do with the data register size of the CPU. – Lundin Sep 06 '21 at 09:09
  • src and dest may not point to int or long aligned memory. All memory is assumed by C to be char-aligned. – stark Sep 06 '21 at 10:50
  • Note that to deal with overlapping buffers you must test if `src > dest` and do opposite directions based on the result. – stark Sep 06 '21 at 10:55
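The direction test stark mentions can be sketched roughly as follows. This is only an illustration, not the code from the question: the name `copy_overlap_safe` is hypothetical, and the sketch assumes `uintptr_t` is available (it is optional in the C standard, though nearly universal). It copies through `unsigned char *`, which also hints at why part (2) uses `char`: any object may be accessed byte by byte, and bytes have no alignment requirement.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a byte-wise copy that tolerates overlapping regions. */
void *copy_overlap_safe(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Compare the addresses as integers: using < or > on pointers into
       unrelated objects is not defined by the C standard. */
    if ((uintptr_t)d < (uintptr_t)s) {
        /* dest starts below src: copy forward, lowest byte first */
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];
        }
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* dest starts above src: copy backward, highest byte first */
        for (size_t i = n; i > 0; i--) {
            d[i - 1] = s[i - 1];
        }
    }
    /* d == s: nothing to do */
    return dest;
}
```

In real code the standard `memmove` already provides this behavior; the sketch only shows the idea.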

2 Answers


In both statements, the constructs ( long * ) and ( int * ) cast the pointers to the pointer types long * and int *, respectively. The resulting pointers are then dereferenced, as in *( long * )add2 and *( int * )add2, to access the pointed-to objects. Their values are assigned to other objects, which are likewise obtained by casting and dereferencing pointers.

To make this clearer, consider a demonstration program.

#include <stdio.h>

int main(void) 
{
    int x = 10;
    int y = 0;
    
    printf( "x = %d, y = %d\n", x, y );

    void *p1 = &x;
    void *p2 = &y;
    
    *( int * )p2 = *( int * )p1;
    
    printf( "x = %d, y = %d\n", x, y );
    
    return 0;
}

The program output is

x = 10, y = 0
x = 10, y = 10

In this program, for example, the pointer p1 of type void * is interpreted, because of the cast ( int * )p1, as a pointer of type int * that points to an object of type int (in this case, the object x). Dereferencing this pointer with *( int * )p1 gives you direct access to the pointed-to object x.

You cannot simply write, for example,

y = *p1;

because a pointer to void may not be dereferenced: void is an incomplete type, so the compiler does not know how to interpret the pointed-to memory.

Vlad from Moscow
  • Dear Vlad, thank you for taking the time to reply. Sorry, I don't understand your explanation. Could you provide more information or an example? – Adil.Kolenko Sep 06 '21 at 08:17
  • @Adil.Kolenko See my appended post. – Vlad from Moscow Sep 06 '21 at 08:22
  • Doesn't this `*(long *)temp = *(long *)src` break the aliasing rule? – chqrlie Sep 06 '21 at 09:01
  • @chqrlie There is no problem. Otherwise the function qsort would be invalid. – Vlad from Moscow Sep 06 '21 at 10:35
  • My question is more subtle than it appears. For example `int test(void) { short temp[8] = { 0 }; *(long *)temp = -1; return temp[0]; }` defines a function `test()` that can return `0` or `-1` depending on the compiler. **gcc** warns about this with `warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]`. The safe way to avoid this pitfall is to use `memcpy(temp, src, sizeof(long))` instead of `*(long *)temp = *(long *)src` as Lundin explains in his answer. – chqrlie Sep 06 '21 at 17:54
  • @chqrlie Does the compiler show a warning for the demonstrative program in my answer? – Vlad from Moscow Sep 06 '21 at 18:02
  • No, because in your program `p1` and `p2` point to `int` objects that are accessed as `int`. But it does for a slightly modified version, copying 2 bytes from a 4 byte `int`: https://godbolt.org/z/oTEcsTvYj . Your answer does explain the observed behavior but the C Standard does not guarantee this behavior in the general case. Warning the OP about this problem seems preferable: `*(long *)temp = *(long *)src` breaks the strict aliasing rule. – chqrlie Sep 06 '21 at 23:35
  • @chqrlie The question was updated with new code after I had answered it. :) – Vlad from Moscow Sep 07 '21 at 12:56

The C standard function memcpy is sometimes implemented in a manner similar to this. This function has no requirement that the addresses passed by the application are aligned.

A naive version could just run a for loop using char byte type and copy everything byte by byte. Then we don't have to worry about alignment, but such code will be slow since it isn't taking advantage of the CPU data width.

More efficient code will do the copying in the largest data size that the CPU can handle in a single instruction, for example 32 or 64 bits. That is presumably what this code is trying to do. If we do that, we still have to take care of potential misalignment at the start and of trailing bytes at the end of the segment to be copied. Those parts have to be copied byte by byte, similar to the code at the end of your function.
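As a rough sketch of that head / body / tail structure (not how any particular library does it): the function name `copy_wordwise` is hypothetical, `unsigned long` is merely assumed to be a reasonable word size, and the regions are assumed not to overlap. The body moves one word per iteration through a fixed-size `memcpy`, which sidesteps alignment and aliasing problems and which compilers commonly reduce to a single load and store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; assumes the two regions do not overlap. */
void *copy_wordwise(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    const size_t word = sizeof(unsigned long); /* assumed word size */

    /* Head: copy single bytes until dest is word aligned (or n runs out). */
    while (n > 0 && (uintptr_t)d % word != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: copy one word per iteration via a temporary. */
    while (n >= word) {
        unsigned long w;
        memcpy(&w, s, word);
        memcpy(d, &w, word);
        d += word;
        s += word;
        n -= word;
    }

    /* Tail: copy the remaining bytes one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}
```

The byte loops at both ends are what let the function accept arbitrary addresses and lengths.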

This is the first place where we notice that the code you posted is severely broken - it doesn't handle initial misalignment.

Worse yet, it assumes that int cpu_size = sizeof(char *); gives the CPU data width size which is just plain wrong - the size of a pointer corresponds to the address bus width which is not the same thing as the maximum data register width on a whole lot of existing systems.

Another problem/bug is that temp += cpu_size; isn't valid C code but a non-standard gcc extension. We can't do pointer arithmetic on void pointers.
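A small illustration of the portable alternative, with a hypothetical helper name `advance_bytes`: convert the void pointer to a character pointer first, then do the arithmetic on that.

```c
#include <stddef.h>

/* Hypothetical helper: advance a void pointer by n bytes in standard C. */
void *advance_bytes(void *p, size_t n)
{
    /* "p += n;" on a void * compiles only as a compiler extension.
       Arithmetic on unsigned char * is defined and counts in bytes. */
    return (unsigned char *)p + n;
}
```

For example, `advance_bytes(a, sizeof(int))` on an `int a[4]` yields the address of `a[1]`.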

Cosmetic bugs are the casts between void* and void*. Obviously we don't need to cast between the same types. Every object pointer in C can in fact get implicitly converted to a void* without casts, given that qualifiers (const etc) match.

And finally, we can't run code such as this on a standard C compiler, because dereferencing some unknown data through a value access of type long or int is very likely undefined behavior and a strict aliasing violation (see "What is the strict aliasing rule?"). The actual memcpy function in the standard library isn't written in standard C and can't be compiled as such. (It is quite likely written in assembler and often inlined in the calling code.)

So what you should do with this code here is to delete it and forget that you ever saw it, because there's nothing to learn from it. The person who wrote it didn't know what they were doing. With the "Yoda conditions" obfuscation it looks like code from the 1980s. If so, then I'd recommend avoiding the study of really old code like that.
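For application code, one way to stay within the standard is to move the value through `memcpy` instead of a casted pointer. The helper names `load_long` and `store_long` below are hypothetical; this is a sketch of the idea, not a replacement for the library function.

```c
#include <string.h>

/* Hypothetical helper: read a long from arbitrary, possibly unaligned bytes. */
long load_long(const void *p)
{
    long v;
    memcpy(&v, p, sizeof v); /* byte copy: no aliasing or alignment issue */
    return v;
}

/* Hypothetical helper: write a long to arbitrary, possibly unaligned bytes. */
void store_long(void *p, long v)
{
    memcpy(p, &v, sizeof v);
}
```

With these, `store_long(dst, load_long(src))` does the job of `*(long *)dst = *(long *)src` without the type-punned access, and optimizing compilers usually emit the same single load and store.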

Lundin
  • Dear Lundin, thank you for your reply. I found this block [here](https://github.com/gukai/test/blob/master/c_standard_libary/string/memcpy.c). It is a post from 2013; is that old? – Adil.Kolenko Sep 06 '21 at 09:34
  • @Adil.Kolenko Any random person can upload anything on Github. This code was written by some beginner. It's not something you should study. – Lundin Sep 06 '21 at 09:36