I'm trying to implement a simple, compiler-dependent memcpy function in C/C++.
What are the possible erroneous results of this simple memory copy mechanism?
(Excluding null check and the case of copying from/to same address)
The mechanism is as follows.
(1) Define structs of varying sizes, one per copy size.
struct S100000000Byte
{
    unsigned char pData[100000000];
};
(2) Cast the void* arguments to struct pointers, dereference the cast pointers, and copy by assignment.
__forceinline void Copy(void* _pBuffer1, void* _pBuffer0, const unsigned long long _ullSize)
{
    ...
    *static_cast<S100000000Byte*>(_pBuffer1) = *static_cast<S100000000Byte*>(_pBuffer0);
}
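For readers who want to try the idea in isolation, here is a minimal sketch of mechanism (2) with a much smaller struct. The names S1024Byte, CopyS1024 and CopyMatchesSource are hypothetical and not part of my actual code. To keep the sketch well-defined, both buffers here really are S1024Byte objects; whether the same cast is valid for arbitrary buffers is part of what I'm asking.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical small stand-in for S100000000Byte.
struct S1024Byte
{
    unsigned char pData[1024];
};

// Copies exactly sizeof(S1024Byte) bytes by struct assignment.
// Assumes both pointers refer to distinct S1024Byte objects.
inline void CopyS1024(void* pDst, const void* pSrc)
{
    *static_cast<S1024Byte*>(pDst) = *static_cast<const S1024Byte*>(pSrc);
}

// Fills a source struct with a pattern, copies it, and checks the
// destination against the source with memcmp.
inline bool CopyMatchesSource()
{
    auto pSrc = std::make_unique<S1024Byte>();
    auto pDst = std::make_unique<S1024Byte>();
    for (std::size_t i = 0; i < sizeof(pSrc->pData); ++i)
    {
        pSrc->pData[i] = static_cast<unsigned char>(i & 0xFF);
    }
    CopyS1024(pDst.get(), pSrc.get());
    return std::memcmp(pDst.get(), pSrc.get(), sizeof(S1024Byte)) == 0;
}
```

Calling CopyMatchesSource() from main is enough to check that the struct assignment produces the same bytes as the source for this one size.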
The test copied a 100000000-byte buffer 1000 times with each method.
This mechanism turned out to be slightly faster than memcpy under the following conditions.
CPU : i7-10875H
IDE : Visual Studio 2019
Platform Toolset : Visual Studio 2019 (v142)
Optimization : /O2
The timings measured so far were:
memcpy : 6609 ms, Implemented Copy : 6547 ms
memcpy : 7062 ms, Implemented Copy : 6766 ms
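The measurement described above can be sketched roughly as follows. This is not my exact test code: TimeCopyMs and MemcpyBaseline are hypothetical names, and the use of std::chrono::steady_clock and std::vector buffers is an assumption made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: calls copyFn(dst, src, size) nRepeat times and
// returns the elapsed wall-clock time in milliseconds. Requires size > 0.
template <typename CopyFn>
long long TimeCopyMs(CopyFn copyFn, std::size_t size, int nRepeat)
{
    std::vector<unsigned char> src(size, 0xAB);
    std::vector<unsigned char> dst(size, 0x00);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < nRepeat; ++i)
    {
        copyFn(dst.data(), src.data(), size);
    }
    const auto stop = std::chrono::steady_clock::now();

    // Read the result through a volatile so the copies are less likely
    // to be optimized away as dead stores.
    volatile unsigned char sink = dst[size - 1];
    (void)sink;

    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Baseline with the same call shape, forwarding to std::memcpy.
inline void MemcpyBaseline(void* pDst, const void* pSrc, std::size_t size)
{
    std::memcpy(pDst, pSrc, size);
}
```

For example, TimeCopyMs(MemcpyBaseline, 100000000, 1000) would correspond to the memcpy side of the comparison; the struct-assignment copy can be passed in the same way through a wrapper with the same three parameters.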
The disassembly of the assignment in mechanism (2) was:
rep movs byte ptr [rdi],byte ptr [rsi]
My references were:
https://www.geeksforgeeks.org/write-memcpy/
Implementing own memcpy (size in bytes?)
and articles about rep movsb.