Why do std::vector and std::string use a "for" loop to copy or move elements? Wouldn't this be bad for performance?
Recently I have been reading the STL source code of libcxx (llvm-libcxx), and I found that std::vector and std::string use a "for" loop to copy or move elements in the container. Wouldn't this be bad for performance when the element is a basic type, compared with faster functions such as memcpy and memmove? Here is the code I found (char_traits::copy, which std::string uses):
    _LIBCPP_INLINE_VISIBILITY static _LIBCPP_CONSTEXPR_SINCE_CXX20 char_type* copy(char_type* __s1, const char_type* __s2, size_t __n) {
        if (!__libcpp_is_constant_evaluated()) {
            _LIBCPP_ASSERT(__s2 < __s1 || __s2 >= __s1+__n, "char_traits::copy overlapped range");
        }
        char_type* __r = __s1;
        for (; __n; --__n, ++__s1, ++__s2)
            assign(*__s1, *__s2);
        return __r;
    }
I want to know why there is no specialized version of vector or string that accelerates this. Initially I thought the compiler would optimize this part of the code, but it doesn't.
I tested the performance of a for loop against memcpy (which I know is accelerated with SSE or AVX instructions).
Here is the test code:
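    #include <cstring>

    // Version 1: copy with memcpy.
    void func(char *p1, char *p2, int len) {
        std::memcpy(p1, p2, len);
    }

    // Version 2: copy with a plain for loop (the unused parameter f only distinguishes the overload).
    void func(char *p1, char *p2, int len, int f) {
        for (int i = 0; i < len; i++) {
            p1[i] = p2[i];
        }
    }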
The results are as follows:
//(memcpy) loop for 100000000000 times, cost time 305 us
//(for-loop) loop for 100000000000 times, cost time 1967 us
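For reference, a minimal sketch of a harness that could time such a comparison. It is not the exact harness behind the numbers above: the names copy_memcpy, copy_loop and time_copy_us are hypothetical, and the buffer size and iteration count are left to the caller (len must be at least 1).

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy with the library routine.
void copy_memcpy(char* dst, const char* src, std::size_t len) {
    std::memcpy(dst, src, len);
}

// Copy one byte at a time with a plain loop.
void copy_loop(char* dst, const char* src, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        dst[i] = src[i];
    }
}

// Run `copy` on a `len`-byte buffer `iterations` times and return the
// elapsed wall-clock time in microseconds. Assumes len >= 1.
template <class Copy>
long long time_copy_us(Copy copy, std::size_t len, int iterations) {
    std::vector<char> src(len, 'x');
    std::vector<char> dst(len, 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        copy(dst.data(), src.data(), len);
        // Read one result through a volatile to discourage the optimizer
        // from dropping the copy as dead code.
        volatile char sink = dst[len - 1];
        (void)sink;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_copy_us(copy_memcpy, n, iters) and time_copy_us(copy_loop, n, iters) with the same arguments gives two comparable timings. The outcome depends heavily on the optimization level, since at some levels the compiler may recognize the loop and replace it with a library call.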
I used Compiler Explorer to look at the disassembly. Evidently the compiler does not optimize the loop. The compile flag "-Os" is common in projects.
Edit: I have retested using char_traits::copy, and its performance is similar to the memcpy version. char_traits does have a specialized version, char_traits<char>, which uses std::copy to copy elements. Sorry for not reading carefully enough. I get it now, thank you all.