A coworker wanted to write this:
std::string_view strip_whitespace(std::string_view sv);
std::string line = "hello ";
line = strip_whitespace(line);
I said that returning string_view made me uneasy a priori, and furthermore, the aliasing here looked like UB to me.
I can say with certainty that line = strip_whitespace(line) in this case is equivalent to line = std::string_view(line.data(), 5). I believe that will call string::operator=(const T&) [with T=string_view], which is defined to be equivalent to line.assign(const T&) [with T=string_view], which is defined to be equivalent to line.assign(line.data(), 5), which is defined to do this:
Preconditions: [s, s + n) is a valid range.
Effects: Replaces the string controlled by *this with a copy of the range [s, s + n).
Returns: *this.
But this doesn't say what happens when there's aliasing.
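To make the situation concrete, here is a minimal sketch. The body of strip_whitespace is an assumption (only its declaration appears above), and aliases_buffer and strip_in_place are hypothetical helpers introduced purely for illustration. The sketch shows that the returned view points into line's own buffer, plus one way to sidestep the question by copying before assigning. It does not settle whether the direct assignment is UB.

```cpp
#include <functional>
#include <string>
#include <string_view>

// Hypothetical body: the question only shows the declaration.
// Trims leading and trailing whitespace and returns a view into the input.
std::string_view strip_whitespace(std::string_view sv) {
    constexpr std::string_view ws = " \t\n\r\f\v";
    const auto first = sv.find_first_not_of(ws);
    if (first == std::string_view::npos) return {};  // empty or all whitespace
    const auto last = sv.find_last_not_of(ws);
    return sv.substr(first, last - first + 1);
}

// Hypothetical helper: true if sv starts inside s's current character buffer.
// std::less gives a well-defined ordering even for unrelated pointers.
bool aliases_buffer(const std::string& s, std::string_view sv) {
    std::less<const char*> lt;
    const char* begin = s.data();
    const char* end = s.data() + s.size();
    return !sv.empty() && !lt(sv.data(), begin) && lt(sv.data(), end);
}

// Hypothetical helper: builds an independent std::string from the view first,
// then move-assigns it, so the source of the assignment never aliases line.
void strip_in_place(std::string& line) {
    line = std::string(strip_whitespace(line));
}
```

With line = "hello ", aliases_buffer(line, strip_whitespace(line)) is true, which is exactly the aliasing in question; strip_in_place(line) avoids it at the cost of a temporary string.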
I asked this question on the cpplang Slack yesterday and got mixed answers. Looking for super authoritative answers here, and/or empirical analysis of real library vendors' implementations.
I wrote test cases for string::assign, vector::assign, deque::assign, list::assign, and forward_list::assign.
- Libc++ makes all of these test cases work.
- Libstdc++ makes them all work except for forward_list, which segfaults.
- I don't know about MSVC's library.
The segfault in libstdc++ gives me hope that this is UB; but I also see both libc++ and libstdc++ going to great effort to make this work at least in the common cases.