I have a bunch of old Linux code which does something like this:

int size_of_buffer = /*stuff computed dynamically*/;
char *buffer = malloc(size_of_buffer);
recv(socket, buffer, size_of_buffer, 0);
//do some processing of the buffer as string
free(buffer);

When I was migrating it to C++ I changed it like this:

int size_of_buffer = /*stuff computed dynamically*/;
const auto buffer = make_unique<char[]>(size_of_buffer);
recv(socket, buffer.get(), size_of_buffer, 0);
const std::string str_buffer = buffer.get();
//do some processing on str_buffer

As you can't fail to notice, this causes a second memory allocation and potentially multiple copies of the data. My idea now is to pass a pointer to the first character of a std::string with reserved storage, like this:

int size_of_buffer = /*stuff computed dynamically*/;
std::string buffer;
buffer.reserve(size_of_buffer);
recv(socket, &(buffer[0]), size_of_buffer, 0);
//do some processing on buffer

Is the above code safe and well defined, or are there caveats and dangers that need to be avoided?

bartop
  • I'd use [`buffer.data()`](https://en.cppreference.com/w/cpp/string/basic_string/data) – Thomas Nov 21 '19 at 12:16
  • @Thomas I have my doubts due to the fact that `data()` is `const` – bartop Nov 21 '19 at 12:17
  • What C++ version are you on? – n314159 Nov 21 '19 at 12:18
  • @bartop There is a non-const version since C++17. However, `data()` only guarantees a valid range from `data()` (inclusive) to `data() + size()` (exclusive according to cppreference, but inclusive according to the [latest draft](http://eel.is/c++draft/basic.string#string.accessors-1)): https://en.cppreference.com/w/cpp/string/basic_string/data. `reserve()` alone won't help, I fear. – Max Langhof Nov 21 '19 at 12:18
  • Since C++11, your `&(buffer[0])` is equivalent to `buffer.data()`, see [here](https://en.cppreference.com/w/cpp/string/basic_string/data). Furthermore, "Modifying the character array accessed through the const overload of data has undefined behavior.", so if you are below C++17, what you are doing is UB regardless of how you do it. – n314159 Nov 21 '19 at 12:20
  • @n314159 that sounds like an answer. Anyway, how can I avoid copying data pre-C++17? – bartop Nov 21 '19 at 12:21
  • [This](https://stackoverflow.com/questions/361500/initializing-stdstring-from-char-without-copy) may help you (i.e., I don't think it is possible). Maybe you could do something by providing a custom allocator, but that is not worth the trouble. If you provide more information on what you want to do with the string, maybe we can find a workaround. – n314159 Nov 21 '19 at 12:24
  • @MaxLanghof if I filled the string with some data, would it be valid, or would something still be wrong? – bartop Nov 21 '19 at 12:27
  • In C++17, if you `resize` the string, everything should be fine. (Or just construct it with the right size.) – n314159 Nov 21 '19 at 12:27
  • Beware that `const std::string str_buffer = buffer.get()` will only work if `buffer` has a null character in it. The right way to do that would be with `const std::string str_buffer(buffer.get(), length_returned_by_recv)`. – Fernando Silveira Nov 21 '19 at 12:28
  • @n314159 I would be grateful if you compiled it into an answer; some people may have the same problem and overlook the comments – bartop Nov 21 '19 at 12:29

1 Answer

A similar question was asked here. The short answer is: it is not possible without copying.

Below C++17, there is no non-const overload of std::string::data(), and

1) Modifying the character array accessed through the const overload of data has undefined behavior.

Hence, you cannot modify the string through data.

Since C++11,

data() + i == std::addressof(operator[](i)) for every i in [0, size()].

Therefore, you also cannot modify the string through &(buffer[0]).

Before C++11, it is not very clear to me what exactly is allowed, so maybe modifying through &*buffer.begin() is OK, but I don't think so.

On cppreference, there is actually a quote that confounds me a bit:

The elements of a basic_string are stored contiguously, that is, for a basic_string s, &*(s.begin() + n) == &*s.begin() + n for any n in [0, s.size()), or, equivalently, a pointer to s[0] can be passed to functions that expect a pointer to the first element of a (null-terminated (since C++11)) CharT[] array.

I think this refers to a const array, since otherwise it would not fit with the rest of the documentation. Right now I do not have the time to go through the standard.

Since C++17, your code is OK if you use std::string::resize instead of reserve (or simply construct the string with the right size), since data() only guarantees a valid range on [data(), data() + size()). There is no copy-free way to create a std::string from a char *.
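
A minimal sketch of the C++17 variant described above, assuming a POSIX socket. The helper name receive_into_string and its parameters are made up for illustration and are not part of the question's code. The string is constructed with the full size so that the whole range handed to recv is valid, and it is shrunk afterwards to the number of bytes actually received:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (name and signature are assumptions for this sketch):
// reads at most max_size bytes from the socket fd directly into a std::string.
std::string receive_into_string(int fd, std::size_t max_size) {
    assert(fd >= 0);

    // Construct with size() == max_size (not just capacity), so that
    // [data(), data() + size()) is a valid range to write into.
    std::string buffer(max_size, '\0');

    // The non-const overload of data() exists since C++17.
    const ssize_t received = recv(fd, buffer.data(), buffer.size(), 0);

    // Keep only the bytes that were actually received; on error or
    // orderly shutdown (received <= 0) the result is an empty string.
    buffer.resize(received > 0 ? static_cast<std::size_t>(received) : 0);
    return buffer;
}
```

With this shape there is one allocation and no intermediate char array. Error handling is deliberately reduced to returning an empty string; real code would probably want to inspect errno.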

Alternatively, you can use a std::string_view, which can be constructed from a char * and a length without copying. It does exactly what you want here, since it takes no ownership of the pointer or the memory.
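
As a rough illustration of the std::string_view route (again assuming a POSIX socket; receive_view and its parameters are hypothetical names for this sketch), the data is received into storage owned by the caller and only a non-owning view of the received bytes is returned:

```cpp
#include <cstddef>
#include <string_view>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (an assumption for this sketch): receives into
// caller-owned storage and returns a view of the bytes that arrived.
// The view is only valid for as long as storage stays alive and unmodified.
std::string_view receive_view(int fd, char* storage, std::size_t capacity) {
    const ssize_t received = recv(fd, storage, capacity, 0);
    if (received <= 0) {
        return {};  // error or connection closed: empty view
    }
    // Pointer + length constructor: no copy and no search for a null terminator.
    return std::string_view(storage, static_cast<std::size_t>(received));
}
```

The storage could be the std::unique_ptr<char[]> from the question, e.g. receive_view(socket, buffer.get(), size_of_buffer). Passing the length explicitly also sidesteps the missing null terminator mentioned in the comments.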

n314159