I came across this way of using a std::string to receive a buffer.

Here it is simplified:

error_enum_t  get_fn(char*, unsigned long, unsigned long*);

void getStringValuedAttribute(std::string &value)
{
  if (value.size() == 0) {
    value.resize(32);
  }

  unsigned long actual_size;
  get_fn(&value[0], value.size(), &actual_size);

  if (actual_size >= value.size()) {
    value.resize(actual_size + 1);
    get_fn(&value[0], value.size(), &actual_size);
  }
}

After some digging on repl.it, I see that &value[0] is type char *, which I guess makes sense, because value[0] would have to be char. But it seems like this is giving direct access to value's buffer. Is that all that is going on here, or is there more wizardry afoot?

I tried digging into the source of basic_string.h and I see _M_local_buf, but there is a ton of template action going on and this is not my strong suit.

If I had to wager a guess, value[] is leveraging operator [] overloading to get access to a pointer reference to the start of the internal buffer, which is compatible with char *, so the get_fn is able to treat it like a vanilla buffer.

Is my assessment correct? Is this a wise idiom, nay, is it even safe?

DeusXMachina
  • That's a hack. Instead, use [`.data()`](https://en.cppreference.com/w/cpp/string/basic_string/data) – scohe001 Aug 30 '19 at 21:19
  • "*it seems like this is giving direct access to value's buffer*" - that is exactly what it is doing. Though, technically, this wasn't officially legal until C++11, but it was common practice as most *implementations* allowed it. "*If I had to wager a guess, ... Is my assessment correct?*" - yes. – Remy Lebeau Aug 30 '19 at 22:52
  • @scohe001 It's not a hack and `.data()` returns `const char*` which you would be casting with `const_cast` and that is an awful hack. – Mirko Aug 30 '19 at 23:59
  • TIL: "starting in c++17, str.data() returns a char* instead of const char*" https://stackoverflow.com/a/54872471/3988037. Also, according to cplusplus.com, "Both string::data and string::c_str are synonyms and return the same value." as of C++11. – DeusXMachina Sep 03 '19 at 14:05

1 Answer

> this is giving direct access to value's buffer

Correct (C++11 and later).

Correct in practice (C++98/C++03).

As @remy-lebeau commented, and as explained in a very similar question (also mentioned above), this was not standardized before C++11.
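As a minimal sketch of what "direct access" means under C++11 or later, the hypothetical helper below (`fill_via_pointer` is not from the question) writes through `&s[0]` and checks that the pointer is the same address `data()` reports:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper for illustration: fills a string through the
// pointer obtained from &s[0].
std::string fill_via_pointer() {
    std::string s(5, '\0');       // size() == 5; storage is contiguous since C++11
    char* p = &s[0];              // address of the first character in s's own buffer
    std::memcpy(p, "hello", 5);   // the write stays within [0, size())
    assert(p == s.data());        // same address as data(): no copy is involved
    return s;
}
```

The write is visible through the string itself, which is what lets a C-style function such as get_fn fill it.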

You can see in this reference that, for C++98, std::string::data:

> Returns a pointer to an array that contains the same sequence of characters as the characters that make up the value of the string object.

So, in theory, a C++98 implementation could return a pointer to a copy of the string's contents rather than to its internal buffer. In practice, as noted, implementations returned the real string data.

For C++11, the description of std::string::data changes to:

> Returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C-string) representing the current value of the string object.
>
> This array includes the same sequence of characters that make up the value of the string object plus an additional terminating null-character ('\0') at the end.
>
> The pointer returned points to the internal array currently used by the string object to store the characters that conform its value.
>
> Both string::data and string::c_str are synonyms and return the same value.

This is now more consistent.
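The C++11 wording quoted above can be checked directly. This is only a sketch, and `data_matches_c_str` is a hypothetical name introduced for illustration:

```cpp
#include <string>

// Hypothetical check for illustration: in C++11 and later, data() and
// c_str() return the same pointer, and the array they point to ends
// with a null character at index size().
bool data_matches_c_str(const std::string& s) {
    return s.data() == s.c_str() && s.data()[s.size()] == '\0';
}
```

This holds for an empty string as well, since the terminating null character is always present.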

> Is this a wise idiom, nay, is it even safe?

It is safe: from C++11 on it is 100% safe, as long as the callee only writes within the string's current size.

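I think that, as they're basically the same, the wisest choice is std::string::data, because it is more legible and keeps the semantics clear. Note that data() only returns a non-const char* from C++17 on; in C++11 and C++14, &value[0] is the way to get a writable pointer without a const_cast.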
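Here is a sketch of the question's function using `data()` where it is writable. Several things here are assumptions, not facts about the real API: the `ERR_NONE` enumerator, the stub body of `get_fn`, and the `writable_buffer` helper are all hypothetical, and the stub assumes `get_fn` reports the attribute's full length (excluding the terminator) in `actual_size` and always null-terminates what it writes.

```cpp
#include <algorithm>
#include <cstring>
#include <string>

enum error_enum_t { ERR_NONE = 0 };  // hypothetical enumerator

// Hypothetical stand-in for the real get_fn: copies at most buf_size - 1
// characters, null-terminates, and reports the full attribute length.
error_enum_t get_fn(char* buf, unsigned long buf_size, unsigned long* actual_size) {
    static const char attr[] =
        "a string attribute that is longer than thirty-two characters";
    const unsigned long len = sizeof(attr) - 1;
    *actual_size = len;
    if (buf_size > 0) {
        const unsigned long n = std::min(len, buf_size - 1);
        std::memcpy(buf, attr, n);
        buf[n] = '\0';
    }
    return ERR_NONE;
}

// Hypothetical helper: returns a writable pointer to the string's buffer.
char* writable_buffer(std::string& s) {
#if __cplusplus >= 201703L
    return s.data();  // the non-const overload exists since C++17
#else
    return &s[0];     // C++11/14: data() is const-only here
#endif
}

void getStringValuedAttribute(std::string& value) {
    if (value.empty()) {
        value.resize(32);
    }
    unsigned long actual_size = 0;
    get_fn(writable_buffer(value), value.size(), &actual_size);
    if (actual_size >= value.size()) {
        value.resize(actual_size + 1);  // room for the text plus terminator
        get_fn(writable_buffer(value), value.size(), &actual_size);
    }
    value.resize(actual_size);  // drop the terminator and any unused padding
}
```

The final `resize(actual_size)` is an addition to the original logic. Without it, the string's size() would still count the terminator and any unused buffer space, under the assumptions above.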

marcos assis
  • This has been very enlightening to this here C++ greenhorn. Most of the stuff I deal with is ROS related, and enforcing `-std=c++11` minimum seems to be a matter of course in general. But it's good to know that the interface changes in subtle ways before 11. – DeusXMachina Sep 03 '19 at 14:07