0

Is it ok to do the following?

auto len = GetWindowTextLengthA(min_area_edit_hwnd);
std::string buffer(len, '\0');
GetWindowTextA(min_area_edit_hwnd, &buffer[0], len + 1);

I mean, is it ok to tell the GetWindowTextA function that &buffer[0] points to the buffer with len + 1 size? Is it correct or should I create string in the following way

std::string buffer(len + 1, '\0');

?

M.M
  • 138,810
  • 21
  • 208
  • 365
FrozenHeart
  • 19,844
  • 33
  • 126
  • 242
  • Now the real question is: **Why** would you want to do that? Why not simply copy the window text into a `std::string` object, excluding the zero terminator? Plus, unrelated to your question, you should *really* consider calling the Unicode API, and use `std::wstring` instead. – IInspectable Jul 19 '17 at 10:56

6 Answers6

3

This is generally legal starting from C++11, but you do indeed have to reserve an extra character for the null terminator.

Before C++11 such uses of std::string were illegal. However, note that buffer.length() will not automatically update just because GetWindowTextA placed a \0 somewhere in the middle of its controlled sequence. If you want C-length of the string to be consistent with buffer.length(), it is your responsibility to resize() your buffer accordingly.

AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • *uses of `std::function`* Did you mean `std::string`? – kirbyfan64sos Jul 19 '17 at 01:10
  • "or, alternatively, pass len instead of len + 1 to GetWindowTextA" -- I guess I cannot do that. The return value of the `GetWindowTextLength` function is the length of string **without** null terminating character while `GetWindowText` requires to pass a length **including** null terminating character – FrozenHeart Jul 19 '17 at 01:18
  • @FrozenHeart: You are right. I missed the fact that `len` is actually the return value of `GetWindowTextLengthA`. In that case you need `std::string buffer(len + 1, '\0');` and pass `len + 1` to `GetWindowTextA`. However, after that you will have to so `buffer.resize(len)` if you want `length()` to return a consistent value. – AnT stands with Russia Jul 19 '17 at 01:24
  • "but you do indeed have to reserve an extra character for the null terminator" -- are you sure? As far as I know, `std::string buffer(len, '\0');` will create a string with an internal null-terminating buffer of size `len + 1`, so it's perfectly normal to tell that `&buffer[0]` points to `len + 1` characters – FrozenHeart Jul 19 '17 at 01:25
  • @FrozenHeart: That's an interesting question. If I'm not mistaken, `std::string` spec does not require it to store the terminating `\0` explicitly, as an "extra" byte right at the end of the controlled sequence. Terminating zero could be implicit. (I could be wrong though... Have to check what C++17 says about that.) – AnT stands with Russia Jul 19 '17 at 01:27
  • Note, that http://en.cppreference.com/w/cpp/string/basic_string/data says that only `[data(); data() + size())` pointer range is valid. I.e. `data() + size()` is not a valid access point. And this is where that extra terminating `\0` would reside. – AnT stands with Russia Jul 19 '17 at 01:28
  • Also, from the same page: "Modifying the past-the-end null terminator stored at data()+size() has undefined behavior". – AnT stands with Russia Jul 19 '17 at 01:39
  • You can not modify the element at `data()+size()`. There even is an example, the implementation of strings at Facebook did not allocate that one eagerly which resulted in an obscure failure. [See here](https://youtu.be/kPR8h4-qZdk?t=19m16s) – HeroicKatora Jul 19 '17 at 01:40
  • @HeroicKatora: "lazily wrote" is rather different from allocation. Please see my answer for the proof that the terminator must be allocated eagerly, even though it can be written lazily. – Ben Voigt Jul 19 '17 at 01:55
  • @HeroicKatora: Awesome video though, thanks for sharing the link. – Ben Voigt Jul 19 '17 at 02:12
  • [string.accessors] in C++14 specifies that `data() + size() == &operator[](size())` – M.M Jul 19 '17 at 02:12
  • @M.M: Only after calling `data()`, until then `operator[](size())` can return a reference to "elsewhere". – Ben Voigt Jul 19 '17 at 02:13
  • @M.M: True. That would require contiguous storage for pointer returned by `data()`. Yet, `operator []` appears to be deliberately specified to allow non-contiguous storage location for `[size()]`. – AnT stands with Russia Jul 19 '17 at 02:18
1

I'd tend to err on the side of caution myself:

char cbuffer[MAX_LEN_OF_SOME_DESCRIPTION];
GetWindowTextA(..., cbuffer, ...);
cbuffer[len] = '\0';
std::string buffer(cbuffer); // assuming you actually NEED a C++ string.

This method will work without having to parse the rather voluminous C++ standards documents to find out whether writing directly into a string object is legal :-)

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
0

The GetWindowTextA function will write into the provided buffer. In C++11 and C++14, it is undefined behaviour to write out of bounds of the string content of a std::string, even if you're writing a null terminator over the null terminator.

You will need to size the string to len + 1 before making the call, and then resize the string again afterwards, based on the return value.

M.M
  • 138,810
  • 21
  • 208
  • 365
  • From [24.3.2.5](http://eel.is/c++draft/string.access) - *"modifying the object to any value other than charT() leads to undefined behavior"* -- and `charT() == '\0'`, so it shouldn't be undefined behavior. – Benjamin Lindley Jul 19 '17 at 01:23
  • @BenjaminLindley the text you quote isn't in any published standard – M.M Jul 19 '17 at 01:30
  • Do you have a copy of a published standard? In all of the drafts I have access to from C++11 onwards, I can't find anywhere where it says that writing a null terminator to that position is UB. They all say that modifying it is UB. And writing the value that is already there is not modifying. – Benjamin Lindley Jul 19 '17 at 01:46
  • @BenjaminLindley I'm using N4140 which is the same text as the C++14 standard. Writing a value is modification, even if the value being written is the same as the old value. – M.M Jul 19 '17 at 01:50
0

From the standpoint of the standard, the following is guaranteed:

  • std::basic_string is a contiguous container
  • operator[](std::size_t) will not invalidate iterators
  • It is legal to read any element in the range [0, size()] and you may modify any such element except the one returned by operator[](size())

I would therefore conclude, that it is perfectly fine to do as long as the function does not eagerly write a null terminator (i.e. checks if it already ends with '\0' which it always will and then does nothing). Also, I have only heard of one instance of a string being implemented such that this would actively lead to real trouble, that is the implementation at Facebook.


Personally, I would go with the safer route and set the size to be a byte longer if you can not strictly control the access, so I would not pass a size of len + 1 .

Edit: from a comment under another answer, not correctly handling the (possibly not allocated) null terminator can result in real failures, see CppCon 2016: Nicholas Ormrod “The strange details of std::string at Facebook"

HeroicKatora
  • 958
  • 8
  • 17
  • I've never seen any function check the existing value of a string character before overwriting, so while there's the theoretical possibility that a function *could* see a `\0` already in place and skip writing, it's both unlikely and definitely not part of the documented contract. Debugging aids such as Electric Fence could definitely cause problems if any attempt is made to overwrite the terminator. – Ben Voigt Jul 19 '17 at 01:40
0

According to the Standard, this is not safe. The Standard goes to some considerable effort to explicitly state that writing to the NUL terminator is undefined behavior:

enter image description here

and

enter image description here


Slightly more difficult to analyze is whether &buffer[0] points to size() + 1 consecutive characters to begin with. The Standard does not make this guarantee explicitly, and in fact the wording of operator[] quoted above seems to leave the door open to having a buffer of only size() consecutive characters, with the terminator stored separately. However when all the semantic rules are considered, the string must actually have room for size() + 1 characters at all times. (Otherwise a call to c_str(), which is a const member function, could not return a pointer to a run of size()+1 characters without invalidating references, which it is not allowed to do. A second proof, also valid, is that c_str() cannot copy the content to a new location and still meet the complexity requirement).

However, it is not assured that the character at location size() + &buffer[0] is actually a NUL until c_str() or data() is called (at which point &buffer[size()] begins returning a reference to it instead of an alternate location).

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
-2

&(buffer[0]) has a type char*, and probably points to the text of length len, but even this is not guaranteed. You should use char buffer[len+1]; instead.

user31264
  • 6,557
  • 3
  • 26
  • 40
  • It is guaranteed by the C++ Standard that `&buffer[0]` points to a buffer of length `len`, with a subsequent null terminator. Also, `char buffer[len+1]` is not allowed in Standard C++. – M.M Jul 19 '17 at 01:13
  • It is guaranteed by the C++ Standard that &buffer[0] points to a buffer of length len, with a subsequent null terminator. - no, there is no such guaranty. – user31264 Jul 19 '17 at 01:17