14

I have a std::string variable, and I need to copy some bytes into it from an array of unsigned chars. I know the index of the first byte and the length.

I can use the std::string::assign function, and I have done so.

But I would like to solve this the right way using the memcpy function.

std::string newString;
memcpy(&newString, &bytes[startIndex], length);

I know this is wrong. I have researched and found some ideas that use std::vector.

Please help me find the most elegant solution to this problem.

Oleksandr Balabanov

3 Answers

24

Since we're just constructing the string, there is a std::string constructor that takes two iterators:

template< class InputIt >
basic_string( InputIt first, InputIt last, 
              const Allocator& alloc = Allocator() );

which we can provide:

std::string newString(&bytes[startIndex], &bytes[startIndex] + length);
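For anyone who wants to try this out, here is a minimal sketch of the iterator-range constructor wrapped in a function. The name `makeString` and its parameters are my own placeholders, not anything from the question, and the caller is assumed to have already checked that `startIndex + length` stays inside the array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): builds a std::string from
// `length` bytes of `bytes`, starting at `startIndex`.
// Assumption: the caller guarantees the range lies inside the array.
std::string makeString(const unsigned char* bytes,
                       std::size_t startIndex,
                       std::size_t length)
{
    assert(bytes != nullptr);
    const unsigned char* first = bytes + startIndex;
    // One-past-the-end pointer: formed by arithmetic, never dereferenced.
    const unsigned char* last = first + length;
    // The iterator-range constructor converts each unsigned char to char.
    return std::string(first, last);
}
```

Called as `makeString(data, 1, 3)` on an array holding `'a'` through `'f'`, this should yield a three-character string. Forming `last` by pointer arithmetic avoids indexing an element that may not exist.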

If we're not constructing the string and are instead assigning to an existing one, you should still prefer to use assign(). That is precisely what that function is for:

oldString.assign(&bytes[startIndex], &bytes[startIndex] + length);

But if you really insist on memcpy() for some reason, then you need to make sure that the string is already large enough to hold the data being copied in. Then just copy into it using &oldString[0] as the destination address:

oldString.resize(length); // make sure we have enough space!
memcpy(&oldString[0], &bytes[startIndex], length);

Before C++11 there was technically no guarantee that strings were stored contiguously in memory, though in practice implementations did this anyway.
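A sketch of the resize-then-memcpy approach as a reusable function, assuming C++11 or later so that contiguous storage is guaranteed. `copyInto` is a made-up name for illustration, and bounds checking on the source array is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the question): overwrites `dest` with
// `length` bytes taken from `bytes`, starting at `startIndex`.
// Assumption: C++11 or later, and the source range is in bounds.
void copyInto(std::string& dest,
              const unsigned char* bytes,
              std::size_t startIndex,
              std::size_t length)
{
    assert(bytes != nullptr);
    dest.resize(length);  // size the string first; reserve() would not be enough
    if (length != 0) {
        // &dest[0] points at the string's contiguous character buffer.
        std::memcpy(&dest[0], bytes + startIndex, length);
    }
}
```

Because the size is set explicitly, embedded zero bytes in the source are copied like any other byte rather than ending the string early.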

Barry
  • IDK. looks like someone didn't like any of the answers – NathanOliver Sep 25 '15 at 17:32
  • I think `&bytes[startIndex + length]` would be more consistent with what you have (no programmer-performed pointer arithmetic). But there's also the string ctor taking a `const char*` and a length: `std::string newString(&bytes[startIndex], length);` – Ryan Haining Sep 25 '15 at 17:39
  • @RyanHaining OP had said that `bytes` was `unsigned char`, in which case that constructor wouldn't apply. For the other point, YMMV. – Barry Sep 25 '15 at 17:44
  • missed the `unsigned`. casting to `const char*` would seem legal in this case, meh. – Ryan Haining Sep 25 '15 at 17:48
  • 1
    @RyanHaining &bytes[startIndex + length] can result in an overflow, because you *access* the element at [startIndex + length] before taking its address. Imagine a vector {'T'}: [startIndex (0) + length (1)] would access the element at index 1 – which is not there. – K. Biermann Jan 22 '17 at 19:01
  • If `string::assign` performs element-wise assignment, then you run into the issue of `unsigned char` -> `char` conversions which are implementation-defined when `char` is signed. At least with `std::memcpy` you know the bit patterns will be preserved. I'm hoping I'm wrong, because I'm encountering a similar issue when copying UTF-8 bytes to `std::string`, the latter still being used heavily in APIs until `std::u8string` becomes mainstream. – Emile Cormier Feb 16 '20 at 23:14
  • @EmileCormier Accessing `s[0]` on an empty string is fine, you get back `\0`. I rolled back your edits. – Barry Feb 18 '20 at 14:02
  • @Barry Oops, forgot about the automatic `\0` terminating character! :-) – Emile Cormier Feb 18 '20 at 18:39
  • I'm a rookie in C++, and I have a question about using memcpy with a string. Are you sure that `memcpy(&oldString[0], &bytes[startIndex], length);` is allowed by C++? Did you consider compatibility with GNU g++ and other compilers, or with different versions of C++? @Barry – zhaozheng Apr 16 '20 at 03:12
  • In other words, do you guarantee that string has contiguous storage? @Barry – zhaozheng Apr 16 '20 at 03:22
  • @zhaozheng `std::string` guarantees contiguous storage, yes. – Barry Apr 16 '20 at 13:39
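On the `unsigned char` to `char` concern raised in the comments: a small check like the one below can confirm that element-wise `assign()` and `memcpy()` give byte-identical results on a given platform. This is only a sketch. `assignMatchesMemcpy` is a hypothetical name. The out-of-range conversion was implementation-defined before C++20, but as far as I know every mainstream two's-complement implementation keeps the bit pattern, and C++20 defines the conversion as modular.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical check (not from the thread): returns true when element-wise
// assign() and memcpy() produce byte-identical strings for the given input.
bool assignMatchesMemcpy(const unsigned char* bytes, std::size_t length)
{
    assert(bytes != nullptr);

    // Element-wise path: each unsigned char is converted to char.
    std::string viaAssign;
    viaAssign.assign(bytes, bytes + length);

    // Raw-copy path: the bit patterns are copied unchanged.
    std::string viaMemcpy(length, '\0');
    if (length != 0) {
        std::memcpy(&viaMemcpy[0], bytes, length);
    }

    return viaAssign.size() == viaMemcpy.size()
        && std::memcmp(viaAssign.data(), viaMemcpy.data(), length) == 0;
}
```

Feeding it bytes above 0x7F, such as a UTF-8 sequence, exercises the case where a signed `char` cannot represent the source value directly.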
1

You need to set the size of the string so that there will be a properly sized buffer to receive the data, and cast the constness out of the pointer you get from data():

std::string newString;
newString.resize(length);
memcpy((char*)newString.data(), &bytes[startIndex], length);

Of course, all of this is in the realm of undefined behavior, but it is pretty standard nonetheless.
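Worth noting for later readers: since C++17, `std::string` has a non-const `data()` overload that returns `char*`, so no cast is needed there. Below is a sketch that uses it when available and falls back to `&result[0]` otherwise. `fromBytes` is a hypothetical name, and the source range is assumed to be in bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the answer): returns a string holding
// `length` bytes of `bytes`, starting at `startIndex`.
// Assumption: the caller guarantees the range lies inside the array.
std::string fromBytes(const unsigned char* bytes,
                      std::size_t startIndex,
                      std::size_t length)
{
    assert(bytes != nullptr);
    std::string result(length, '\0');  // buffer already has the final size
#if __cplusplus >= 201703L
    char* dst = result.data();         // C++17: non-const overload, no cast
#else
    char* dst = &result[0];            // C++11/14: writable pointer to the buffer
#endif
    if (length != 0) {
        std::memcpy(dst, bytes + startIndex, length);
    }
    return result;
}
```

Either branch writes through a pointer to non-const `char`, which sidesteps the `const_cast` on the const `data()` result.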

shoosh
-1

It's a hack and, as you said, the wrong way, but it is possible since the STL guarantees that std::string has contiguous storage:

std::string str(32, '\0');
std::strcpy(const_cast<char*>(str.data()), "REALLY DUDE, IT'S ILLEGAL WAY");

Of course, you can use std::memcpy in the same way (I used strcpy just to copy a null-terminated string)...

In your case:

str.resize(length);
memcpy(const_cast<char*>(str.data()), bytes + startIndex, length);

Nevermore
  • And if the string you're copying has more than 32 bytes? – Barry Sep 25 '15 at 17:35
  • 1
    Of course, you should 'presize' the string so that it fits (pay attention: `resize()`, not `reserve()`!) – Nevermore Sep 25 '15 at 17:37
  • 3
    to clarify what Nevermore means by "illegal way": [*Modifying the character array accessed through data is undefined behavior*](http://en.cppreference.com/w/cpp/string/basic_string/data). Don't do this – Ryan Haining Sep 25 '15 at 17:43
  • It does not follow from `STL guarantees that std::string has contiguous storage` that you can modify what `data()` returns – Slava Sep 25 '15 at 17:59
  • 1
    From `does not follow you can modify what data() returns` it actually doesn't follow that I cannot modify what `data()` returns. This issue **has no legal solution**, and I **pointed out** that it's a hack. Do you want to see another hack like `&front()` instead of `data()`, or what? – Nevermore Sep 25 '15 at 18:09