19

I am working with two libraries. One takes in and returns std::strings, while the other uses std::vector<unsigned char>s.

It would be good if I could steal the underlying arrays from std::string and std::vector<unsigned char> and be able to move them into each other without the excessive copying.

At the moment I use something like:

const unsigned char* raw_memory =
    reinterpret_cast<const unsigned char*>(string_value.c_str());
std::vector<unsigned char>(raw_memory, raw_memory + string_value.size());

And the other way:

std::string(
    reinterpret_cast<const char*>(&vector_value[0]),
    vector_value.size());

It'd be far better to be able to define a:

std::string move_into(std::vector<unsigned char>&&);
std::vector<unsigned char> move_into(std::string&&);
Ivaylo Strandjev
genjix

2 Answers

15

You can initialize either container from the other's iterator range. Have a look here

EDIT: pasting the code so that you don't have to go to ideone. I'm still leaving the link so that you can play around with the code.

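#include <cstddef>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main() {
    string a = "Hello world";
    // Copy the characters into a byte vector via the iterator-range constructor.
    vector<unsigned char> v(a.begin(), a.end());
    for (size_t i = 0; i < v.size(); ++i) {
        cout << v[i] << endl;
    }
    // Copy the bytes back into a string the same way.
    string s(v.begin(), v.end());
    cout << s << endl;
    return 0;
}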
Ivaylo Strandjev
  • 5
    Of course this isn't any different from his solution, regarding the necessary copies (what his question is about) and thus doesn't answer his question in any way. It's much cleaner than his solution, though, but a comment would have sufficed for this. – Christian Rau May 04 '12 at 08:51
  • @Christian why much cleaner? `string& assign (const char* s, size_t n);` should do a memcpy, while `template <class InputIterator> string& assign (InputIterator first, InputIterator last);` emulates it – Liviu May 20 '16 at 13:28
11

This is not possible.

The vector and string classes do not provide a way to steal from anything other than a vector or a string, respectively. They are not meant to exchange content.
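To make the point concrete, here is a minimal sketch (the function names are hypothetical, chosen only for illustration). The first function checks that a vector-to-vector move hands over the existing heap buffer, which is what mainstream implementations do. The second shows that accepting an rvalue vector does not help when building a string, because the only applicable constructor is the iterator-range one, and that one copies.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns true if moving a vector into another vector
// reuses the original heap buffer instead of copying the elements.
bool vector_move_reuses_buffer() {
    std::vector<unsigned char> source(1024, 0x2A);
    const unsigned char* buffer_before = source.data();
    std::vector<unsigned char> target(std::move(source));
    return target.data() == buffer_before;
}

// Hypothetical helper: std::string has no constructor taking
// std::vector<unsigned char>&&, so the bytes are copied element by element
// even though the argument is an rvalue.
std::string copy_despite_move(std::vector<unsigned char>&& bytes) {
    return std::string(bytes.begin(), bytes.end());
}
```

After calling copy_despite_move(std::move(v)), v still holds its elements, since nothing was actually moved out of it.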

The issue is that vector and string may have widely different underlying representations. In gcc, for example, string typically uses the oldish COW (Copy On Write) "optimization", which is very different from the typical vector representation (usually just a triple of pointers/size_t attributes).

If you are dealing with raw bytes, blame the library that decided to put them into string, and refactor it if you can.

Otherwise: copy. The reinterpret_cast should not be necessary if you copy through iterators, because char and unsigned char convert implicitly to each other (and char is already unsigned by default on some platforms).
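A minimal sketch of the copying approach without any reinterpret_cast, assuming a pair of hypothetical helper names (string_to_bytes and bytes_to_string are not part of either library). Each element is converted implicitly between char and unsigned char by the iterator-range constructors.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the characters of a string into a byte vector.
std::vector<unsigned char> string_to_bytes(const std::string& text) {
    return std::vector<unsigned char>(text.begin(), text.end());
}

// Hypothetical helper: copies a byte vector into a string.
// Embedded zero bytes are preserved because the length comes from the range.
std::string bytes_to_string(const std::vector<unsigned char>& bytes) {
    return std::string(bytes.begin(), bytes.end());
}
```

Both functions cost one allocation and one linear copy per call, which is the same price as the pointer-based versions in the question, only with less casting.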

Matthieu M.
  • 1
    C++11 explicitly disallows copy on write, doesn't it? Unless they keep it under the "so long as it behaves as if we complied with the spec" law. I think small string optimization has been the way to go for a while. – Mahmoud Al-Qudsi May 04 '12 at 08:31