7

This is quite a simple question, but I'm finding it tricky. I want to treat a char* as if it were a std::string, for instance:

    char *p = ...; // read a huge chunk from a file

    std::string s(p); // this is not what I want

So, if I use the constructor, I get a copy of p, which is a waste of memory and time. Is it possible somehow to avoid this, and "assign" the std::string content to a pre-existing address?

Any other idea is more than welcome!

Thanks!

senseiwa

5 Answers

14

Is it possible somehow to avoid this, and "assign" the std::string content to a pre-existing address?

No.

However, you can assign it to a std::string_view (available since C++17), which refers to the existing characters without copying or owning them. Going forward, all uses of std::string except to own memory should be replaced by std::string_view.
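
As a rough sketch of that suggestion, assuming a C++17 compiler: the helpers `view_of` and `count_lines` below are hypothetical names made up for illustration, not part of any library. The view just stores the pointer and a length, so nothing is copied, and the original buffer has to outlive the view.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: wrap an existing buffer in a non-owning view.
// No bytes are copied; the caller must keep `p` alive while the view is used.
std::string_view view_of(const char* p, std::size_t len) {
    assert(p != nullptr || len == 0);
    return std::string_view(p, len);
}

// Example of string-like work done directly on the view:
// count the lines in the text without copying any of it.
std::size_t count_lines(std::string_view text) {
    std::size_t lines = 0;
    while (!text.empty()) {
        const std::size_t pos = text.find('\n');
        ++lines;
        if (pos == std::string_view::npos) {
            break;                      // last line had no trailing newline
        }
        text.remove_prefix(pos + 1);    // only moves the view's start pointer
    }
    return lines;
}
```

Passing the length explicitly avoids a strlen scan over the huge buffer and also works when the data contains embedded null bytes.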

Konrad Rudolph
2

std::string does not and cannot possibly support this, as it owns the string buffer.

Which means that it will eventually have to free the memory, or reallocate it in case you change the string to have a different length. If not earlier, then it must do so when the program exits.

Now, what is string supposed to do with some unknown block of memory that it got via a pointer? Is this memory allocated on the heap or on the stack, or maybe readonly memory from the data segment? There is nothing string could possibly do that is valid and won't either leak or cause a crash in one or the other situation.

Damon
  • So, as `std::string` cannot do that, is `std::vector` a viable option? Of course I will need to write some additional functions, but if it's faster (by not copying) I would do that. – senseiwa Oct 11 '13 at 14:09
  • The same is true for `vector`, so no. Vector allocates, reallocates, owns, and eventually frees the store. Therefore it, too, has to copy the data in your `char*` buffer into its own store, there's no other way. It's not possible to own or free (or reallocate) a buffer if you don't know how it was allocated or whether it can be freed at all. Calling `allocator::deallocate` or `operator delete` for that matter (or `free`, which is usually the underlying mechanism) on something that wasn't allocated in the same way almost certainly causes a crash (freeing NULL being the only notable exception). – Damon Oct 11 '13 at 20:37
  • Thanks, that clarified my doubts a lot. – senseiwa Oct 15 '13 at 09:43
  • Why can't we do it the other way around? That is, if I know the length of the string to be read from file upfront, maybe I can resize the string to that value (to make sure it has got the proper amount of memory allocated internally) and then somehow ask the string for the pointer to its internal buffer to read the data directly into it? Inb4 calling it "unsafe" or something: it is not much safer than reading these data byte-by-byte and assigning to the string with its [] operator in a loop (just less cumbersome). – SasQ Jun 11 '15 at 06:33
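
The "other way around" idea from the last comment (size the string first, then read straight into its buffer) can be sketched as follows. This is only an illustration under assumptions: `read_into_string` and `self_check` are hypothetical names, and it relies on std::string storage being contiguous, which is guaranteed since C++11. An in-memory stream stands in for the file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read up to `n` bytes from `in` directly into a
// std::string's own buffer, so no separate char* buffer is ever needed.
std::string read_into_string(std::istream& in, std::size_t n) {
    std::string s(n, '\0');                            // allocate n bytes up front
    in.read(&s[0], static_cast<std::streamsize>(n));   // fill the string's storage
    s.resize(static_cast<std::size_t>(in.gcount()));   // keep only what was read
    return s;
}

// Small self-check that uses an in-memory stream in place of a real file.
void self_check() {
    std::istringstream in("hello world");
    assert(read_into_string(in, 5) == "hello");
}
```

The string owns the memory from the start, so the ownership problem described in the answer does not arise; the cost is that the buffer is zero-filled once before the read overwrites it.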
1

No, because a std::string generally expects more than a char* can provide, most notably the ability to reallocate its storage to a totally different place in memory. Also, a std::string tracks its length explicitly through begin() and end() rather than relying on a null terminator (before C++11 its buffer wasn't even guaranteed to be null-terminated).

But note that std::string and char[] have a lot of interface in common:

  • you can index them with numbers and obtain chars,
  • you can call std::begin and std::end on them and get random access iterators, so that algorithms like sort can operate on them freely.

That's the core of the C++ standard template library: containers and algorithms are separate, and the same algorithm can operate on a std::string and a char[].

Of course char* isn't char[], but OTOH a pair of char*s looks exactly like begin(char[]) and end(char[]), and that is enough to connect it to STL utilities that work in terms of random access iterators.
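
To illustrate the pointer-pair idea, here is a minimal sketch in which `p` and `p + len` play the roles of begin and end. The helper names `sort_chars` and `count_char` are hypothetical and exist only for this example; std::sort and std::count are the standard algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sort the bytes of a raw buffer in place.
// Raw pointers are random access iterators, so std::sort accepts them.
void sort_chars(char* p, std::size_t len) {
    assert(p != nullptr || len == 0);
    std::sort(p, p + len);
}

// Hypothetical helper: count occurrences of `c` in the buffer, no copy made.
std::size_t count_char(const char* p, std::size_t len, char c) {
    return static_cast<std::size_t>(std::count(p, p + len, c));
}
```

Nothing here allocates or copies; the algorithms work on the memory the char* already points to.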

Kos
1

No, the widespread std::string implementations do not implement such a feature. Even placement new does not help, because the internal members can change from one implementation to another, from one version to another, or depending on some #define... There is also the option to provide your own std::string allocator, but this does not seem to be the way to deal with this kind of issue...

Yes, some string implementations allow buffer reuse, such as the RFA_string from the Reuters Foundation API.

This idea has already been treated on some other questions/answers:

Moreover, there is also the rope data structure, as in the SGI STL used by boost...

Community
oHo
-3

Why not use std::vector<char>? For example:

    #include <vector>

    std::vector<char> data;
    data.resize(size);  // resize this to how much you need

    char* p = data.data();  // safe even when the vector is empty, unlike &data[0]

    // now you have a pointer to the internal data in std::vector
Björn Pollex
Mihai Sebea