4

So at the moment I have:

std::string a;
std::unique_ptr<char[]> b(std::make_unique<char[]>(a.size() + 1));
std::copy(std::begin(a), std::end(a), b.get());

Is it possible to initialize this directly in one step?

JeJo
user3684792
  • 3
    Why not use `std::string` instead? You can use it as a `char[]` through `std::string::data()`. – Thomas Aug 10 '20 at 11:22
  • You can probably use `auto` to save from writing the type twice. – Nathan Chappell Aug 10 '20 at 11:24
  • 3
    ha ha, I knew someone would tell me to use std::string, as seems natural. Unfortunately, my original implementation for the problem I am working on did store a number of class members as strings, each of which would likely be at most a couple of characters long. A senior engineer took umbrage at the fact that sizeof(string) is 24 bytes on our system, which could cause us to run out of memory. Hence the awkward shuffle into pointerland... – user3684792 Aug 10 '20 at 11:27
  • 3
    @user3684792: Inform the senior engineer that `new[]` has an invisible overhead for its internal datastructures. Also inform that engineer that `std::string` typically uses the small-string optimization which improves locality of reference and thus speed. `new[]` hurts the CPU cache. – MSalters Aug 10 '20 at 11:32
  • @Thomas [No it is not](https://godbolt.org/z/fcr3Wf). `std::vector` usually stores `begin`, `end`/`size` and `capacity_end`/`capacity`, thus 3 pointer widths, the same as `std::string`. – danielschemmel Aug 10 '20 at 11:36
  • @gha.st Must be the heat, thanks for the correction. – Thomas Aug 10 '20 at 11:40
  • @MSalters I agree with you both that this isn't an ideal solution, and it creates headaches using pointers to do this (especially unique_ptr which given its explicit ownership of memory needs to be moved around all over the place) but unfortunately it is what we have to do. The speed aspect here is less important than saving a few bytes. – user3684792 Aug 10 '20 at 11:44
  • 1
    This fills the data under `b` with zeroes before overwriting it. You may want to use a `new[]` to avoid that. (C++20 has `std::make_unique_for_overwrite`, too.) (Or maybe your compiler just optimizes it itself...) – HTNW Aug 10 '20 at 11:48
  • 1
    @user3684792: That's the first part of my comment: measure, don't assume. Your senior engineer is more worried about the sizeof() he sees than the unknown and invisible overhead. Remember, `new char[]` has to be aligned for the largest type possible as you _could_ use it for placement new. The compiler won't know you're using it for strings. – MSalters Aug 10 '20 at 11:48
  • @MSalters Thanks for your comments. Is there a good resource to learn about this? – user3684792 Aug 10 '20 at 11:57
  • 1
    _The speed aspect here is less important than saving a few bytes._ Your platform may vary, but on my platform every array allocation has an additional 32-byte overhead, plus 16-byte "alignment". So if you allocate `new char[1];` ten times, you'll see that the allocations are 16 bytes apart from each other (the 32-byte overhead is located in a separate memory structure as part of the heap bookkeeping). Plus, if underrun and overrun detection is enabled (which, granted, is a debug mode thing), there are another 8 bytes on each side of the allocation. If you REALLY need to manage bytes, use your own custom pool. – Eljay Aug 10 '20 at 12:46
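
To illustrate the zero-filling point raised in the comments: `std::make_unique<char[]>(n)` value-initializes the buffer, while a plain `new char[n]` leaves it uninitialized. Below is a minimal sketch of the `new[]` variant, assuming C++11 or later. The function names `to_unique_cstr_no_zero_fill` and `check_no_zero_fill` are hypothetical and not from the original post. Whether this saves anything measurable depends on the compiler and is not something the thread establishes.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <memory>     // std::unique_ptr
#include <string>

// Hypothetical helper (name invented for illustration).
// Allocates with new[] so the buffer is NOT zero-filled first.
std::unique_ptr<char[]> to_unique_cstr_no_zero_fill(const std::string& s)
{
    // new char[n] default-initializes: the bytes are indeterminate until written.
    std::unique_ptr<char[]> buf(new char[s.size() + 1]);
    std::copy(s.begin(), s.end(), buf.get());
    // Nothing zeroed the buffer, so the terminator must be written explicitly.
    buf[s.size()] = '\0';
    return buf;
}

// Small self-check: the copy round-trips and is null-terminated.
void check_no_zero_fill()
{
    const std::string a("whatever");
    const auto b = to_unique_cstr_no_zero_fill(a);
    assert(std::string(b.get()) == a);
    assert(b[a.size()] == '\0');
}
```

With C++20, `std::make_unique_for_overwrite<char[]>(n)` mentioned above gives the same uninitialized allocation without a naked `new`.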

2 Answers

5

Is it possible to initialize this directly in one step?

I would suggest keeping it as std::string or std::vector<char>.

However, if you really insist, yes! This can be done with an immediately invoked lambda.

std::unique_ptr<char[]> b = [&a]() {
   auto temp(std::make_unique<char[]>(a.size() + 1));
   std::copy(std::begin(a), std::end(a), temp.get());
   return temp;
}(); // invoke the lambda here!

The returned temp is moved into b (or the move is elided altogether).

(See a Demo)
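
If the same conversion is needed in several places, the lambda body could also live in a named function, which keeps each call site to a single initialization. The following is only a sketch under that assumption. The names `to_unique_cstr` and `check_to_unique_cstr` are hypothetical and do not come from the original post.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <memory>     // std::unique_ptr, std::make_unique
#include <string>

// Hypothetical helper (name invented for illustration): copies the characters
// of `s` into a freshly allocated, null-terminated buffer.
std::unique_ptr<char[]> to_unique_cstr(const std::string& s)
{
    // make_unique<char[]> value-initializes, so the extra byte is already '\0'.
    auto buf = std::make_unique<char[]>(s.size() + 1);
    std::copy(s.begin(), s.end(), buf.get());
    return buf;
}

// Small self-check showing the one-step initialization at the call site.
void check_to_unique_cstr()
{
    const std::string a("whatever");
    const std::unique_ptr<char[]> b = to_unique_cstr(a);  // one step
    assert(std::string(b.get()) == a);
}
```

The behavior matches the lambda version; the only difference is that the allocation and copy are reusable under a name.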


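If the string a will not be used later, you could copy from it through std::make_move_iterator. Note, however, that the elements are plain chars, and moving a char is the same as copying it, so this does not transfer the string's buffer and gains nothing over a plain std::copy.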

#include <iterator>  // std::make_move_iterator

std::unique_ptr<char[]> b(std::make_unique<char[]>(a.size() + 1));
std::copy(std::make_move_iterator(std::begin(a)),
   std::make_move_iterator(std::end(a)), b.get());

If that needs to be in one step, wrap it in a lambda as above.
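
As a quick check of what the move iterators actually do for `char` elements, the sketch below copies through them and then inspects the source string. The function name `source_survives_move_copy` is hypothetical. Since moving a `char` is just a copy, the source string keeps its contents.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <iterator>   // std::make_move_iterator
#include <memory>     // std::unique_ptr, std::make_unique
#include <string>

// Hypothetical check (name invented for illustration): copies through move
// iterators and reports whether the source string is left unchanged.
bool source_survives_move_copy()
{
    std::string a("whatever");
    auto b = std::make_unique<char[]>(a.size() + 1);

    // Dereferencing a move_iterator over chars yields char&&, but assigning
    // from a char rvalue simply copies the value.
    std::copy(std::make_move_iterator(a.begin()),
              std::make_move_iterator(a.end()), b.get());

    assert(std::string(b.get()) == "whatever");
    return a == "whatever";  // the source still holds its characters
}
```

So the move-iterator version and the plain `std::copy` version do the same work; neither takes over the string's existing allocation.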

JeJo
  • OK I will not do this, but it is good to know that there isn't an obvious cleaner method. – user3684792 Aug 10 '20 at 11:45
  • @user3684792 Of course, there are better options, as others pointed out in the comments. Also, if the string is not going to be used later, you could move it into `b` too. Maybe I will update with that. – JeJo Aug 10 '20 at 11:47
  • 2
    The lambda should not be `noexcept` since it allocates memory, right? – Quimby Aug 10 '20 at 11:54
  • 1
    The make move iterator is nice, and I will definitely use that – user3684792 Aug 10 '20 at 11:58
  • @Quimby Done. Thanks for pointing out. `noexcept` lambda is a kind of habit of mine. – JeJo Aug 10 '20 at 11:59
  • 1
    @JeJo No problem, not a bad habit at all. – Quimby Aug 10 '20 at 12:00
  • How will the `move_iterator` help here? Those are `char`s you're moving from, not the `std::string` itself, right? cc @user3684792 – bogdan Aug 10 '20 at 16:04
1

Here's a variation using strdup and a custom deleter.

Note the use of char as first template parameter to std::unique_ptr rather than char[] since strdup will be giving back a char*.

The custom deleter is used to free memory rather than delete it, since strdup will use some flavor of malloc rather than new to allocate memory.

And you certainly don't need to use the typedef (or using, if you prefer) for CustomString here; it's just provided for sake of brevity.

#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

int main()
{
    // Some reusable utility code for a custom deleter to call 'free' instead of 'delete'.
    struct CustomDeleter
    {
        void operator()(char* const p) const
        {
            std::free(p);
        }
    };
    typedef std::unique_ptr<char, CustomDeleter> CustomString;

    const std::string a("whatever");
    // "Concise" one step initialization...
    // Note: _strdup is the MSVC spelling; POSIX systems provide it as strdup.
    const CustomString b(_strdup(a.c_str()));

    return 0;
}

Not advocating this as an ideal way to do this, but wanted to share as "a way" to do the one step initialization you asked for.

Phil Brubaker