
I discovered that sizeof(std::string) is 28 bytes with VC and 32 bytes with GCC (not counting the character data allocated on the heap). This makes me very skeptical about the efficiency of using a string as a parameter of a C++ constructor, so I ran a little experiment:

#include <iostream>
#include <string>
#include <utility>

class Foo
{
    static int inc;
    int val;
    int id;
public:
    Foo(int value) :val(value), id(Foo::inc++) {
        std::cout << to_string() << " being constructed\n";
    }
    ~Foo() {
        std::cout << to_string() << " being destroyed\n";
    }
    Foo(Foo const &other) :Foo(other.val) {
        std::cout << other.to_string() << " being copied\n";
    }
    Foo& operator=(Foo const &other) {
        val = other.val;
        std::cout << other.to_string() << " being copied\n";
        return *this;  // a non-void function must return a value
    }
    Foo(Foo &&other) :Foo(other.val) {
        std::cout << other.to_string() << " being moved\n";
    }
    Foo& operator=(Foo &&other) {
        val = other.val;
        std::cout << other.to_string() << " being moved\n";
        return *this;  // a non-void function must return a value
    }
    int value() const noexcept { return val; }
    std::string to_string() const {
        return std::string("Foo_") + std::to_string(id) + "_(" + std::to_string(val) + ")";
    }

};
int Foo::inc = 0;


struct Bar {
    Bar(Foo const &foo) :val(foo) {}
    Bar(Foo &&foo) :val(std::move(foo)) {}
private:
    Foo val;
};

//------- run it ------
Bar bar {42};

This is the result I got:

Foo_0_(42) being constructed
Foo_1_(42) being constructed
Foo_0_(42) being moved
Foo_0_(42) being destroyed
Foo_1_(42) being destroyed

Obviously, a temporary Foo instance was created at the call site for the ctor's argument. So I suppose that when I pass a const char* to a constructor that expects a string, the same thing happens, i.e. a temporary string of at least 28/32 bytes is created and dropped immediately after being moved or copied. That kind of cost makes me uncomfortable. Don't get me wrong, I will use a string as a data member of the class; it's only the parameter type of the ctor that worries me.

Anyhow, if I replace the string parameter with a const char*, given that I can always resort to string::c_str(), I assume I would never have to pay such a cost, would I? I'd like to hear your perspective. If there is anything wrong with my analysis, please point it out. If there is some mechanism that can eliminate the cost of the temporary string, please teach me how it works. If the temporary string is inevitable, do I have to overload the ctor for both const char* and string&&, one to avoid the temporary string and one to avoid a deep copy of an rvalue string? Thanks in advance!

Need4Steed
  • 1
    Is this the performance bottleneck in your program? – melpomene May 06 '17 at 08:39
  • 1
    It's more likely to make a mistake with raw pointers than it is for `string` to cause you significant performance problems. – Weak to Enuma Elish May 06 '17 at 08:43
  • 2
    Use `std::string` as a parameter and `std::move` it into your class member. The `sizeof` you're seeing is the size that will be allocated on the stack, and has nothing to do with the heap. Stack allocating is extremely fast, and shouldn't be of any concern to you – Steve Lorimer May 06 '17 at 08:48
  • 4
    Things that are wrong with your analysis: 1.) You disregard compiler optimisations that may completely eliminate moves or copies, 2.) You do not consider `std::string_view` or `std::string const&`, 3.) You do not consider SSO (Small-String Optimisation), 4.) You invest time and resources into some isolated tiny piece of code which may well have no measurable effect *at all* on an application's performance, especially since no I/O is involved. Do you need this for some kind of public library? – Christian Hackl May 06 '17 at 08:52

2 Answers

3

std::string ain't the cheapest class memory-wise: because of the small string optimization, the object itself carries an inline buffer. That makes it fair to question its use, though it also has nice API functions and a lot of safety/usability.

Should you worry?

If the code ain't performance critical, don't worry! Gaining a couple of microseconds over the lifetime of your program ain't worth the effort.

Instead, run a profiler over your code and fix the bottlenecks you can find.

Passing std::string by value/const ref

Assuming you are using an optimizing compiler and pass -O2, -O3, ... the overhead can be removed by the compiler in a lot of cases. If needed, implement the constructor in the header so that it can be inlined. If you pass by value, don't forget the std::move.
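To make the difference between the two passing styles concrete, here is a minimal sketch that counts copies and moves. Tracked, Counts, ByValue, ByConstRef and measure are hypothetical names invented for this illustration; Tracked only stands in for std::string so the operations can be counted, and the numbers assume the usual copy elision (guaranteed since C++17).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical counters, for illustration only.
struct Counts { int copies = 0; int moves = 0; };
Counts g_counts;

// Hypothetical stand-in for std::string that records how it is passed around.
struct Tracked {
    std::string text;
    Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& o) : text(o.text) { ++g_counts.copies; }
    Tracked(Tracked&& o) noexcept : text(std::move(o.text)) { ++g_counts.moves; }
};

// Parameter taken by value, then moved into the member.
struct ByValue {
    explicit ByValue(Tracked t) : member(std::move(t)) {}
    Tracked member;
};

// Parameter taken by const reference, then copied into the member.
struct ByConstRef {
    explicit ByConstRef(const Tracked& t) : member(t) {}
    Tracked member;
};

// Resets the counters, performs one construction, and returns what it cost.
template <class Holder, class Arg>
Counts measure(Arg&& arg) {
    g_counts = Counts{};
    Holder h(std::forward<Arg>(arg));
    (void)h;
    return g_counts;
}

void demo() {
    // Literal: the parameter is built in place, then moved into the member.
    Counts literal = measure<ByValue>("abc");
    assert(literal.copies == 0 && literal.moves == 1);

    // Named object: one copy into the parameter, one move into the member.
    Tracked named("abc");
    Counts lvalue = measure<ByValue>(named);
    assert(lvalue.copies == 1 && lvalue.moves == 1);

    // Rvalue: one move into the parameter, one move into the member.
    Counts rvalue = measure<ByValue>(std::move(named));
    assert(rvalue.copies == 0 && rvalue.moves == 2);

    // Const reference: always exactly one copy into the member.
    Counts cref = measure<ByConstRef>("abc");
    assert(cref.copies == 1 && cref.moves == 0);
}
```

In this sketch the by-value version never pays more than one copy, and for literals and rvalues it pays only moves, which is why the std::move in the constructor matters.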

Passing std::string_view

In newer versions of the standard library (C++17), std::string_view is available; if not, you can easily copy it from GSL ... This class is similar to a raw char pointer (+ size) and can be used in places where you'd like to pass strings without needing a std::string.

In the end, if you want to store the string, you will still have to convert it to a std::string.
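As a rough sketch of that approach (assuming a C++17 compiler), a constructor can take std::string_view and make the one unavoidable copy when initializing the member. Person and name_ are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical class that stores a name; not taken from the question.
class Person {
public:
    // string_view binds to literals, const char* and std::string without
    // creating a temporary std::string. The copy into name_ happens here.
    explicit Person(std::string_view name) : name_(name) {}

    const std::string& name() const noexcept { return name_; }

private:
    std::string name_;
};

void demo() {
    Person a("Ada");              // from a string literal
    std::string s = "Grace";
    Person b(s);                  // from an existing std::string
    const char* p = "Linus";
    Person c(p);                  // from a raw pointer

    assert(a.name() == "Ada");
    assert(b.name() == "Grace");
    assert(c.name() == "Linus");
    assert(s == "Grace");         // the source string is untouched
}
```

One trade-off to keep in mind: a string_view parameter cannot steal the buffer of an rvalue std::string, so a caller that hands over a temporary string still pays for a copy.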

JVApen
  • this is the correct answer. stop worrying and use std/boost::string_view. – Richard Hodges May 06 '17 at 10:26
  • @RichardHodges the OP explicitly says he wants to copy the string into a data member. string_view won't help in this instance, as it's a non-owning view on the source string. I don't believe taking a string_view as the only parameter is the correct (in all instances) answer. When taking a parameter in order to store it as a string member, surely taking a string by value and moving it into the member is the optimal solution? – Steve Lorimer May 06 '17 at 20:06
  • StringView has a toString method which can be used to create a std::string – JVApen May 06 '17 at 21:33
  • @SteveLorimer if you look at the number of constructors that will be called, and copies of string data, you'll find that a string view is more efficient. You can't escape copying the string literal, but a string_view is merely a proxy for that – Richard Hodges May 07 '17 at 00:17
2

Note that your question explicitly makes reference to constructor arguments, so this answer refers to those.

Other reasoning applies for assignment operators and setters. (See this question for further details)


If you're going to be storing the string in your object as a data member, then the creation of a string will inevitably have to occur.

As such, the easiest option is to perform the creation of the string as a side-effect of your constructor arguments, and then move it into your data member.

In the following examples, no string buffer is allocated only to be thrown away. The parameter is constructed once from the argument, and whatever it allocated ends up in data_ (by using move).

#include <string>
#include <utility>

class Foo
{
public:
    Foo(std::string data)
        : data_(std::move(data))
    {}
private:
    std::string data_;
};

int main()
{
    const char* foo = "...";
    Foo a(foo);

    std::string bar = "...";
    Foo b(std::move(bar));

    Foo c("...");
}
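If even the extra move is unwanted, one possible alternative (not part of the answer above) is a forwarding constructor that builds the member in place, similar in spirit to emplace_back. The sketch below uses hypothetical names throughout: Stats, Counted, in_place_tag and Holder exist only for this illustration, and Counted stands in for any member type.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, for illustration only.
struct Stats { int direct = 0; int copies = 0; int moves = 0; };
Stats g_stats;

// Hypothetical member type that records how it gets constructed.
struct Counted {
    int value;
    explicit Counted(int v) : value(v) { ++g_stats.direct; }
    Counted(const Counted& o) : value(o.value) { ++g_stats.copies; }
    Counted(Counted&& o) noexcept : value(o.value) { ++g_stats.moves; }
};

// Hypothetical tag that keeps the forwarding constructor from competing
// with Holder's own copy and move constructors.
struct in_place_tag {};

class Holder {
public:
    // Forwards the arguments straight to the member's constructor,
    // so no temporary Counted is created at the call site.
    template <class... Args>
    explicit Holder(in_place_tag, Args&&... args)
        : member_(std::forward<Args>(args)...) {}

    int value() const { return member_.value; }

private:
    Counted member_;
};

void demo() {
    g_stats = Stats{};
    Holder h(in_place_tag{}, 42);
    assert(h.value() == 42);
    assert(g_stats.direct == 1);                        // built once, in place
    assert(g_stats.copies == 0 && g_stats.moves == 0);  // never copied or moved
}
```

The cost is a template in the interface and less readable error messages, so this is probably only worth it for types that are expensive to move.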

In terms of your sizeof analysis, note that this size has nothing to do with the dynamic (heap) storage std::string will sometimes allocate to store the character data.

This is just the size of the internal data members, and these are allocated on the stack (automatic storage duration), which is extremely fast.

Stack allocation typically shouldn't be of any concern to you.
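The distinction between the object's own footprint and its heap buffer can be checked with a small sketch like the one below. stored_inline is a hypothetical helper written for this illustration; whether a short string is stored inside the object depends on the standard library implementation, so the example prints that result rather than asserting it.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical helper: reports whether the character buffer of `s` lies
// inside the std::string object itself (i.e. no separate heap buffer).
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    // std::less / std::less_equal give a total order over unrelated pointers.
    std::less_equal<const char*> le;
    std::less<const char*> lt;
    return le(obj_begin, s.data()) && lt(s.data(), obj_end);
}

void demo() {
    // The object's own size is fixed, whatever the string's length.
    std::cout << "sizeof(std::string) = " << sizeof(std::string) << '\n';

    // Implementation-dependent: true on libraries that use SSO.
    std::cout << "\"hi\" stored inline: " << std::boolalpha
              << stored_inline(std::string("hi")) << '\n';

    // A long string cannot fit in the object, so its data lives elsewhere.
    std::string big(1000, 'x');
    assert(!stored_inline(big));
    std::cout << "1000 chars stored inline: " << stored_inline(big) << '\n';
}
```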

Steve Lorimer
  • 1
    This was thought to be the preferred technique when C++11 was new, but in the meanwhile, I think the community has discovered that it's actually just premature optimisation. You should just continue to use the good old `std::string const&`, or use the new `std::string_view` if you can use C++17. – Christian Hackl May 06 '17 at 08:56
  • @ChristianHackl even in the event that you're storing the argument in a data member? – Steve Lorimer May 06 '17 at 08:57
  • Yes. See http://stackoverflow.com/questions/26261007/why-is-value-taking-setter-member-functions-not-recommended-in-herb-sutters-cpp for a very lengthy discussion on this whole topic. – Christian Hackl May 06 '17 at 09:01
  • 2
    I wouldn't recommend this. If `Foo::Foo` were to throw, the caller would not be able to recover the value of a `string` that they moved into the constructor. If you need the optimization, it's better to add an overload taking `string&&` in addition to the one taking `string const&`. – Joseph Thomson May 06 '17 at 09:03
  • @ChristianHackl from the accepted answer my take is then *do* use by-value constructors, but for assignment operators and other setters, use const-ref. – Steve Lorimer May 06 '17 at 09:05
  • @JosephThomson the only way you'd be able to destroy the source value is by moving it into the constructor in the first place (`Foo b` in my example). Once you've moved from something, you can no longer rely on its value (a moved-from string is valid but unspecified). If the caller needs to preserve the source value in the face of an exception, then he can't use move. Taking the parameter by value in the constructor changes nothing for the caller – Steve Lorimer May 06 '17 at 09:08
  • @JosephThomson put another way, the constructor takes the string by value - the copy is created there and it is that object on which the move is performed. The caller decides whether or not to move the source string into the constructor – Steve Lorimer May 06 '17 at 09:14
  • My point is that the `string&&` overload can wait until there is no longer any possibility of an exception being thrown before actually performing the move. Thus, the caller's value is left intact if the constructor throws. This is not possible when taking by value. – Joseph Thomson May 06 '17 at 09:20
  • @JosephThomson ok I get what you're saying, but I think that's quite dangerous. You're creating a situation where it would be safe to `std::move(input)` in your call to the constructor (so `Foo(string&&)` gets called), and in the event of an exception, `input` would be in a valid and unmoved-from state. I would be highly suspicious of any code which relied on `std::move(data)` leaving `data` in a safe state *after* the move because of some careful machinations in the constructor. If you need your object to be valid in the face of an exception, don't move it. – Steve Lorimer May 06 '17 at 09:31
  • I might not have made myself clear. What I wanna get is the emplace-like efficiency. Take the `Foo` in my question. A `vectorOfFoo.emplace_back(42)` creates the Foo in place and makes no temp foo. Instead, if I call `vectorOfFoo.push_back(42)` a temp foo will be created. I did enable the -O2 option of the compiler. – Need4Steed May 06 '17 at 10:09
  • @Need4Steed your question and sample code are at odds. In your sample code there is no `std::string` parameter in your `Foo` constructor. That said, if you are talking about constructor only (ie: creating a *new* object, not copying into an existing object), and you're *storing* a `std::string` member in your object, then what I have described is most efficient – Steve Lorimer May 06 '17 at 10:16
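The overload pair discussed in the comments above (one constructor taking `string const&`, one taking `string&&`) can be sketched roughly as follows. Label, validated and the 64-character limit are hypothetical and exist only to give the constructor something that can throw before the move takes place.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class; the validation rule is made up for illustration.
class Label {
public:
    // Lvalues and const strings: validate, then copy.
    explicit Label(const std::string& text) : text_(validated(text)) {}

    // Rvalues: validate first, and move only once nothing can throw.
    explicit Label(std::string&& text) : text_(validated(std::move(text))) {}

    const std::string& text() const noexcept { return text_; }

private:
    static void check(const std::string& s) {
        if (s.size() > 64) throw std::invalid_argument("label too long");
    }
    static const std::string& validated(const std::string& s) {
        check(s);
        return s;
    }
    // Returns the same rvalue reference; nothing is moved until text_ is built.
    static std::string&& validated(std::string&& s) {
        check(s);
        return std::move(s);
    }

    std::string text_;
};

void demo() {
    // Failure path: the exception fires before any move, so the caller's
    // string still holds its original value.
    std::string too_long(100, 'x');
    bool threw = false;
    try {
        Label bad(std::move(too_long));
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
    assert(too_long.size() == 100);

    // Success path: the buffer is moved into the member.
    std::string ok = "a perfectly reasonable label";
    Label good(std::move(ok));
    assert(good.text() == "a perfectly reasonable label");
}
```

With a by-value parameter, the caller's string would already have been moved into the parameter before the constructor body or initializers could throw, which is the difference the comment thread is arguing about.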