Should I return std::strings?

Question

I'm trying to use std::string instead of char* whenever possible, but I worry I may be degrading performance too much. Is this a good way of returning strings (no error checking for brevity)?

std::string linux_settings_provider::get_home_folder() {
    return std::string(getenv("HOME"));
}

Also, a related question: when accepting strings as parameters, should I receive them as const std::string& or const char*?

Thanks.

Nitpick: getenv() returns NULL if the variable does not exist, and passing NULL to the std::string constructor is undefined behavior (some implementations, such as libstdc++, throw std::logic_error). – Tom, Jun 23 '09 at 12:41
Thanks. The production code does check for NULLs, but I omitted it for clarity. — Pedro d'Aquino, Jun 23 '09 at 12:45
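A minimal sketch of the NULL check discussed in these comments. The helper name `env_or`, its fallback parameter, and the "/" default are hypothetical and not taken from the asker's production code; it assumes that falling back to a default value is an acceptable way to handle a missing variable.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of an environment variable,
// or `fallback` when the variable is not set.
std::string env_or(const char* name, const std::string& fallback) {
    assert(name != nullptr);                // precondition of this sketch
    const char* value = std::getenv(name);  // may be a null pointer
    // The pointer is checked before it ever reaches the std::string
    // constructor, which must not be given a null pointer.
    return value ? std::string(value) : fallback;
}

// Hypothetical rewrite of get_home_folder() from the question.
std::string get_home_folder() {
    return env_or("HOME", "/");
}

// Usage sketch: a variable name assumed not to exist yields the fallback.
void example_env() {
    assert(env_or("GR_EXAMPLE_SURELY_UNSET_VAR", "fallback") == "fallback");
}
```

Throwing an exception or returning an empty string would work just as well; the point is only that the check happens before the string is built.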

score 64 · Accepted Answer · answered Jun 23 '09 at 12:30


Return the string.

I think the better abstraction is worth it. Until you can measure a meaningful performance difference, I'd argue that it's a micro-optimization that only exists in your imagination.

It took many years to get a good string abstraction into C++. I don't believe that Bjarne Stroustrup, so famous for his conservative "only pay for what you use" dictum, would have permitted an obvious performance killer into the language. Higher abstraction is good.


duffymo


Thanks. I was a bit afraid it was considered bad practice, but I'm glad to see it isn't :-) – Pedro d'Aquino Jun 23 '09 at 12:39
3

Remember you can always use references where appropriate to avoid unneeded copies. I try to take input parameters as "const std::string&" where possible – ShoeLace Jun 23 '09 at 13:37
4

"It took many years to get a good string abstraction into C++." IMHO it still sucks. – Johan Kotlinski Jun 23 '09 at 13:51
How so? Still an improvement over char *. – duffymo Jun 23 '09 at 14:30
12

I don't think that allowing the perfect to be the enemy of the good is a wise strategy. Waiting for perfect software isn't the answer. – duffymo Jun 23 '09 at 14:31
I sometimes accept a const char* as a parameter but return std::string. It's a preference for me, and seems to make the code a little easier to work with, though there's no reason I can think of why you couldn't wrap your const char* in a std::string and pass it as a parameter. I guess it's up to you how you want to do it (though returning a const char* is generally not wise IMO). – David Peterson Dec 25 '12 at 05:42
Isn't it better for member functions or static variables to return a (potentially const) reference to the `std::string` instead? – KeyC0de Apr 06 '22 at 20:31
A const std::string would be a good idea. 13 years later - an improvement. – duffymo Apr 06 '22 at 21:21

Steve Jessop · Answer 2 · 2012-11-16T09:26:51.470

Return the string, like everyone says.

when accepting strings as parameters, should I receive them as const std::string& or const char*?

I'd say take any const parameters by reference, unless either they're lightweight enough to take by value, or in those rare cases where you need a null pointer to be a valid input meaning "none of the above". This policy isn't specific to strings.

Non-const reference parameters are debatable, because from the calling code (without a good IDE), you can't immediately see whether they're passed by value or by reference, and the difference is important. So the code may be unclear. For const params, that doesn't apply. People reading the calling code can usually just assume that it's not their problem, so they'll only occasionally need to check the signature.

In the case where you're going to take a copy of the argument in the function, your general policy should be to take the argument by value. Then you already have a copy you can use, and if you would have copied it into some specific location (like a data member) then you can move it (in C++11) or swap it (in C++03) to get it there. This gives the compiler the best opportunity to optimize cases where the caller passes a temporary object.

For string in particular, this covers the case where your function takes a std::string by value, and the caller specifies as the argument expression a string literal or a char* pointing to a nul-terminated string. If you took a const std::string& and copied it in the function, that would result in the construction of two strings.
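A minimal sketch of the pass-by-value-then-move policy described above, assuming C++11. The class `user` and its member names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only to illustrate the idiom.
class user {
public:
    // Taken by value: the argument is copied (lvalue) or constructed/moved
    // (temporary, string literal) into `name`, then moved into the member.
    explicit user(std::string name) : name_(std::move(name)) {}

    // Read-only access needs no copy, so a const reference is returned.
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Usage sketch.
void example_sink() {
    user from_literal("pedro");  // one std::string is built, then moved
    std::string s = "tom";
    user from_lvalue(s);         // s is copied once and left unchanged
    assert(from_literal.name() == "pedro");
    assert(from_lvalue.name() == "tom");
    assert(s == "tom");
}
```

Had the constructor taken a `const std::string&` and copied it into the member, the string-literal call would have built a temporary string and then copied it, which is the two-string case mentioned above.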

score 12 · Answer 3 · answered Jun 23 '09 at 13:46


The cost of copying strings by value varies based on the STL implementation you're working with:

std::string under MSVC uses the short string optimisation, so that short strings (< 16 characters iirc) don't require any memory allocation (they're stored within the std::string itself), while longer ones require a heap allocation every time the string is copied.
std::string under GCC uses a reference counted implementation: when constructing a std::string from a char*, a heap allocation is done every time, but when passing by value to a function, a reference count is simply incremented, avoiding the memory allocation.

In general, you're better off just forgetting about the above and returning std::strings by value, unless you're doing it thousands of times a second.
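To see which strategy a given standard library uses, a rough check like the following can help. It is a sketch, not a portable guarantee: the function name `stored_inline` is hypothetical, and it assumes that `std::uintptr_t` is available and that `sizeof(std::string)` is well under 1000 bytes, which holds on common implementations.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns true when the character buffer of `s` lies inside the
// std::string object itself, i.e. no separate heap buffer is in use.
bool stored_inline(const std::string& s) {
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(&s);
    const std::uintptr_t end = begin + sizeof(std::string);
    const std::uintptr_t data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= begin && data < end;
}

// Usage sketch.
void example_sso() {
    std::string small = "abc";     // result depends on the implementation
    std::string large(1000, 'x');  // assumed too big to fit in the object
    (void)stored_inline(small);    // true with SSO, false otherwise
    assert(!stored_inline(large));
}
```

The result for the short string is deliberately not asserted, since it differs between a short-string-optimized implementation and a reference-counted one.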

re: parameter passing, keep in mind that there's a cost from going from char*->std::string, but not from going from std::string->char*. In general, this means you're better off accepting a const reference to a std::string. However, the best justification for accepting a const std::string& as an argument is that then the callee doesn't have to have extra code for checking vs. null.


jskinner


1

Wouldn't this mean I'm better off accepting a const char*? If my client has a std::string he can c_str() it, which, as you said, doesn't cost much. On the other hand, if he has a char*, he's forced to build a std::string. – Pedro d'Aquino Jun 23 '09 at 14:06
1

Brian: GCC most certainly does use a reference counted string implementation, have a read of /usr/include/c++/4.3/bits/basic_string.h, for example. – jskinner Jun 23 '09 at 22:32
Pedro: If you're writing a function that only needs a const char*, then yes, you're clearly better off accepting a const char*. If the function needs it as a std::string, then it's better off as that. My comment was more in relation to the cases where you don't know which you need (e.g., when writing an interface class). – jskinner Jun 23 '09 at 22:35
@Brian - RTFCode, it's plain as day. GCC still uses reference-counting. – Tom Jun 24 '09 at 01:28
Wow, I was totally wrong. Sorry about that. I recall reading an in-depth article about the failures of reference counted strings, and how that it is actually more efficient to go with a non-referenced counted solution. I must have dreamed it all. – Brian Neal Jun 24 '09 at 21:33
It looks like the implementation of std::string in gcc changed in version 5, so that it no longer uses reference counting and uses the short string optimization like MSVC. Search for std::string in https://gcc.gnu.org/gcc-5/changes.html. – Andrew Bainbridge Jul 26 '15 at 18:48

score 10 · Answer 4 · edited Jun 23 '09 at 12:35


Seems like a good idea.

If this is not part of real-time software (like a game) but a regular application, you should be more than fine.

Remember, "Premature optimization is the root of all evil"

edited Jun 23 '09 at 12:35

plinth


answered Jun 23 '09 at 12:31

kostia


AareP · Answer 5 · 2009-06-23T18:41:40.077

It's human nature to worry about performance, especially when the programming language supports low-level optimization. What we shouldn't forget as programmers, though, is that program performance is just one thing among many that we can optimize and admire. In addition to program speed, we can find beauty in our own performance: we can minimize our effort while trying to achieve maximum visual output and user-interface interactivity. Don't you think that could be more motivating than worrying about bits and cycles in the long run? So yes, return strings. They minimize your code size and your effort, and make the amount of work you put in less depressing.

Kirill V. Lyadvinsky · Answer 6 · 2009-06-23T12:51:49.640

5

In your case, Return Value Optimization will take place, so the returned std::string will not be copied.

edited Jun 23 '09 at 12:51

answered Jun 23 '09 at 12:35

Kirill V. Lyadvinsky


1

That's not true. std::string is going to dynamically allocate a buffer and copy the entire string, and return value optimization will not do a lick here. However, he should still use std::string. After checking that getenv() didn't return NULL, that is! – Tom Jun 23 '09 at 12:40
1

There will really be just one allocation. I mean that the string itself would not be copied. – Kirill V. Lyadvinsky Jun 23 '09 at 12:47
3

+1: You're correct. Without the RVO, it would have to allocate two buffers and copy between them. – James Hopkin Jun 23 '09 at 14:52

score 4 · Answer 7 · answered Jun 23 '09 at 13:14


Beware when you cross module boundaries.

Then it's best to return primitive types since C++ types are not necessarily binary compatible across even different versions of the same compiler.
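One common shape for such a boundary is a C-style function that fills a caller-supplied buffer. The sketch below is an assumption about how that could look, not code from the question: `home_folder_impl`, the exported `get_home_folder` signature, and the placeholder path are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Internal C++ implementation; std::string never crosses the boundary.
static std::string home_folder_impl() {
    return "/home/example";  // placeholder value for this sketch
}

// Hypothetical exported function. Copies at most buffer_size - 1 characters
// plus a terminating '\0' into `buffer`, and returns the full length needed
// (excluding the terminator) so the caller can detect truncation.
extern "C" std::size_t get_home_folder(char* buffer, std::size_t buffer_size) {
    const std::string value = home_folder_impl();
    if (buffer != nullptr && buffer_size > 0) {
        const std::size_t room = buffer_size - 1;
        const std::size_t n = value.size() < room ? value.size() : room;
        std::memcpy(buffer, value.data(), n);
        buffer[n] = '\0';
    }
    return value.size();
}

// Usage sketch.
void example_c_api() {
    char buffer[64];
    const std::size_t needed = get_home_folder(buffer, sizeof buffer);
    assert(needed < sizeof buffer);  // the whole value fit
    assert(std::strcmp(buffer, "/home/example") == 0);

    char tiny[4];
    get_home_folder(tiny, sizeof tiny);  // truncated to 3 characters
    assert(std::strcmp(tiny, "/ho") == 0);
}
```

Only primitive types appear in the exported signature, so the caller does not need to share the library's std::string layout.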


Hans Malherbe


7

You need to do much more than just avoid C++ return types for that... you need to completely pimplize *all* C++ code to really be safe, at which point you're going to be creating a C wrapper on top of your existing codebase anyways, due to the nature of class declarations. – Tom Jun 24 '09 at 01:22

score 3 · Answer 8 · answered Jun 23 '09 at 13:16

I agree with the other posters, that you should use string.

But know that, depending on how aggressively your compiler optimizes temporaries, you will probably have some extra overhead (over using a dynamic array of chars). (Note: The good news is that in C++0x, the judicious use of rvalue references will not require compiler optimizations to buy efficiency here - and programmers will be able to make some additional performance guarantees about their code without relying on the quality of the compiler.)

In your situation, is the extra overhead worth introducing manual memory management? Most reasonable programmers would say no. If your application does end up having performance issues, the next step is to profile it, so that if you do introduce complexity, you do so only once you have good evidence that it is needed to improve overall efficiency.

Someone mentioned that Return Value optimization (RVO) is irrelevant here - I disagree.

The standard text (C++03) on this reads (12.2):

[Begin Standard Quote]

Temporaries of class type are created in various contexts: binding an rvalue to a reference (8.5.3), returning an rvalue (6.6.3), a conversion that creates an rvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5). [Note: the lifetime of exception objects is described in 15.1. ] Even when the creation of the temporary object is avoided (12.8), all the semantic restrictions must be respected as if the temporary object was created. [Example: even if the copy constructor is not called, all the semantic restrictions, such as accessibility (clause 11), shall be satisfied. ]

 [Example:  
struct X {
  X(int);
  X(const X&);
  ~X();
};

X f(X);

void g()
{
  X a(1);
  X b = f(X(2));
  a = f(a);
}

Here, an implementation might use a temporary in which to construct X(2) before passing it to f() using X’s copy-constructor; alternatively, X(2) might be constructed in the space used to hold the argument. Also, a temporary might be used to hold the result of f(X(2)) before copying it to b using X’s copy-constructor; alternatively, f()’s result might be constructed in b. On the other hand, the expression a=f(a) requires a temporary for either the argument a or the result of f(a) to avoid undesired aliasing of a. ]

[End Standard Quote]

Essentially, the text above says that you can possibly rely on RVO in initialization situations, but not in assignment situations. The reason is, when you are initializing an object, there is no way that what you are initializing it with could ever be aliased to the object itself (which is why you never do a self check in a copy constructor), but when you do an assignment, it could.

There is nothing about your code that inherently prohibits RVO - but read your compiler documentation to ensure that you can truly rely on it, if you do indeed need it.
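The initialization-versus-assignment distinction can be observed with a small counting type. This is a sketch under stated assumptions: `tracer` and `make_tracer` are hypothetical names, and the number of copies is intentionally left unchecked because it depends on the compiler and language version.

```cpp
#include <cassert>
#include <string>

// Hypothetical type that counts how often it is copied or assigned.
struct tracer {
    static int copies;
    static int assignments;

    std::string text;

    explicit tracer(const char* s) : text(s) {}
    tracer(const tracer& other) : text(other.text) { ++copies; }
    tracer& operator=(const tracer& other) {
        text = other.text;
        ++assignments;
        return *this;
    }
};

int tracer::copies = 0;
int tracer::assignments = 0;

tracer make_tracer() {
    return tracer("value");  // returns a temporary: a candidate for RVO
}

void example_rvo() {
    tracer a = make_tracer();  // initialization: the copy may be elided
    assert(tracer::assignments == 0);

    a = make_tracer();         // assignment: operator= has to run
    assert(tracer::assignments == 1);
    assert(a.text == "value");
    // tracer::copies is not asserted: it is 0 when the compiler elides the
    // copies and can be higher when it does not.
}
```

Printing `tracer::copies` with and without optimizations enabled is a quick way to find out what a particular compiler actually does.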

score 1 · Answer 9 · answered Jun 23 '09 at 12:33

I agree with duffymo. You should make an understandable working application first and then, if there is a need, attack optimization. It is at this point that you will have an idea where the major bottlenecks are and will be able to more efficiently manage your time in making a faster app.

score 1 · Answer 10 · answered Jun 23 '09 at 12:34


I agree with @duffymo. Don't optimize until you have measured; this holds doubly true for micro-optimizations. And always measure before and after you've optimized, to see if you actually changed things for the better.
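A crude way to take such a measurement is sketched below, using std::chrono from C++11. The function names and the sample path are hypothetical, and a real benchmark would need more care (warm-up, repeated runs, guarding against the optimizer removing the work); a profiler is usually the better tool.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Function under test: returns a std::string by value.
std::string make_path() {
    return std::string("/home/example/some/longer/path/segment");
}

// Calls make_path() `iterations` times and returns the elapsed time.
// The lengths are accumulated so the calls have an observable effect.
std::chrono::nanoseconds time_make_path(std::size_t iterations,
                                        std::size_t& total_length) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        total_length += make_path().size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

// Usage sketch: run once before and once after a change, then compare.
void example_timing() {
    std::size_t total = 0;
    const std::chrono::nanoseconds elapsed = time_make_path(1000, total);
    assert(total == 1000 * make_path().size());
    assert(elapsed.count() >= 0);
}
```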


JesperE


score 1 · Answer 11 · answered Jun 23 '09 at 12:39


Return the string. It's not that big of a loss in terms of performance, and it will surely make your job easier afterward.

Plus, you could always inline the function, but most optimizers will take care of that anyway.


Gab Royer


score 1 · Answer 12 · answered Jun 23 '09 at 17:14


If you pass a referenced string and you work on that string you don't need to return anything. ;)
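A small sketch of that in-place style, with a hypothetical function name chosen for illustration:

```cpp
#include <cassert>
#include <string>

// Appends a trailing separator in place. Nothing is returned because the
// caller's string is modified directly through the reference.
void ensure_trailing_slash(std::string& path) {
    if (path.empty() || path[path.size() - 1] != '/') {
        path += '/';
    }
}

// Usage sketch.
void example_in_place() {
    std::string path = "/home/example";
    ensure_trailing_slash(path);
    assert(path == "/home/example/");
    ensure_trailing_slash(path);  // already terminated: unchanged
    assert(path == "/home/example/");
}
```

The trade-off is that the call site no longer shows that `path` may change, which is the readability concern raised about non-const reference parameters in another answer.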


Partial

