My test result is here. Although some insist that my test is completely wrong, C++ was 110% slower than C ;(


Recently, Bjarne Stroustrup wrote Five Popular Myths about C++.

In his article, he implemented the same function in C and in C++.

C++ Version

string compose(const string& name, const string& domain)
{
  return name+'@'+domain;
}

C Version

char* compose(const char* name, const char* domain)
{
  char* res = malloc(strlen(name)+strlen(domain)+2); // space for strings, '@', and 0
  char* p = strcpy(res,name);
  p += strlen(name);
  *p = '@';
  strcpy(p+1,domain);
  return res;
}

Finally he mentioned:

which version is likely to be the most efficient? Yes, the C++ version, because it does not have to count the argument characters and does not use the free store (dynamic memory) for short argument strings.

Is this right? Although the C++ version is shorter than the C version, I think `operator+()` of `std::string` would do something similar to the C version.

Jason Heo
  • He explained why the C++ version is faster. Yes, it's similar, but differs in the specific ways he mentioned. – David Schwartz Dec 24 '14 at 02:06
  • `strlen` is O(n). The C++ standard requires `std::string::size` to be O(1) though. – user123 Dec 24 '14 at 02:10
  • That bit about small string optimization is implementation specific; it is not mandatory. Otherwise, I don't see what's confusing about Bjarne's explanation. – Praetorian Dec 24 '14 at 02:11
  • `name+'@'+domain` creates temporary string objects, which I would avoid for fast code. Use `+=` on a local string reserved to the right size instead. – Neil Kirk Dec 24 '14 at 02:16
  • @NeilKirk after doing `.reserve` on it with the calculated destination size – M.M Dec 24 '14 at 02:19
  • What does "110% slower" mean? – EML Mar 15 '20 at 10:05

4 Answers


In at least some cases, yes, the C++ version will be substantially faster.

In particular, some implementations of std::string include what's generally referred to as the "short string optimization" (aka "SSO"). With this, the std::string object itself includes space for a string up to some specific limit (typically around 20 characters). Strings that fit in that buffer can (and will) avoid allocating space on the heap/free store to store their data.

In theory, you could do roughly the same thing in C--but when/if you do, you have to define your own structure to hold your string (much like C++ does), and every piece of code that manipulates those string structures needs to know how they work and manipulate them the same way. C++ makes it easy to wrap that code up in an operator overload to keep the details hidden.

The bottom line is that C could theoretically keep up, but it would take enough extra work to make that happen that, in practice, programs that need to do this sort of manipulation are almost always faster in C++ than in C. Just about all that varies is how much faster they run--sometimes they're only a little faster, but especially where there's a lot of manipulation of relatively small strings, substantial differences (e.g., 2:1 or more) are pretty common.

Differences can also be quite large when you need to manipulate really large strings, where C++ gains a lot by being able to find the size in constant time, while strlen requires linear time. For strings small enough to fit entirely in L1 cache, that doesn't mean much, but if you can read one value from L1 versus reading an entire string from main memory, the difference can be huge.

Jerry Coffin

Yes, the C++ version is faster, because it does not allocate anything for small strings!

He said:

Yes, the C++ version, because it does not have to count the argument characters and does not use the free store (dynamic memory) for short argument strings.

For small strings, the implementation can take advantage of the stack automatically--most compilers do that today! For bigger strings you will get almost the same result.

But, in fact, he is still "promoting" C++ here, since the C version could use the stack as well via byte arrays.

Wagner Patriota
  • For bigger strings the C++ version will be much worse, as it has up to 4 allocations, whereas the C version has 1. Also not mentioned is the overhead of constructing temporary strings with which to call this function. – M.M Dec 24 '14 at 02:16
  • You are right about the allocations, but I am not sure about the temporary... Today's compilers will only create 1 temporary for this case! But I will update the answer a little bit with your contribution. Thanks. – Wagner Patriota Dec 24 '14 at 02:18
  • @MattMcNabb Four allocations? I can only see up to two: `name+'@'` will at most perform one allocation, and the remaining `+domain` might perform another. There are rvalue-ref overloads of `operator+`, and the result will be moved into the return value. – dyp Dec 24 '14 at 02:30
  • @dyp The second call to `operator+` will nevertheless do a new allocation in most cases, because the first allocation in `name + '@'` does not know how much space the second concatenation will need. It may overallocate just in case, but if `domain` is longer than `name`, I would guess that all implementations will do a second allocation. – cmaster - reinstate monica Dec 25 '14 at 02:42

Even though the C++ version may be faster for short and very long strings, the C version is faster for medium-length strings that require a dynamic memory allocation in C++:

  1. In the C version, there is only one allocation for the resulting string. The C++ version needs to allocate two buffers: one for the result of `name + '@'`, and another one for the result of `name + '@' + domain`. This alone gives C++ a handicap of more than 250 CPU cycles (at least on my system).

  2. While it is correct that C++ does not need to scan the input strings twice, it nevertheless copies the string `name` twice: once in the calculation of `name + '@'`, and once in the calculation of `name + '@' + domain`. It would require special treatment of string concatenations in the compiler (not in the standard library implementation) to avoid this pitfall.

  3. The C version touches less memory. This allows the CPU to utilize its caches better.

For the C++ version to be faster than the C version, you need domain strings that are at least on the order of a hundred characters, or you need very short strings plus an implementation of std::string that actually implements the short string optimization.

And if you have more than just two concatenations in your function, C++ will likely be slower even on very long strings, because the first strings will be copied several times.

Basically, you can say that in C the cost of concatenation is O(N), where N is the length of the resulting string, independent of the number of input strings. In C++, by contrast, the cost is O(n*m^2), where n is the length of a single string and m is the number of concatenations.

cmaster - reinstate monica

Just generally speaking, if you showed me a series of reasonably well-written C programs and a series of reasonably well-written C++ programs and asked me which one had more efficient string operations, my bet would be on the C programs.

And this is coming from a C++ enthusiast. But there is a strong tendency in C++ programs to use a lot more memory and/or incur a whole lot more heap allocations than necessary when working with strings, and that tends to more than outweigh the extra cache-friendly sequential passes a C program might do with some extra calls to strlen where it could have kept the size of the string around.

As a basic example, a C++ developer wanting to store a large number of strings for random access might do this:

std::vector<std::string> boatload_of_strings;

... and this either incurs far more heap allocations than necessary or uses a boatload more memory than necessary. With most modern compilers implementing small string optimizations, even storing an entry for "a" or "I" could end up using 24 bytes just for that one-character string when you only need 2 bytes. Meanwhile, the C programmer lacking such conveniences might store this:

// One contiguous buffer for all strings, realloced when
// capacity is exceeded.
char* boatload_of_strings;

// Starting position of each string.
int* string_start;

... with the nth null-terminated string accessed like this:

const char* nth_string = boatload_of_strings + string_start[n];

And that's so much more efficient: more cache-friendly, less memory used, etc. Of course it takes more time to write and is more error-prone, and if the question is about productivity working with strings rather than computational/memory efficiency, I'd quickly change my vote to C++. Of course a C++ developer could also do:

// One contiguous buffer for all strings.
vector<char> boatload_of_strings;

// Starting position of each string.
vector<int> string_start;

... and that's a very efficient way to represent a boatload of strings that need to be accessed randomly. But this subject is about tendencies, and I think most C++ developers are more likely to reach for std::string here than a vector of char.

All we can talk about is tendencies, because a C programmer can also store a string length in a struct, and a C programmer could also do something very inefficient like storing char** boatload_of_strings and allocating every single teeny string separately. We're just talking about what people tend to do, and given what I've seen people tend to do in these two languages, my bet is on C programs to generally have an efficiency edge with strings.

There is a good point that C strings not keeping track of their length can lead to more linear passes than necessary, but again, those are cache-friendly linear passes through a contiguous buffer. It'd be somewhat like arguing that a linked list which allocates every node separately from a general-purpose allocator is more efficient for push_backs than std::vector because it doesn't have to reallocate and copy a buffer in linear time. Memory efficiency trumps some extra linear passes here and there, and std::string is never going to be perfectly optimal from a memory-efficiency standpoint, especially when used to store persistent data: it either uses too much memory for really small strings (with the small string optimization), or too many heap allocations for medium-sized strings (since the small string optimization favors a really tiny buffer).

There might be some genuine cases where C++ holds an edge in practical use, but most of the time I have gotten major performance boosts replacing C++ code using std::string with plain old character buffers, not the other way around, as you have also found. It'd be rare for me to encounter a genuine case, measured and profiled before and after, where replacing competently-written C code using character buffers with, say, std::string or std::wstring results in a performance boost.

strlen

Another crucial thing to keep in mind is that strlen is often implemented very efficiently. For example, MSVC treats it as a compiler intrinsic, along with functions like memset. It doesn't treat these as regular function calls and instead generates instructions far more efficient than a hand-rolled loop that counts characters until it reaches a null terminator.

So not only is it a cache-friendly sequential loop through a contiguous buffer, but one that has been optimized to death. I have never, ever seen the use of strlen show up as a hotspot in any profiling session on any codebase whatsoever. I have definitely seen my share of std::string and QString-related hotspots in VTune.

[...] does not use the free store (dynamic memory) for short argument strings.

I don't know what type of C programs Bjarne was looking at but typically most C programs I see don't utilize heap allocations for small strings. They often use buffers on the stack like this:

char buf[256];

... which isn't very robust but definitely not going to be invoking heap allocations, or using VLAs with C99 like this:

char buf[n];

... which risks stack overflows but, again, isn't invoking needless heap allocations, or something like this:

char buf[256];
char* data = (n < 256) ? buf : malloc(n + 1);
...
if (data != buf)
    free(data);

... which is the most robust and still avoids the heap allocation in the common cases. Besides that, people have been touting that std::string is faster than your average C code for ages, long before small string optimizations, and even back when most std::string implementations used copy-on-write. And the real world results never quite matched up to those claims in this one case.

Compose Example

Okay so phew, getting to the compose example:

char* compose(const char* name, const char* domain)
{
  char* res = malloc(strlen(name)+strlen(domain)+2); // space for strings, '@', and 0
  char* p = strcpy(res,name);
  p += strlen(name);
  *p = '@';
  strcpy(p+1,domain);
  return res;
}

First of all, I don't often encounter people writing C code like this in production--a function that heap-allocates a string and returns a pointer for the client to free. Far more often, I see people do things like:

char buf[512];
sprintf(buf, "%s@%s", name, domain);

... which again isn't the safest code, but it definitely doesn't incur more heap allocations than the other, and it doesn't have to do an extra pass to determine the length of these strings, since the buffer is already pre-sized. But if we dissect the C++ version:

string compose(const string& name, const string& domain)
{
  return name+'@'+domain;
}

string::operator+ can potentially get away with one less linear pass through these two strings because they store a size, but if these strings are teeny, that's a trivial overhead--it saves pennies, and in exchange for a cost. If these strings aren't teeny, then the small string optimization doesn't help; it actually hurts by causing more memory waste, and you'd still end up with a heap allocation. The above solution is far more robust than the sprintf solution with a fixed-size buffer, but here I'm just talking about efficiency against common tendencies.

Just very generally speaking, doing an additional linear pass through contiguous data to determine a size is often cheaper than alternatives that could require more or bigger heap allocations. For example, if you do this:

int count = 0;

// count how many elements there are:
for (...)
{
    ...
    ++count;
}

// Now size the vector accordingly:
vector<int> values(count);

// Do a second pass through the same data.
for (...)
    ...

... this often tends to be more efficient than:

vector<int> values;

// Do a single pass through the data with push_backs.
for (...)
    ...

And a similar principle applies to strings. More linear passes through a string aren't necessarily more expensive if they result in less memory use, fewer heap allocations, etc.