
Consider first that the total amount of data stored in the output string will almost certainly be small, so I doubt any of these approaches has a noticeable effect on performance. My primary goal is to find a way to concatenate a series of const char*'s of unknown size that doesn't look terrible while also keeping efficiency in mind. Below are the results of my search:

Method 1:

std::string str = std::string(array1) + array2 + array3;

Method 2:

std::string str(array1);
str += array2;
str += array3;

I decided on the first method as it is short and concise. If I'm not mistaken, both methods will invoke the same series of operations. Without optimization, the compiler would first create a temporary string and internally allocate some amount of space for its buffer >= sizeof(array1). If that buffer is sufficiently large, the additional + operations will not require any new allocations. Finally, if move semantics are supported, the buffers of the temporary and the named str are swapped.

Are there any other ways to perform such an operation that also look nice and don't incur terrible overhead?

vmrob
  • I *think* the `+=` approach doesn't require a temporary, while the `+` approach requires an anonymous temporary `string`. In practice, with RVO and the optimizations `string` makes under the hood to share storage, I doubt you'd notice a meaningful difference. – Joe Z Dec 18 '13 at 03:30
  • Ok, I measured the two approaches with G++ 4.8 on my Linux box, and the `+` approach is about 10% slower. It's 10% slower on top of being very fast otherwise. In other words, I wouldn't worry about it. – Joe Z Dec 18 '13 at 03:37
  • The most efficient would probably be to loop through the char arrays, figure out individual lengths, then construct a `string` with initial size set to `total_length + 1`. Next, use the `string::append` overload that takes a `char *` and length to concat the strings together. – Praetorian Dec 18 '13 at 03:38
  • You might also consider [reserving](http://en.cppreference.com/w/cpp/string/basic_string/reserve) a reasonable amount of memory to reduce the number of reallocations and memory copying. – Cornstalks Dec 18 '13 at 03:40

1 Answer


Remember that sizeof(array) returns the actual size of its operand (in bytes, including the terminating null) only if the operand has been declared as an array of known size, and you wrote 'series of const char*'s of unknown size'. For a const char*, sizeof returns the size of the pointer itself. So, assuming you want a universal solution, strlen() should be used instead.
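To make the difference concrete, here is a minimal sketch. The function names are hypothetical and exist only for illustration; the point is what sizeof and strlen() each measure.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helpers, for illustration only.

// For a true array, sizeof yields the full storage size in bytes,
// including the terminating '\0'.
inline std::size_t array_storage_size() {
    const char array1[] = "abc";
    return sizeof(array1);  // three characters plus '\0'
}

// Once the data is only reachable through a pointer,
// sizeof measures the pointer, not the text it points to.
inline std::size_t pointer_size(const char* s) {
    return sizeof(s);  // same as sizeof(const char*)
}

// strlen() scans for the terminating '\0' and works for both cases.
inline std::size_t text_length(const char* s) {
    return std::strlen(s);
}
```

Here array_storage_size() gives 4, text_length("abc") gives 3, and pointer_size() gives the same value no matter how long the string is.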

I don't think you can avoid all additional operations. With many concatenations, the best solution is to allocate a buffer large enough to store all the concatenated strings.

We can easily deduce that the best-suited overload of append() in this case is:

string& append (const char* s, size_t n);

Why? Because the reference says: 'If s does not point to an array long enough (...), it causes undefined behavior'. So we can assume that no additional checks take place internally (in particular, no additional strlen() calls). That is good: as long as you are sure the values passed to append() are correct, you avoid unnecessary overhead.
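As a small illustration of that overload, the sketch below (append_counted is a hypothetical name) copies exactly n characters and never looks for a terminating null, so the caller is responsible for n not exceeding the array length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Copies exactly n characters starting at s; the caller must guarantee
// that s points to at least n readable characters.
inline std::string append_counted(const char* s, std::size_t n) {
    std::string out;
    out.append(s, n);  // no strlen() needed: the length is given
    return out;
}
```

For example, append_counted("hello world", 5) yields "hello". Because the length is explicit, embedded null characters are copied like any other character.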

Now, the actual concatenation can be done like this:

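// Needs <cstring> for std::strlen and <string> for std::string.
const std::size_t len_1 = std::strlen(array_1);
const std::size_t len_2 = std::strlen(array_2);
const std::size_t len_3 = std::strlen(array_3);

std::string target_string;

// Preallocate enough space for all arrays, so at most one allocation takes place.
target_string.reserve(len_1 + len_2 + len_3);

// Append each array with its known length; no further strlen() calls are needed.
target_string.append(array_1, len_1);
target_string.append(array_2, len_2);
target_string.append(array_3, len_3);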

I do not know if this solution 'looks good' in your opinion, but it's definitely clear and is optimized for this use case.
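If the call site should stay short, the same reserve-then-append idea can be wrapped in a small function. This is only a sketch: the name concat and the std::initializer_list interface are my assumptions, it requires C++11, and every pointer passed in must be non-null and null-terminated.

```cpp
#include <cstddef>
#include <cstring>
#include <initializer_list>
#include <string>

// Hypothetical helper; not part of the standard library.
inline std::string concat(std::initializer_list<const char*> parts) {
    // First pass: measure every piece so the buffer can be sized once.
    std::size_t total = 0;
    for (const char* p : parts) {
        total += std::strlen(p);
    }

    std::string result;
    result.reserve(total);

    // Second pass: append each piece. append(p) measures p again;
    // caching the lengths would avoid that at the cost of extra storage.
    for (const char* p : parts) {
        result.append(p);
    }
    return result;
}
```

The call then reads almost like Method 1:

```cpp
std::string str = concat({array1, array2, array3});
```

The trade-off is that each piece is scanned twice, once to measure and once to copy. For a handful of short strings that is likely cheaper than repeated reallocation, but it is worth measuring if it matters.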

Mateusz Grzejek
  • It looks like the link others provided agrees with this assessment. I'm actually pretty surprised to find how inefficient operator+ is. I think to help me in my quest to write "pretty" code, I'll have to encapsulate that solution. – vmrob Dec 18 '13 at 07:38