59

I came across these two methods to concatenate strings:

Common part:

char* first= "First";
char* second = "Second";
char* both = malloc(strlen(first) + strlen(second) + 2);

Method 1:

strcpy(both, first);
strcat(both, " ");       // or space could have been part of one of the strings
strcat(both, second);

Method 2:

sprintf(both, "%s %s", first, second);

In both cases the content of both would be "First Second".

I would like to know which one is more efficient (I have to perform several concatenation operations), or if you know a better way to do it.

Peter Cordes
Xandy
  • 6
    As Michalis Giannakidis points out - there's a buffer overflow here; you need to allocate lengths plus **two** to allow for the space and the terminal null. – Jonathan Leffler Sep 05 '09 at 16:21
  • 2
    From a performance POV, the things to know are that strcat has to scan all the way along the string to find the end before it can append anything, and that sprintf has to parse the format string. Beyond that, if you want to know which is faster for your particular strings, you have to measure it. – Steve Jessop Sep 05 '09 at 16:24
  • 1
    I guess you could also consider that sprintf is a much bigger function than the simple string manipulators, so will likely evict more code from your icache, and hence is more likely to slow down some other, totally unrelated part of your program. But that kind of effect is beyond the point where you can expect to predict performance in advance – Steve Jessop Sep 05 '09 at 16:28
  • Thanks for the buffer overflow info here, I'll edit it now. Thanks for the comments too, very appreciated. – Xandy Sep 05 '09 at 16:38
  • If you have to do a lot of string concatenating, it might be worth using explicit-length strings instead of null-terminated strings. (`std::string` knows its own length, but it might not optimize as well for compile-time-constant string literals) – Peter Cordes Dec 05 '17 at 05:06
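
To illustrate the explicit-length idea from the last comment, here is a minimal C sketch. The `struct lstr` type and the `lstr_from` / `lstr_join` names are hypothetical, made up for this example and not part of any library. The point is only that once the lengths are stored, joining needs no scan for the terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-carrying string; not from any standard library. */
struct lstr {
    char  *data;  /* kept NUL-terminated so it can still be passed to C APIs */
    size_t len;   /* number of bytes before the terminator */
};

/* Wrap a C string. This is the only place strlen() is called. */
struct lstr lstr_from(const char *s)
{
    assert(s != NULL);
    struct lstr r = { NULL, strlen(s) };
    r.data = malloc(r.len + 1);
    if (r.data != NULL)
        memcpy(r.data, s, r.len + 1);   /* copies the terminator too */
    else
        r.len = 0;                      /* allocation failed */
    return r;
}

/* Join a and b with a single space, using the stored lengths. */
struct lstr lstr_join(struct lstr a, struct lstr b)
{
    assert(a.data != NULL && b.data != NULL);
    struct lstr r = { NULL, a.len + 1 + b.len };
    r.data = malloc(r.len + 1);
    if (r.data == NULL) {
        r.len = 0;
        return r;
    }
    memcpy(r.data, a.data, a.len);
    r.data[a.len] = ' ';
    memcpy(r.data + a.len + 1, b.data, b.len);
    r.data[r.len] = '\0';
    return r;
}
```

Whether this pays off depends on how often the same strings are reused; for a one-off concatenation the `strlen` in `lstr_from` costs the same as before.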

10 Answers

74

For readability, I'd go with

char * s = malloc(snprintf(NULL, 0, "%s %s", first, second) + 1);
sprintf(s, "%s %s", first, second);

If your platform supports GNU extensions, you could also use asprintf():

char * s = NULL;
asprintf(&s, "%s %s", first, second);

If you're stuck with the MS C Runtime, you have to use _scprintf() to determine the length of the resulting string:

char * s = malloc(_scprintf("%s %s", first, second) + 1);
sprintf(s, "%s %s", first, second);

The following will most likely be the fastest solution:

size_t len1 = strlen(first);
size_t len2 = strlen(second);

char * s = malloc(len1 + len2 + 2);
memcpy(s, first, len1);
s[len1] = ' ';
memcpy(s + len1 + 1, second, len2 + 1); // includes terminating null
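
If this is needed in more than one place, the same memcpy approach can be wrapped in a function with the failure cases checked. This is only a sketch: the name `concat_with_space` is made up for illustration, and the overflow guard and NULL checks are additions that the snippet above does not have.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "first second",
 * or NULL on failure. The caller owns the result and must free() it. */
char *concat_with_space(const char *first, const char *second)
{
    assert(first != NULL && second != NULL);

    size_t len1 = strlen(first);
    size_t len2 = strlen(second);

    /* Guard against size_t overflow in len1 + len2 + 2. */
    if (len1 > SIZE_MAX - 2 || len2 > SIZE_MAX - 2 - len1)
        return NULL;

    char *s = malloc(len1 + len2 + 2);
    if (s == NULL)
        return NULL;

    memcpy(s, first, len1);
    s[len1] = ' ';
    memcpy(s + len1 + 1, second, len2 + 1); /* includes terminating null */
    return s;
}
```

A call would look like `char *both = concat_with_space(first, second);` followed by a NULL check and, eventually, `free(both);`.
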
Christoph
  • 18
    I'd just like to put in a vote of disagreement for your first solution being readable. It's more compact, but is it more readable? I don't think so. I didn't downvote, though. – Imagist Sep 05 '09 at 16:20
  • 2
    It would perhaps be worth mentioning `asprintf()` which does the memory allocation for you: `char *s; int len = asprintf(&s, "%s %s", first, second);` without any fuss or muss. – Jonathan Leffler Sep 05 '09 at 16:27
  • 1
    @Jonathan: `asprintf()` isn't part of the C stdlib and the MS compiler doesn't support it – Christoph Sep 05 '09 at 16:33
  • @PS: on second thought, you might have to use `_scprintf()` to determine the size of the buffer anyway as I'm not sure if `_snprintf()` from the MS CRT behaves as required by the standard when the size argument is `0` – Christoph Sep 05 '09 at 16:41
  • 1
    @Christoph: yes, I know asprintf() is not standard; that's why I suggested mention it rather than proposing it as 'the answer'. Perhaps I should have put in the relevant caveats in my original comment, though. (Man page at: http://linux.die.net/man/3/asprintf, amongst other places.) – Jonathan Leffler Sep 05 '09 at 17:00
  • 1
    For shorter strings memory allocation will be the main bottleneck. Also, discussion of different XXprintf functions is irrelevant, because this method is obviously slowest. – noop Feb 07 '12 at 19:40
  • The string "snprintf(NULL, 0, "%s %s", first, second)" is repeated, isn't that against DRY (Don't Repeat Yourself)? – Arun Mar 14 '13 at 22:19
  • Your solution is far away from fastest. Doing 2 strlen just to get the length of the strings is 2 unnecessary string traversals. Instead copy and test for string ending in one go. Try to copy in 4 (or 8 bytes in 64bits) chunks as possible while testing for the end character. – Manuel Astudillo Oct 15 '20 at 08:33
23

Don't worry about efficiency: make your code readable and maintainable. I doubt the difference between these methods is going to matter in your program.

Ned Batchelder
  • 6
    I am with Ned. It seems like you are performing premature optimisation, which is the root of all evil. Get your program running, then profile it, then optimise. Until then you are just wasting time IMHO. – freespace Sep 05 '09 at 16:05
  • 1
    Fair enough, thanks for your comments. I think that doing some things in a better manner from the beginning can save time later on, and this one may be one. Thanks once more. – Xandy Sep 05 '09 at 16:24
  • @Xandy: you're right that doing things better to start with is good, but in practice faster code usually isn't "better" to debug and maintain. I always end up spending way more time debugging and refactoring my code, and adding more features, than I spend speeding it up. So personally I'd either use s(n)printf (or even a string library that handles allocation) to optimise simplicity of the source code, or if optimising for speed I'd do an absolute bare minimum of memory access (and hence not use strcat when I already have the lengths). – Steve Jessop Sep 05 '09 at 16:36
  • 19
    @Ned: That doesn't answer the question! He asked which way is more efficient, not if he should worry about efficiency or not. – Wadih M. Sep 05 '09 at 16:42
  • 1
    @Wadih: it's an answer that points out an unexamined assumption and thus makes the OP's question irrelevant. It's about the higher goal of efficiency in development. Both make it an appropriate answer. http://catb.org/~esr/faqs/smart-questions.html#goal – outis Sep 05 '09 at 19:02
  • 1
    @outis: Concatenation-intensive tasks (string analysis, text manipulation, text generators) must take into account the performance of the different possible implementations, and this is a totally valid and smart question. Educating oneself on the different possible approaches to a problem before beginning implementation is the smartest thing to do. Aesthetics and maintainability are only TWO factors to consider. Performance is another, and there are plenty of cases where it is the dominant one. – Wadih M. Sep 06 '09 at 03:07
  • Bottom line - nobody needs the permission of StackOverflow to optimise their code, but when StackOverflowers say "optimising this will have no measurable effect on the performance of your program, and will make your code harder to maintain", they're right more often than not. – Steve Jessop Sep 06 '09 at 14:28
  • I agree, I tend to write readable code yet efficient, but this time efficiency when handling strings may be determinative. Using memcpy() (over strcat()/strcpy() and sprintf()) seems to be the fastest way across several implementations but it's not as easy to maintain as other methods may be. Anyway a profiler will give me more info when the time comes. Thank you for all your outstanding support and time, it's much appreciated. – Xandy Sep 07 '09 at 21:29
  • 2
    Usage of such programming language actually means that you DO care about efficiency. If you don't, why use unsafe feature-limited language with manual memory management? Also, profiling is overrated. Either you do understand your goals and can predict possible performance bottlenecks, or you don't have a clue, even with help of a profiler. – noop Feb 07 '12 at 19:31
  • 2
    I agree that it might be a case of premature optimization, but it is important to recognize (as the OP did) that it might eventually turn out to be a case for optimization. If it does turn out to be a bottleneck and such string concatenations are done all over the program, then it will be a problem. To mitigate that risk AND of course for better readability, I would factor this into a function, say strConstructConcat(), put either Method 1 or Method 2 into it, and be done with it until profiling shows it to be a bottleneck. – Arun Mar 14 '13 at 22:17
  • 4
    -1 does not answer the question ; also, from the question you can't be able to determine if the optimization is premature or not. +1 for @Arun there for actually proposing factoring it out into a function for more flexibility (which is something which actually could help the OP) – griffin Jul 31 '13 at 14:23
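
As a sketch of the factoring suggested in the comments above, the function below hides the choice of method behind one call. It borrows the function name from Arun's comment and, purely as an assumption for this example, uses the snprintf measure-then-format approach inside. The body could later be swapped for strcpy/strcat or memcpy without touching any caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a newly allocated "first second", or NULL on failure.
 * The caller must free() the result. The implementation is a
 * placeholder that can be replaced if profiling says so. */
char *strConstructConcat(const char *first, const char *second)
{
    assert(first != NULL && second != NULL);

    /* First pass: ask snprintf how many characters it would write. */
    int n = snprintf(NULL, 0, "%s %s", first, second);
    if (n < 0)
        return NULL;

    char *both = malloc((size_t)n + 1);
    if (both == NULL)
        return NULL;

    /* Second pass: format into the correctly sized buffer. */
    snprintf(both, (size_t)n + 1, "%s %s", first, second);
    return both;
}
```

Callers only ever see `strConstructConcat(first, second)`, so a later change of method stays local to this one function.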
19

Here's some madness for you, I actually went and measured it. Bloody hell, imagine that. I think I got some meaningful results.

I used a dual core P4, running Windows, using mingw gcc 4.4, building with "gcc foo.c -o foo.exe -std=c99 -Wall -O2".

I tested method 1 and method 2 from the original post. Initially kept the malloc outside the benchmark loop. Method 1 was 48 times faster than method 2. Bizarrely, removing -O2 from the build command made the resulting exe 30% faster (haven't investigated why yet).

Then I added a malloc and free inside the loop. That slowed down method 1 by a factor of 4.4. Method 2 slowed down by a factor of 1.1.

So, malloc + strlen + free DO NOT dominate the profile enough to make avoiding sprintf worth while.

Here's the code I used (except that the loops were originally written with < rather than !=, which broke the HTML rendering of this post):

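#include <stdio.h>   /* sprintf */
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strcpy, strcat, strlen */

void a(char *first, char *second, char *both)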
{
    for (int i = 0; i != 1000000 * 48; i++)
    {
        strcpy(both, first);
        strcat(both, " ");
        strcat(both, second);
    }
}

void b(char *first, char *second, char *both)
{
    for (int i = 0; i != 1000000 * 1; i++)
        sprintf(both, "%s %s", first, second);
}

int main(void)
{
    char* first= "First";
    char* second = "Second";
    char* both = (char*) malloc((strlen(first) + strlen(second) + 2) * sizeof(char));

    // Takes 3.7 sec with optimisations, 2.7 sec WITHOUT optimisations!
    a(first, second, both);

    // Takes 3.7 sec with or without optimisations
    //b(first, second, both);

    return 0;
}
bk1e
Andrew Bainbridge
  • Thanks for the benchmarking! It's really appreciated! Regarding the time spent with and without optimizations in the first case, -O2 may perform some optimizations which result in slower code in favour of smaller code (http://www.linuxjournal.com/article/7269). Thanks for your answer and time. – Xandy Sep 05 '09 at 20:31
  • 1
    Having just looked at the generated instructions, the -O2 code is bigger as well as slower! The problem looks to be that gcc is using the "repne scasb" instruction to find the length of the string. I suspect that that instruction is very slow on modern hardware. I'm going to find a gcc expert to ask about this. – Andrew Bainbridge Sep 05 '09 at 23:02
  • 1
    @Andrew Bainbridge, a little bit OT, but you can use &lt; and &gt; for < and > – quinmars Sep 07 '09 at 14:59
  • 1
    @Andrew Bainbridge: You can also indent by 4 spaces to format as code. Then you don't have to escape < and > and you also get syntax highlighting. – bk1e Sep 07 '09 at 20:18
  • 3
    Try using `-march=generic`. mingw defaults to i586 which is really really old, outdated and makes assumptions that will fit – LiraNuna Sep 07 '09 at 20:22
  • @LiraNuna: My gcc claims that -march=generic doesn't work but -mtune=generic does, and indeed fixes the problem (the repne is replaced with call _strlen). The resulting code is faster than the unoptimised! I had previously tried -march=pentium4, which didn't help. – Andrew Bainbridge Sep 07 '09 at 21:25
  • Gcc no longer inlines `repne scasb` for `strlen`. You are correct that was a bad choice for performance. Calling a library function that can use SSE2 or AVX2 is much better, especially for medium to long strings. It will sometimes still inline `repe cmpsb` for comparing against string literals, which is maybe ok when they're short vs. the overhead of a function call. – Peter Cordes Dec 05 '17 at 04:18
6
size_t lf = strlen(first);
size_t ls = strlen(second);

char *both = (char*) malloc((lf + ls + 2) * sizeof(char));

strcpy(both, first);

both[lf] = ' ';
strcpy(&both[lf+1], second);
  • 1
    That strcat should be a second strcpy - this is undefined behavior as written. – Steve Jessop Sep 05 '09 at 16:18
  • 2
    In fact, one could use memcpy, since the length are already calculated :) – Filip Navara Sep 05 '09 at 16:22
  • But, as @onebyone points out, the strcat() is not OK this time, because the strcat() starts tracking after the space, and you don't know what characters are in the string at that point. – Jonathan Leffler Sep 05 '09 at 16:23
  • @Filip: actually, it's plausible that strcpy could be faster than memcpy. To use memcpy, you need to keep ls hanging around, which means using more registers, which could perhaps cost you an extra stack store before the call to malloc. The naive implementations of memcpy and strcpy have very similar inner loops: memcpy decrements a length and checks for 0, whereas strcpy compares the byte copied against 0. So it's all down to how ferociously optimised those two functions are in your implementation, which you'd have to investigate on a case-by-case basis :-) – Steve Jessop Sep 05 '09 at 16:59
  • 3
    @onebyone: optimized versions of `memcpy()` will copy multiple bytes per iteration step; `strcpy()` may also do this, but it still has to examine every single byte to check for the terminating 0; therefore I'd expect `memcpy()` to be faster – Christoph Sep 05 '09 at 17:39
  • Sure, I'd expect memcpy to be faster too for long strings. I just find it entertaining to speculate what might cause the counter-intuitive thing to happen. For instance, on some architectures (including MMX, I think) there are packed comparison ops, so strcpy could check all the bytes in a copied block in one instruction... – Steve Jessop Sep 06 '09 at 14:25
  • @SteveJessop: For the record, `memcpy` is better. SIMD-optimized `strcpy` is quite good, but it needs to use aligned loads to [avoid reading from the next page (and segfaulting)](https://stackoverflow.com/questions/37800739/is-it-safe-to-read-past-the-end-of-a-buffer-within-the-same-page-on-x86-and-x64) if the terminating `0` is in the middle of a SIMD vector. Knowing the length up-front makes the startup code in a `memcpy` library function more efficient, because if the block is long it can jump right into vector copying with an aligned *store* address instead of aligned loads. – Peter Cordes Dec 05 '17 at 04:30
  • (Things get tricky when the src and dst are misaligned relative to each other, so you have to choose between aligned loads or aligned stores). And anyway, SIMD packed-compare against `0` is more expensive than comparing against an end-pointer with integer ops. I think a modern x86 can still manage one vector store per clock for a well-written `strcpy` loop, but it would take more front-end bandwidth to issue those instructions. Also, the loop-exit condition is ready sooner for memcpy (out-of-order execution of the loop overhead), so branch mispredict of the loop exit resolves faster. – Peter Cordes Dec 05 '17 at 04:34
2

They should be pretty much the same. The difference isn't going to matter. I would go with sprintf since it requires less code.

Jay Conrod
2

The difference is unlikely to matter:

  • If your strings are small, the malloc will drown out the string concatenations.
  • If your strings are large, the time spent copying the data will drown out the differences between strcat / sprintf.

As other posters have mentioned, this is a premature optimization. Concentrate on algorithm design, and only come back to this if profiling shows it to be a performance problem.

That said... I suspect method 1 will be faster. There is some---admittedly small---overhead to parse the sprintf format-string. And strcat is more likely "inline-able".

ijprest
  • The `strcat` version scans the full length of the `first` string four times, whereas the `sprintf` version only does so twice. So when the `first` string is very very long, the `strcat` version will eventually end up slower. – caf Sep 07 '09 at 01:42
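
The repeated scanning described in the comment above comes from `strcat` having to find the end of the destination on every call. A common way around it is to keep a pointer to the current end of the output. The sketch below assumes two hypothetical helpers, `copy_and_advance` (similar in spirit to POSIX `stpcpy`) and `join_into`; neither name comes from the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src, including its terminator, to dst and
 * return a pointer to the terminator just written. */
char *copy_and_advance(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    memcpy(dst, src, n + 1);
    return dst + n;
}

/* Build "first second" in a caller-supplied buffer that holds at least
 * strlen(first) + strlen(second) + 2 bytes. */
void join_into(char *both, const char *first, const char *second)
{
    char *end = copy_and_advance(both, first);
    *end++ = ' ';                   /* overwrite the terminator with the space */
    copy_and_advance(end, second);  /* writes second plus a new terminator */
}
```

Each append starts where the previous one stopped, so the part of the buffer that is already written is never rescanned, however long `first` is.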
1

sprintf() is designed to handle far more than just strings, whereas strcat() is a specialist. But I suspect that you are sweating the small stuff. C strings are fundamentally inefficient in ways that make the differences between these two proposed methods insignificant. Read "Back to Basics" by Joel Spolsky for the gory details.

This is an instance where C++ generally performs better than C. For heavy weight string handling using std::string is likely to be more efficient and certainly safer.

[edit]

[2nd edit]Corrected code (too many iterations in C string implementation), timings, and conclusion change accordingly

I was surprised at Andrew Bainbridge's comment that std::string was slower, but he did not post complete code for this test case. I modified his (automating the timing) and added a std::string test. The test was on VC++ 2008 (native code) with default "Release" options (i.e. optimised), Athlon dual core, 2.6GHz. Results:

C string handling = 0.023000 seconds
sprintf           = 0.313000 seconds
std::string       = 0.500000 seconds

So here strcat() is faster by far (your mileage may vary depending on compiler and options), despite the inherent inefficiency of the C string convention, which supports my original suggestion that sprintf() carries a lot of baggage not required for this purpose. It remains by far the least readable and safe, however, so when performance is not critical it has little merit IMO.

I also tested a std::stringstream implementation, which was far slower again, but for complex string formatting still has merit.

Corrected code follows:

#include <ctime>
#include <cstdio>
#include <cstdlib>   // malloc
#include <cstring>
#include <string>

void a(char *first, char *second, char *both)
{
    for (int i = 0; i != 1000000; i++)
    {
        strcpy(both, first);
        strcat(both, " ");
        strcat(both, second);
    }
}

void b(char *first, char *second, char *both)
{
    for (int i = 0; i != 1000000; i++)
        sprintf(both, "%s %s", first, second);
}

void c(char *first, char *second, char *both)
{
    std::string first_s(first) ;
    std::string second_s(second) ;
    std::string both_s(second) ;

    for (int i = 0; i != 1000000; i++)
        both_s = first_s + " " + second_s ;
}

int main(void)
{
    char* first= "First";
    char* second = "Second";
    char* both = (char*) malloc((strlen(first) + strlen(second) + 2) * sizeof(char));
    clock_t start ;

    start = clock() ;
    a(first, second, both);
    printf( "C string handling = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;

    start = clock() ;
    b(first, second, both);
    printf( "sprintf           = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;

    start = clock() ;
    c(first, second, both);
    printf( "std::string       = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;

    return 0;
}
Clifford
  • A quick modification of my test (posted in a separate answer) revealed that converting method 1, with the malloc and free, into C++ using std::string was less than half the speed of the C version. The body of the loop was just "both = first + std::string(" ") + second;" However, the C++ is better in all kinds of other ways. – Andrew Bainbridge Sep 05 '09 at 18:37
  • Ah, reading the question again, I see how sprintf() would be faster than *two* strcat() calls, for the reasons mentioned in Joel's article. I am surprised that a std::string implementation was slower, but it goes to show you have to measure if you need to know! – Clifford Sep 05 '09 at 20:51
  • Did you notice that function a goes around its loop 48 times more than function b or function c? That was my dumb way of demonstrating the performance multiple. Posting the actual timings like you did is much more sensible. The timings I got on mingw gcc 4.4 (with the 48 times multiple removed) were: C string handling = 0.093000 seconds; sprintf = 0.266000 seconds; std::string = 0.766000 seconds. And for Visual Studio 2005 (haven't got 2008 unfortunately): C string handling = 0.047000 seconds; sprintf = 0.343000 seconds; std::string = 0.485000 seconds. – Andrew Bainbridge Sep 05 '09 at 22:36
  • Here are the timings (1000000 loop times for all) in a Core 2 Duo 2.0 GHz (all of them compiled without optimizations): Small strings: GCC 4.4: C string handling = 0.093 secs., sprintf = 0.328 secs, std::string = 1.560 secs. VC++ 2008: C string handling = 0.062 secs., sprintf = 0.296 secs., std::string = 1.498 secs. Intel C++ Compiler: C string handling = 0.109 secs. sprintf = 0.281 secs. std::string = 0.249 secs. Interesting results those of Intel's. – Xandy Sep 05 '09 at 23:27
  • Larger strings (120 and 140 characters each) and equal loops (1000000), all of them compiled from command line without optimizations (g++, cl and icl strings.cpp): GCC 4.4: C string handling = 0.250 secs., sprintf = 2.355 secs., std::string = 1.779 secs.; VC++ 2008: C string handling = 0.280 secs., sprintf = 2.216 secs., std::string = 4.836 secs.; Intel C++ Compiler: C string handling = 0.748 secs., sprintf = 2.465 secs., std::string = 3.214 secs. By the way, very interesting the article by Joel Spolsky. – Xandy Sep 05 '09 at 23:29
  • OK, let's get some perspective. The results are variable, but even the worst example had one million operations in less than 5 seconds. That's 5 microseconds per operation. Fast enough is good enough. There are situations where this may be critical; most often it is not. Hence my suggestion about sweating the small stuff. If you are moving a lot of strings and need speed, measure it. Otherwise use what is simplest and safest. – Clifford Sep 06 '09 at 11:31
  • I might try an implementation using std::ostringstream for kicks and giggles. This is arguably more analogous to sprintf()'s behaviour. – Clifford Sep 07 '09 at 08:01
  • Sorry Andrew, I only just realised what you meant about the x48 loop! I have fixed the code and changed the timings and conclusion. Interesting how much worse GNU's std::string seems to be than VC++'s. – Clifford Sep 07 '09 at 20:07
  • In C the fastest way across several function implementations seems to be memcpy() (over strcpy()/strcat() and sprintf()). If using C++, std::stringstream can be another way to do it as Clifford points out (using redirectors << and then return str() for example). – Xandy Sep 07 '09 at 22:09
  • @Xandy: this benchmark is very bogus because gcc inlines `a()` into `main()` where it can see the compile-time constant strings and inline the hell out of `strcpy` / `strcat`. For example, gcc4.4 (https://godbolt.org/g/Fsvh9D) implements `strcpy(both, first)` with `mov DWORD [rbx], 0x73726946` (4 ASCII characters as immediate data for a store instruction), then another 2-byte store. It does call the actual `strlen` library function twice, but everything else is just very cheap mov-immediate and pointer addition. – Peter Cordes Dec 05 '17 at 04:51
  • By contrast, the `std::string` version calls functions for everything. It's using the copy-constructor, so I think it's actually deleting / allocating the storage for the string every iteration. I think the issue isn't that gcc is slow with `std::string`, it's that it optimizes *really well* for the C string case when you give it compile-time constant strings, so it makes everything else look bad. Of course, if you already need the string lengths for malloc, then memcpy and array assignment (like in the top answer) are much better than calling strcpy / strcat. – Peter Cordes Dec 05 '17 at 04:55
  • @PeterCordes : Eight years on, I am not sure it matters, but declaring the `first` and `second` to be `volatile` should fix your concerns I think. I have not changed the answer because that may change the results and I have no means of re-running it on the platform and compiler I used then. If I were looking at this again, I'd perhaps publish the results without optimisation, as for a general comparison that is probably better as "special cases" that may not apply generally would not be optimised away. – Clifford Dec 06 '17 at 10:01
  • @Clifford: Yeah, `volatile` might help, but IDK if it's the best choice. Designing microbenchmarks is hard; you usually have to look at the asm to see if it's realistic for a real use-case. Another option would be to use global `char*` variables, so `main` can't assume they have their original values unless you use whole-program optimization. (Any static initializer could have modified them). But `main` could still hoist `strlen` of the 2 inputs out of the benchmark loop. Another thing that could be useful is `__attribute__((noinline))`. – Peter Cordes Dec 06 '17 at 10:06
  • Disabling optimization is never useful for benchmarks like this: it would make C++ `std::string` run like crap because it depends on lots of layers inlining and optimizing away to nothing. – Peter Cordes Dec 06 '17 at 10:06
  • @PeterCordes : I think in a "complexity comparison" rather than an "is it fast enough for my application" test, un-optimised remains fair - the question is "which is faster", not "how fast is it". The optimiser can spot special cases that may not apply generally and skew the comparison. My comment of *Sep 6 '09 at 11:31* applies when you have specific performance requirements for specific code. Moreover C++ is of interest to me perhaps, but this is strictly a C question, and with hindsight and greater SO experience I should probably not have included it. – Clifford Dec 06 '17 at 10:11
  • You're implying that the fastest version with `-O0` tells you which will be faster with `-O3`. [That is not how optimization works](https://stackoverflow.com/questions/32000917/c-loop-optimization-help-for-final-assignment/32001196#32001196), especially not when comparing C++ template / container stuff against calls to non-inline library functions. For a different microbenchmark where `std::string` was fastest with `-O3`, compiling with `-O0` could easily make it slower than calling still-optimized library functions. TL:DR: some things optimize a lot more than others (even in C). – Peter Cordes Dec 06 '17 at 10:15
  • @PeterCordes : I am not sure I am implying anything. I edited my comment probably while you were responding - the C++ result was of academic interest to me at the time, but the question is clearly C specific. Caution is required in all these tests as you suggest - possibly both tests should be performed in case one method is faster or slower in particular circumstances. We may be in _violent agreement_ here. I think you can say that one method is _typically_ or _potentially_ faster than another and perhaps no more. – Clifford Dec 06 '17 at 10:22
  • Your comment still says that un-optimized is a reasonable way to find out "which is faster". That's not true in general, and I'd be cautious even in this case for one `sprintf` vs. multiple simpler library functions. The right way to make a microbenchmark is to block any optimizations you don't want the compiler to find, not to disable optimizations everywhere. – Peter Cordes Dec 06 '17 at 10:30
  • The only thing I'm sure about from a performance perspective is that if you care about perf, use the `memcpy` / `s[len1] = ' ';` / `memcpy` sequence from Christoph's answer. As an expert in perf tuning and looking at compiler output, it's obvious to me that it will compile to asm at least as good as anything anyone else has suggested. (Not counting changing the problem by using explicit-length strings like `std::string` in the first place, especially if you have long strings.) – Peter Cordes Dec 06 '17 at 10:34
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160595/discussion-between-clifford-and-peter-cordes). – Clifford Dec 06 '17 at 10:51
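
The benchmarking concerns raised in the comments above (inlining and compile-time constant strings) can be addressed by blocking just those optimizations instead of disabling optimization altogether. The following is a minimal sketch under stated assumptions: the names `concat_strcat`, `concat_sprintf`, `time_concat` and the `NOINLINE` macro are made up for this example, and `__attribute__((noinline))` is a GCC/Clang extension, so the macro expands to nothing on other compilers. It has not been run on the compilers used in this thread and makes no claim about which method wins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))  /* GCC/Clang extension */
#else
#define NOINLINE                            /* assumption: no equivalent used */
#endif

/* Method 1 from the question. */
NOINLINE void concat_strcat(const char *first, const char *second, char *both)
{
    strcpy(both, first);
    strcat(both, " ");
    strcat(both, second);
}

/* Method 2 from the question. */
NOINLINE void concat_sprintf(const char *first, const char *second, char *both)
{
    sprintf(both, "%s %s", first, second);
}

typedef void concat_fn(const char *, const char *, char *);

/* Run fn `iterations` times and return the elapsed CPU time in seconds. */
double time_concat(concat_fn *fn, const char *first, const char *second,
                   char *both, long iterations)
{
    assert(fn != NULL && iterations > 0);

    /* Volatile pointers are reloaded on every iteration, so the compiler
     * cannot treat the inputs as known string literals. */
    const char *volatile vfirst = first;
    const char *volatile vsecond = second;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        fn(vfirst, vsecond, both);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A call such as `time_concat(concat_strcat, "First", "Second", buf, 1000000)` with a sufficiently large `buf` gives one timing. As the comments note, it is still worth looking at the generated asm to check that the loop does what a real use case would do.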
0

I don't know that in case two there's any real concatenation done. Printing them back to back doesn't constitute concatenation.

Tell me though, which would be faster:

1) a) copy string A to new buffer b) copy string B to buffer c) copy buffer to output buffer

or

2) a) copy string A to output buffer b) copy string B to output buffer

San Jacinto
  • The OP is proposing `sprintf(dst, "%s %s", first, second)` to concat in memory, not regular printf. For what you're suggesting, probably one call to `printf` would be the fastest way, because `fputs` / `putchar` / `fputs` has to lock / unlock `stdout` (or whatever output stream) multiple times. – Peter Cordes Dec 05 '17 at 04:59
0
  • strcpy and strcat are much simpler operations than sprintf, which needs to parse the format string
  • strcpy and strcat are small, so compilers will generally inline them, saving the overhead of an extra function call. For example, in llvm strcat will be inlined using a strlen to find the copy starting position, followed by a simple store instruction
Tong Zhou
-1

Neither is terribly efficient since both methods have to calculate the string length or scan it each time. Instead, since you calculate the strlen()s of the individual strings anyway, put them in variables and then just strncpy() twice.

dajobe