I took the liberty of expanding your test program out a little bit. In particular, I added code to have it generate its data internally instead of depending on an outside file, added timing code to isolate the string handling in question, and had it do both the C++ and C versions of the string manipulation in the same run, so it immediately produced results I could compare. That gave me the following code:
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
#include <cstring>
#include <ctime>

// C version: allocate one buffer and copy name, '@', and domain into it.
// The caller must free() the result.
char* compose_c(const char* name, const char* domain)
{
    char* res = (char*) malloc(strlen(name)+strlen(domain)+2);
    char* p = strcpy(res,name);
    p += strlen(name);
    *p = '@';
    strcpy(p+1,domain);
    return res;
}

// Build a string of `size` random decimal digits.
std::string rand_string(int size){
    std::string ret;
    for (int i = 0; i < size; i++)
        ret.push_back(rand() % 10 + '0');
    return ret;
}

// Randomly generated input data, replacing the external file.
struct address {
    std::string email, domain;
    address() : email(rand_string(5)), domain(rand_string(4) + ".com") { }
};

// C++ version: composes the address in its constructor.
struct composed {
    std::string addr;
    composed(address const &a) : addr(a.email + "@" + a.domain) {}
};

void report(clock_t d, std::string const &label){
    std::cout << double(d) / CLOCKS_PER_SEC << " seconds for " << label << "\n";
}

int main() {
    static const int NUM = 1024 * 1024;

    std::vector<address> addresses(NUM);

    // Time the C++ version, including creating and destroying the vector.
    clock_t start = clock();
    {
        std::vector<composed> c{ addresses.begin(), addresses.end() };
    }
    report(clock() - start, "C++");

    // Time the C version; the result vector is set up before the clock starts.
    std::vector<char *> c_results(addresses.size());

    clock_t start_c = clock();
    for (std::size_t i = 0; i < addresses.size(); i++)
        c_results[i] = compose_c(addresses[i].email.c_str(), addresses[i].domain.c_str());
    for (char *c : c_results)
        free(c);
    report(clock() - start_c, "C");
}
I then compiled that with VC++ 2013, x64, using the flags -O2b2 -GL -EHsc. When I ran that, the result I got was:
0.071 seconds for C++
0.12 seconds for C
While there's some variation from run to run, those are fairly representative results--the C code takes almost (but not quite) twice as long as the C++ code.
Note that this is despite the fact that I've actually given the C version a bit of an unfair advantage. The timing for the C++ code includes not only the time to do the string manipulation, but also the time to create and destroy the vector to hold the results. For the C code, I prepare the vector ahead of time, then time only the code to create and destroy the strings themselves.
To ensure this result wasn't a fluke, I also tried changing some of the independent variables. For example, increasing the number of address strings we compose by a factor of 10 increased the overall time, but had almost no effect on the ratio:
0.714 seconds for C++
1.206 seconds for C
Likewise, changing the order so the C code ran first and the C++ code ran second had no discernible effect.
I suppose I should add: it's true that as it stands, this code doesn't actually use the compose_cpp function as the original did, choosing to incorporate the functionality into the constructor for composed instead. For the sake of completeness, I did write a version that used compose_cpp, like this:
std::vector<std::string> composed;
composed.reserve(NUM);
clock_t start = clock();
for (auto const &a : addresses)
    composed.push_back(compose_cpp(a.email, a.domain));
This actually improves the timing a little bit (compared with the 0.714 seconds above for the larger data set), but I'd guess that's mostly due to moving the creation of the vector itself out of the timed code, and it doesn't make a big enough difference to care much about:
0.631 seconds for C++
1.21 seconds for C
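The definition of compose_cpp comes from the original question and isn't reproduced here. For anyone following along, here is a minimal sketch, assuming it simply mirrors what the constructor for composed does. The signature and body are an assumption, not necessarily what the original used:

```cpp
#include <string>

// Assumed stand-in for the question's compose_cpp: joins name and domain
// with '@', just as the constructor for composed does above.
std::string compose_cpp(std::string const &name, std::string const &domain) {
    return name + "@" + domain;
}
```

With this definition, compose_cpp("12345", "1234.com") yields "12345@1234.com", the same shape of string the test data produces.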
These results do depend heavily upon the standard library implementation though--specifically, the fact that std::string implements the short-string optimization. Running the same code on the same hardware, but using an implementation that lacks this optimization (the nuwen MinGW distribution of gcc 4.9.1, in my case) gives quite different results:
2.689 seconds for C++
1.131 seconds for C
In this case, the C code is a little faster than the C code from VC++, but the C++ code has slowed by a factor of about 4. I tried some different compiler flags (-O2 vs. -O3, etc.) but they had only minimal effect--for this test, the lack of short string optimization clearly dominates the other factors.
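To check whether a given standard library applies the short string optimization to strings of this size, one rough test is to see whether the character buffer sits inside the string object itself. The following is only a sketch: stored_inline is a name made up for illustration, and comparing addresses this way relies on how implementations typically lay out std::string rather than on anything the standard guarantees. The strings composed in this test are 14 characters long (5 + 1 + 8), so that's the length worth checking.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns true if s keeps its characters in a buffer
// embedded in the string object (i.e. the short string optimization is in
// effect for this string), false if the buffer lives elsewhere.
bool stored_inline(std::string const &s) {
    std::uintptr_t obj = reinterpret_cast<std::uintptr_t>(&s);
    std::uintptr_t buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

Calling stored_inline(std::string(14, 'x')) should return true on an implementation whose inline buffer is large enough and false on one that lacks the optimization. A string of a few thousand characters should give false either way.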
Bottom line: I think this confirms that the C++ code can be substantially faster than the C code, but achieving that speed depends much more heavily on the quality of the standard library implementation than the C code's speed does. If the implementation fails to provide the short string optimization, the C++ code can easily be 2x slower instead of 2x faster than the C version.
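If you're stuck with a library that lacks the short string optimization, one thing that might narrow the gap is to do what the C version does: compute the final length up front and allocate once. This isn't something measured above, so treat it strictly as a sketch. compose_reserved is a hypothetical name, not part of the original code.

```cpp
#include <string>

// Hypothetical variant: reserve the full length first so that appending
// name, '@', and domain needs at most one allocation, instead of building
// intermediate temporaries as name + "@" + domain may.
std::string compose_reserved(std::string const &name, std::string const &domain) {
    std::string res;
    res.reserve(name.size() + 1 + domain.size());
    res += name;
    res += '@';
    res += domain;
    return res;
}
```

It produces the same result as the concatenation used in the constructor for composed, so it could be dropped into the timing loop in its place to see whether it helps on a particular implementation.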