I'm diagnosing an edge case in a cross platform (Windows and Linux) application where toupper is substantially slower on Windows. I'm assuming this is the same for tolower as well.
Originally I tested this with a simple C program on each without locale information set or even including the header file and there was very little performance difference. Test was a million iteration loop calling each character for a string to the toupper() function.
After including the header file and including the line below it's much slower and calls a lot of the MS C runtime library locale specific functions. This is fine but the performance hit is really bad. On Linux this doesn't appear to have any affect at all on performance.
setlocale(LC_ALL, ""); // system default locale
If I set the following it runs as fast as linux but does appear to skip all the locale functions.
setlocale(LC_ALL, NULL); // should be interpreted as the same as below?
OR
setlocale(LC_ALL, "C");
Note: Visual Studio 2015 for Windows 10 G++ for Linux running Cent OS
Have tried dutch settings settings and same outcome, slow on Windows no speed difference on Linux.
Am I doing something wrong or is there a bug with the locale settings on Windows or is it the other way where linux isn't doing what it should? I haven't done a debug on the linux app as I'm not as familiar with linux so do not know exactly what it's doing internally. What should I test next to sort this out?
Code below for testing (Linux):
// C++ is only used for timing. The original program is in C.
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <chrono>
#include <locale.h>
using namespace std::chrono;
void strToUpper(char *strVal);
int main()
{
typedef high_resolution_clock Clock;
high_resolution_clock::time_point t1 = Clock::now();
// set locale
//setlocale(LC_ALL,"nl_NL");
setlocale(LC_ALL,"en_US");
// testing string
char str[] = "the quick brown fox jumps over the lazy dog";
for (int i = 0; i < 1000000; i++)
{
strToUpper(str);
}
high_resolution_clock::time_point t2 = Clock::now();
duration<double> time_span = duration_cast<duration<double>>(t2 - t1);
printf("chrono time %2.6f:\n",time_span.count());
}
void strToUpper(char *strVal)
{
unsigned char *t;
t = (unsigned char *)strVal;
while (*t)
{
*t = toupper(*t);
*t++;
}
}
For windows change the local information to:
// set locale
//setlocale(LC_ALL,"nld_nld");
setlocale(LC_ALL, "english_us");
You can see the locale change from the separator in the time completed, full stop vs comma.
EDIT - Profiling data
As you can see above most of the time spent in child system calls from _toupper_l.
Without the locale information set the toupper call does NOT call the child _toupper_l which makes it very quick.