I am writing a hex-to-binary conversion function in C++ with some efficiency in mind, and as iostream's hex specifier does not work for chars (!?!), I decided to use sscanf. It reads from one vector and writes into another vector. The problem is that in one of the cases the program is extremely slow. Here is the example: it copies a ~260000-character string into the vector `in`, and then in the loop every two hex characters from the input are converted into one byte in the output.

The string is in the file "key.h", which contains something like this:

const char *key={
"6e7fdb71e32c5f8420100028201010.... {260000 characters more} ..."}

;

#include <cstring>
#include <vector>
#include <stdlib.h>
#include <stdio.h>

//~262150-byte file containing a large string, assigned to the variable key
#include "key.h"

//hex to bin conversion demo
int main()
{
    std::vector<char> in;
    std::vector<char> out;

    //output buffer size is half of the input buffer size
    out.resize(strlen(key)/2);

    //read hex string into vector
    in.insert(in.begin(),&key[0],&key[strlen(key)]);

    for (unsigned int bi = 0; bi < in.size()/2; ++bi) {

#if 0   //this version is VERY slow
        const char *ptr = &in[2*bi];
#else   //but not this
        char ptr[2];
        memcpy(ptr,&in[2*bi],2);
#endif
        sscanf(ptr, "%2hhX", &out[bi]);
    }
}

Now the problem: in this state the program gives the following results:

time ./test

real    0m0.024s
user    0m0.024s
sys     0m0.000s

Changing #if 0 to #if 1 and... magic:

time ./test

real    0m0.611s
user    0m0.610s
sys     0m0.001s

It is 25 times slower. The only difference is that in the slow version (#if 1) sscanf receives a pointer directly into the input vector's memory.

In the fast version (#if 0), the two characters are first copied to a local buffer on the stack, and sscanf is then called with a pointer to that buffer.

So, why is the second case so slow?

Nuclear
  • I would hazard a guess that `const char *ptr = &in[2*bi];` is actually pointing to the entire content of the vector from the index of `2*bi`. This could be causing `sscanf()` to run slowly. – NathanOliver Feb 26 '15 at 15:12
  • @NathanOliver you are right. And the reason can be found here: http://stackoverflow.com/questions/23923924/why-is-glibcs-sscanf-vastly-slower-than-fscanf-on-linux – Qmick Zh Feb 26 '15 at 15:50
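
If the comments are right that sscanf ends up scanning the whole remaining string on every call (which would make the loop roughly quadratic in the input length), one way to sidestep the issue is to decode the digit pairs by hand. Below is a minimal sketch of that idea, not the asker's code: `hex_digit_value` and `hex_to_bin` are hypothetical helper names, and the sketch assumes that an invalid digit should abort the conversion and that a trailing unpaired character is ignored, as in the original loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post):
// returns the value 0..15 of one hex digit, or -1 if c is not a hex digit.
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: converts len hex characters into len/2 bytes.
// Each input character is read exactly once, so the cost is linear in len.
// Returns false as soon as a non-hex character is found.
static bool hex_to_bin(const char *hex, std::size_t len,
                       std::vector<unsigned char> &out)
{
    out.clear();
    out.reserve(len / 2);
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        const int hi = hex_digit_value(hex[i]);
        const int lo = hex_digit_value(hex[i + 1]);
        if (hi < 0 || lo < 0) return false;
        out.push_back(static_cast<unsigned char>((hi << 4) | lo));
    }
    return true;
}
```

With the program from the question this could be called as `hex_to_bin(key, strlen(key), out)` with `out` declared as `std::vector<unsigned char>`. If sscanf is kept instead, note that the 2-byte local buffer in the fast variant is not null-terminated; a 3-byte buffer with an explicit `'\0'` in the last position would be the safer form of that workaround.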
