1

I am using isdigit() function in c++, but i found it's slow, so i implemented my own is_digit(), see my code below:

#include<iostream>
#include<cctype>
#include<ctime>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    time_t t1=clock(),t2,t3;
    for(int i=0;i<1e9;i++)
        is_digit(c);
    t2=clock();
    for(int i=0;i<1e9;i++)
        isdigit(c);
    t3=clock();
    cout<<"is_digit:"<<(t2-t1)/CLOCKS_PER_SEC<<"\nisdigit:"<<(t3-t2)/CLOCKS_PER_SEC<<endl;

    return 0;
}

After running, is_digit() took only 1 second(1161ms), but isdigit() took 4 seconds(3674ms), I know that isdigit is implemented by bit operation, Shouldn't isdigit() be faster than is_digit()?


update1

I use MS VS2010 with default option, release version, how do i do to make isdigit() faster than is_digit() in VS?

update2

Thanks to all of you. When in release mode in VS, project will be optimized for speed default(-O2).

All in release mode.

VS2010: is_digit:1182(ms) isdigit:3724(ms)

VS2013: is_digit:0(ms) isdigit:3806(ms)

Codeblocks with g++(4.7.1) with -O3: is_digit:1275(ms) isdigit:1331(ms)

So here is the conclusion:

is_digit() is faster than isdigit() in VS but slower than isdigit() in g++.

And isdigit() in g++ is faster than isdigit() in VS.

So "VS sucks" in performance?

user1024
  • 982
  • 4
  • 13
  • 26
  • 1
    How did you compile the code? With `g++ -O0` the libc version is faster for me. – vanza Feb 10 '15 at 06:19
  • 8
    How do you compile? Also, [`std::isdigit`](http://en.cppreference.com/w/cpp/locale/isdigit) considers `locale`s – Ivan Aksamentov - Drop Feb 10 '15 at 06:19
  • It might be that isdigit() gets "function called" and your is_digit() gets inlined. – BitTickler Feb 10 '15 at 06:19
  • 1
    As vanza mentioned above, what are your compiler settings? Also, what happens when you make it a non-inline function? – Cloud Feb 10 '15 at 06:19
  • 2
    Findings for gcc: With -O0, your version is twice as slow as the libc version. With -O3, both calls are optimized out, because the results are not used. – Markus Mayr Feb 10 '15 at 06:24
  • on my bsd vm with clang and libc++ both are "4" with -O0 and with -O3 they are optimized away :) – BitTickler Feb 10 '15 at 06:27
  • I can't imagine those loops would even exist in a fully optimized binary. Also, `isdigit` does more work than your version does. – Ed S. Feb 10 '15 at 06:30
  • For me; with -O3 (g++) is_digit:1 isdigit:1 with -O0 is_digit:3 isdigit:1 – Doonyx Feb 10 '15 at 06:35
  • Well I compile my code with MS VS2010 with default option. BTY what do -O3 and -O0 mean? – user1024 Feb 10 '15 at 06:46
  • @Dogbert when I make it a non-inline function, `is_digit()` still faster than `isdigit()`? How can i add -O3 like option in VS? – user1024 Feb 10 '15 at 07:07
  • Most probably you are at debug mode and i does a lot of unneccasary checks and asserts because of that.But again 4 sec is too much :D – oknsnl Feb 10 '15 at 07:09
  • @oknsnl I am at release mode, but after i set -O2 flag in VS like -O3 in g++, `is_digit()` still beats `isdigit()`, and `isdigit()` took more time in VS than in g++. – user1024 Feb 10 '15 at 07:20
  • 3
    Comparing non-optimized builds is pointless. And you must use the result of the function or it'll be optimized out and the time needed to run the loop is 0. Besides, to measure performance correctly, use [`QueryPerformanceCounter`](http://stackoverflow.com/questions/1739259/how-to-use-queryperformancecounter) https://msdn.microsoft.com/en-us/library/windows/desktop/ms644904%28v=vs.85%29.aspx – phuclv Feb 10 '15 at 08:17
  • Minor note: `isdigit` receives an `int`, **not** `char` – phuclv Feb 10 '15 at 08:42

4 Answers4

2

In clang/llvm [the compiler of my choice], isdigit and is_digit will turn into exactly the same code, as it has optimisation for that specific library call to translate it into ((unsigned)(c-48) < 10u).

The return c>='0' && c <='9'; is also turned into c-48 > 10 by the optimisation (as a generic if x >= N && x <= M -> x-N > (M-N) conversion that the compiler does).

So, in theory, both the loops SHOULD turn into the same code (at least with a compiler that has this type of optimisation for isdigit - whether MSVC does or not, I can't say, as the source code is not available to the general public). I know that gcc has similar code to optimise library calls, but I don't have gcc source on my machine at present, and I can't be bothered to look it up [in my experience, it'll be a bit more difficult to read than the llvm code, anyways].

Code in llvm:

Value *LibCallSimplifier::optimizeIsDigit(CallInst *CI, IRBuilder<> &B) {
  Function *Callee = CI->getCalledFunction();
  FunctionType *FT = Callee->getFunctionType();
  // We require integer(i32)
  if (FT->getNumParams() != 1 || !FT->getReturnType()->isIntegerTy() ||
      !FT->getParamType(0)->isIntegerTy(32))
    return nullptr;

  // isdigit(c) -> (c-'0') <u 10
  Value *Op = CI->getArgOperand(0);
  Op = B.CreateSub(Op, B.getInt32('0'), "isdigittmp");
  Op = B.CreateICmpULT(Op, B.getInt32(10), "isdigit");
  return B.CreateZExt(Op, CI->getType());
}

For those not familiar with LLVM code: It first checks that the function call has the correct number of parameters and parameter types. If that fails, it returns NULL to indicate "I can't optimise this". Otherwise, it builds the chain of operations to do the if (c - '0' > 10) using unsigned comparison to cope with "negative" values [which in unsigned are huge values].

It would goes wrong if you do this:

bool isdigit(int x)
{
   return image_contains_finger(imagefiles[x]); 
}

[But then replacing library functions with your own version that does something will most likely have interesting effects in general!]

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
2

Your function is_digit can be implemented faster by:

#define ISDIGIT(X) (((uint32_t)X - '0') < 10u)

where you save one comparison. I think, that this is the normal approch in gcc, but in Microsoft Visual Studio i guess you had a localized version of isdigit() (which therefor takes a long time in checking locales).

PatrickF
  • 594
  • 2
  • 11
1

Have a look on this code (works with g++) with -O3

#include<iostream>
#include<cctype>
#include<ctime>
#include <time.h>
#include <sys/time.h>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    struct timeval tvSt, tvEn;
    time_t t1=clock(),t2,t3;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        is_digit(c);
    gettimeofday(&tvEn, 0);
    cout << "is_digit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        isdigit(c);
    gettimeofday(&tvEn, 0);
    cout << "isdigit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;

    return 0;
}

Results:

is_digit:1610771 us
isdigit:1055976 us

So, C++ implementation beats yours.
Normally, when you measure performance, it's not a good idea to do it with seconds. At lease consider microseconds level.

I'm not sure about VS. Please find out microsecond level clock and measure.

PS. Please refer https://msdn.microsoft.com/en-us/library/19z1t1wy.aspx for VS optimizations

Doonyx
  • 580
  • 3
  • 12
  • What the `-O3` option means? By the way, how can i do to make `isdigit()` faster than `is_digit()` in VS? – user1024 Feb 10 '15 at 06:54
  • @user1024 -O3 is an optimization flag used in g++. It allows a higher level of compile time optimization. There should be some optimization flags in VS as well. I don't know much about VS. However, if you your simple code beats VS implementation, that should be another reason to say "VS sucks" in terms of performance. I'm not sure. – Doonyx Feb 10 '15 at 07:00
  • @user1024 maybe https://msdn.microsoft.com/en-us/library/19z1t1wy.aspx explains them? – Doonyx Feb 10 '15 at 07:05
  • I see, but after i set -O2 flag in VS (like -O3 in g++), `is_digit()` still beats `isdigit()`, and `isdigit()` took more time in VS than in g++, "VS sucks" may be true. What's strange is that after i set -O2 flag in VS, 'is_digit()` took 0 us, is that because char c always be '8'? – user1024 Feb 10 '15 at 07:22
  • Not convinced this is measuring anything. Both loops are optimized out entirely by any decent optimizer. And [I'm extremely doubtful that you compiled your code with `-O3`](http://coliru.stacked-crooked.com/a/728485a5b4d7dba3). – T.C. Feb 10 '15 at 07:30
  • @T.C. I have no clue why you doubt using -O3? I have no idea how -O2 comes from your link. But I compiled in it my Linux box with -O3. You need a screen capture to prove this. On the other hand, if you are "Not convinced", please present your facts rather just saying "Not convinced". – Doonyx Feb 10 '15 at 07:36
  • @user1024 Can you measure the time in microsecond rather seconds? – Doonyx Feb 10 '15 at 07:38
  • Sure, [`-O3`](http://coliru.stacked-crooked.com/a/9bca6df1ec46d1f6), doesn't change a thing. Which version of GCC you are using? – T.C. Feb 10 '15 at 07:41
  • @T.C. Moreover, how could you say which optimization flags which I used by just looking at the code? – Doonyx Feb 10 '15 at 07:42
  • @T.C. gcc version 4.7.2 – Doonyx Feb 10 '15 at 07:44
  • 4
    Because I expected the optimizer to take out the loop entirely, and not waste millions of microseconds running the loop. Turns out that before GCC 4.8 [it optimizes away the function call but not the loop itself when your loop condition does a comparison with the floating point number `1e9`](http://goo.gl/Iw85mk), but does take out the loop too [when you use `1000000000` instead](http://goo.gl/l37WFA). Regardless, the `is_digit`/`isdigit` calls are still optimized away, so you are still not measuring anything meaningful. – T.C. Feb 10 '15 at 07:54
  • 1
    @SargiEran Thanks to you. `isdigit()` is faster than `is_digit()` in g++. See my update. I am going to use g++. – user1024 Feb 10 '15 at 08:19
  • 1
    it's not strange at all. Any optimized compilers will eliminate those non-sense loops away and the result should be 0. It's not that VS is stupid but because your code is a very bad performance checker – phuclv Feb 10 '15 at 08:29
  • I agree with @phuclv, you need to run a better test, certainly not with a constant char. I'll have to make one myself to check the real speed. – Octo Poulos Dec 30 '22 at 16:35
0

The proposed benchmark from @Doonyx fell into several pitfalls:

  1. using a constant char c = '8'; ... any compiler would understand it doesn't change and could cache or skip the result.
  2. the loop is running a function but the result is not used anywhere => again, compilers can just skip the loop altogether.
  3. it doesn't take into account the CPU performance delta, the CPU could take some time to "wake up", and generally its performance can very over time.

=> I made modifications of that benchmark to solve all 3 points.

// gcc main.cpp -O3 -std=c++20 -lstdc++ && ./a.out

#include <chrono>
#include <iomanip>
#include <iostream>
#include <map>
#include <vector>

// basic function
static inline bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// optimized function
constexpr bool is_digit2(int c)
{
    return (uint32_t)(c - '0') < 10u;
}

constexpr int NUM_STEP = 8;
constexpr int TRIM     = 2;

#define NOW_NS() std::chrono::high_resolution_clock::now().time_since_epoch().count()

int main()
{
    int64_t                                     sum;
    std::map<std::string, std::vector<int64_t>> nameTimes;
    std::map<std::string, int64_t>              nameAvgs;

// convenience define to run the benchmark
#define RUN_BENCH(name, code)                                                        \
    do                                                                               \
    {                                                                                \
        const auto start = NOW_NS();                                                 \
        sum              = 0;                                                        \
        for (int i = 0; i < 1000000000; ++i)                                         \
            sum += code;                                                             \
        const auto name##Time = NOW_NS() - start;                                    \
        nameTimes[#name].push_back(name##Time);                                      \
        std::cout << step << " " << std::setw(11) << #name << ": "                   \
                << std::setw(10) << name##Time << " ns  sum=" << sum << std::endl; \
    }                                                                                \
    while (0)

    // 1) run the benchmark NUM_STEP times
    // note that a null test is added to compute the overhead
    for (int step = 0; step < NUM_STEP; ++step)
    {
        RUN_BENCH(_null, i & 15);
        RUN_BENCH(is_digit, is_digit(i & 255));
        RUN_BENCH(is_digit2, is_digit2(i & 255));
        RUN_BENCH(std_isdigit, std::isdigit(i & 255));
    }

    // 2) remove the 25% slowest and 25% fastest runs for each benchmark (Interquartile range)
    std::cout << "\ncombining:\n";
    for (auto& [name, times] : nameTimes)
    {
        int64_t total = 0;
        std::sort(times.begin(), times.end());
        std::cout << std::setw(11) << name;
        for (int i = 0; i < NUM_STEP; ++i)
        {
            std::cout << " " << i << ":" << times[i];
            if (i >= TRIM && i < NUM_STEP - TRIM)
            {
                std::cout << "*";
                total += times[i];
            }
        }
        total /= (NUM_STEP - TRIM * 2);
        std::cout << " => " << total << " ns\n";
        nameAvgs[name] = total;
    }

    // 3) show the results + results MINUS the overhead (null time)
    std::cout << "\nsummary:\n";
    for (auto& [name, time] : nameAvgs)
    {
        std::cout << std::setw(11) << name << ": " << std::setw(10) << time << " ns "
                << " time-null: " << std::setw(10) << time - nameAvgs["_null"] << " ns\n";
    }

    return 0;
}

So, each benchmark is a bit more complex and forces the compiler to actually execute the code, they're run sequentially, and then 8 times, to take the CPU performance variation into account, and then the slowest/fastest runs are discarded, and in the final summary, the time of the overhead is subtracted, to have an idea of the true speed of the functions.

gcc 11.2.0 with -O0:
      _null:  680327226 ns  time-null:          0 ns
   is_digit: 1368190759 ns  time-null:  687863533 ns
  is_digit2: 1223091465 ns  time-null:  542764239 ns
std_isdigit:  733283544 ns  time-null:   52956318 ns *

msvc 17.3.4 with -O0:
      _null:  576647075 ns  time-null:          0 ns
   is_digit: 1348345625 ns  time-null:  771698550 ns
  is_digit2:  754253650 ns  time-null:  177606575 ns *
std_isdigit: 1619403975 ns  time-null: 1042756900 ns

gcc 11.2.0 with -O1:
      _null:  217714988 ns  time-null:          0 ns
   is_digit:  459088203 ns  time-null:  241373215 ns
  is_digit2:  434988334 ns  time-null:  217273346 ns *
std_isdigit:  435391905 ns  time-null:  217676917 ns *

msvc 17.3.4 with -O1:
      _null:  217425875 ns  time-null:          0 ns
   is_digit:  442688400 ns  time-null:  225262525 ns *
  is_digit2:  440954975 ns  time-null:  223529100 ns *
std_isdigit: 1187352900 ns  time-null:  969927025 ns

gcc 11.2.0 with -O2:
      _null:  217411308 ns  time-null:          0 ns
   is_digit:  542259068 ns  time-null:  324847760 ns
  is_digit2:  434180245 ns  time-null:  216768937 ns *
std_isdigit:  435705056 ns  time-null:  218293748 ns *

msvc 17.3.4 with -O2:
      _null:  209602025 ns  time-null:          0 ns
   is_digit:  441704325 ns  time-null:  232102300 ns
  is_digit2:  298747075 ns  time-null:   89145050 ns *
std_isdigit: 1198361400 ns  time-null:  988759375 ns

gcc 11.2.0 with -O3:
      _null:  126789606 ns  time-null:          0 ns
   is_digit:  206127551 ns  time-null:   79337945 ns
  is_digit2:  175606336 ns  time-null:   48816730 ns *
std_isdigit:  174991923 ns  time-null:   48202317 ns *

msvc 17.3.4 with -Ox:
      _null:  206283850 ns  time-null:          0 ns
   is_digit:  434584200 ns  time-null:  228300350 ns
  is_digit2:  312153225 ns  time-null:  105869375 ns *
std_isdigit: 1176565150 ns  time-null:  970281300 ns

Conclusion:

  • on gcc, std::isdigit is as fast as the is_digit2 function
  • on msvc, std::isdigit is 9x slower than is_digit2 (but this might be due to a locale setting)
Octo Poulos
  • 523
  • 6
  • 5