0

I have a templated function for returning the number of digits in a number:

    template <typename R>
    static inline unsigned count(const R num)
    {
        if(num < 10)                    return 1;
        else if (num < 100)             return 2;
        else if (num < 1000)            return 3;
        else if (num < 10000)           return 4;
        else if (num < 100000)          return 5;
        else if (num < 1000000)         return 6;
        else if (num < 10000000)        return 7;
        else if (num < 100000000)       return 8;
        else if (num < 1000000000)      return 9;
        else if (num < 10000000000ULL)          return 10;
        else if (num < 100000000000ULL)         return 11;
        else if (num < 1000000000000ULL)        return 12;
        else if (num < 10000000000000ULL)       return 13;
        else if (num < 100000000000000ULL)      return 14;
        else if (num < 1000000000000000ULL)     return 15;
        else if (num < 10000000000000000ULL)    return 16;
        else if (num < 100000000000000000ULL)   return 17;
        else if (num < 1000000000000000000ULL)  return 18;
        else if (num < 10000000000000000000ULL) return 19;
        else                                    return 20;
    }

However when I compile (GCC) I get the following warning:

warning: comparison is always true due to limited range of data type

I understand why I get this repeatedly but I'm not sure how to suppress/avoid it.

Any thoughts?

Graeme
  • 4,514
  • 5
  • 43
  • 71
  • While not an answer to the question, you can look at the solutions in this [similar question](http://stackoverflow.com/questions/554521/how-can-i-count-the-digits-in-an-integer-without-a-string-cast). I don't think the log solutions would be good, but the ones like SLaks answer here are probably fast. – Mark Wilkins Jan 06 '11 at 18:42
  • 2
    Maybe you should use the standard facilities instead of calculating manually. How about `count = (int)log10(num) + 1;` – Gene Bushuyev Jan 06 '11 at 18:56

11 Answers11

2

You can avoid the warning be rewriting your method as

unsigned long long max = 10;
int order = 1;
while(num >= max && max * 10 > max) {
    max *= 10;
    order++;
}
return order;

I don't know whether this would be faster or slower.

SLaks
  • 868,454
  • 176
  • 1,908
  • 1,964
  • This would be slower given you have to calculate the value on each call to the function. – Graeme Jan 06 '11 at 18:37
  • 2
    @Graeme: Not necessarily. Those branches you got there are quite expensive, and they're not even binary sorted. – Puppy Jan 06 '11 at 18:39
  • There is one decision per digit in my lookup solution, this solution has 1 per loop (loop condition) as well as a multiplcation and post-increment. – Graeme Jan 06 '11 at 18:43
  • 1
    @Graeme: Except this solution's instructions are all loaded at once, whereas yours all have to be loaded and executed separately. Get a profiler. – Puppy Jan 06 '11 at 18:57
  • -1: This will overflow max and loop infinitely if `num` is between the max value of an `R` and the largest multiple of 10 that fits in an `R`. http://codepad.org/HM2SG9lE – Steve M Jan 06 '11 at 19:17
  • @SLaks: I gave you back your point, but now you're doing two multiplications and two comparisons per iteration, which is almost certainly slower than one comparison and one division per iteration. – Steve M Jan 06 '11 at 19:23
  • @Steve: I'm relying on the optimizer there. I didn't check, but I'm hoping that it will reuse the multiplication. – SLaks Jan 06 '11 at 19:24
  • 2
    Re perf: seriously, is this your bottleneck? The two approaches have asymptotic run time complexity (linear in both cases); the loop the added benefit of being fixed code size O(1) rather than O(lg(max)) with the conditional chain. – Aaron Jan 06 '11 at 19:42
2

-Wtype-limits warnings can be suppressed case-by-case by wrapping comparisons with constants into a dummy binary function, which accepts operands to be compared. In this case the code above can be transformed into something like:

// Dummy function, which suppresses -Wtype-limits warnings
// where appropriate.
template <typename R>
static inline bool dummy_less(const R a, const R b)
{
  return (a < b);
}

template <typename R>
static inline unsigned count(const R num)
{
    if (dummy_less(num, 10))        return 1;
    else if (dummy_less(num, 100))  return 2;
    // ...
    else                            return 20;
}

The compiler should easily propagate constants into dummy functions.

valyala
  • 11,669
  • 1
  • 59
  • 62
1

You could specialize or overload count() for types that aren't as large as unsigned long long.

Puppy
  • 144,682
  • 38
  • 256
  • 465
  • I currently do this, i.e. an overload for uint8, uint16, uint32 and uint64. I'm looking to consolidate into a single function. – Graeme Jan 06 '11 at 18:39
1

Apparently the type of your data (R) has smaller size than some of your constants in if()'s B.t.w. wouldn't it be better to have while(num/10) loop?

Example,

template <typename R>
unsigned count(R num)
{
    size_t i = 0;
    while(num /= 10)
        ++i;
    return i;
}
Gene Bushuyev
  • 5,512
  • 20
  • 19
  • +1: `while ( num /= 10 ) ++count;` is a great construct for this case! – David Rodríguez - dribeas Jan 06 '11 at 18:39
  • This needs to be performant, a lookup is alot quicker than calculating on every call when you have a small finite set of possible return values; i.e. in the range 1 to 20. – Graeme Jan 06 '11 at 18:40
  • 1
    Division is more expensive than multiplication. – SLaks Jan 06 '11 at 18:43
  • @Graeme -- performance? -- maybe, but rather unlikely, have you profiled to prove it's a bottleneck? – Gene Bushuyev Jan 06 '11 at 18:45
  • @SLaks -- that was long time ago. Since Pentium, processors have multiple pipelines and capable of doing several instructions per cycle. Performance in such code is all about memory, not about processor. – Gene Bushuyev Jan 06 '11 at 19:02
  • Yes, I just re-ran some performance and the lookup is conserably quicker, especially when moving towards larger numbers. – Graeme Jan 06 '11 at 19:03
  • @Graeme -- the question is not whether inlined `if();else;` is faster than an unrolled loop. The question is whether it's a bottleneck. It's hard to imagine an application where it would matter. – Gene Bushuyev Jan 06 '11 at 20:07
1

You could add the -Wno-type-limits to disable that specific warning. However, since it's a template function, you can't isolate that warning flag to a particular translation unit, so you'd have to enable that flag for your entire project, which may be undesirable.

Adam Rosenfield
  • 390,455
  • 97
  • 512
  • 589
1

If you don't care about user-defined integer types (and evidence suggests that you don't care about negative values, either), just define one function which takes the largest type that you care about:

inline unsigned count(unsigned long long num){
    if(num < 10)                    return 1;
    else if (num < 100)             return 2;
    // blah blah
    else return 20;
}

If you call it with a signed short or whatever, you won't get any warnings about the implicit conversion, since it's a widening.

static_cast<unsigned>(log10(num)) + 1 is also worth profiling.

Steve M
  • 8,246
  • 2
  • 25
  • 26
0

May be a static_cast would help (yes it looks horrid, but this approach hmm...)?

    else if (static_cast<unsigned long long>(num) < 10000000000ULL)          return 10;
    else if (static_cast<unsigned long long>(num) < 100000000000ULL)         return 11;
    else if (static_cast<unsigned long long>(num) < 1000000000000ULL)        return 12;
    else if (static_cast<unsigned long long>(num) < 10000000000000ULL)       return 13;
    else if (static_cast<unsigned long long>(num) < 100000000000000ULL)      return 14;
    else if (static_cast<unsigned long long>(num) < 1000000000000000ULL)     return 15;
    else if (static_cast<unsigned long long>(num) < 10000000000000000ULL)    return 16;
    else if (static_cast<unsigned long long>(num) < 100000000000000000ULL)   return 17;
    else if (static_cast<unsigned long long>(num) < 1000000000000000000ULL)  return 18;
    else if (static_cast<unsigned long long>(num) < 10000000000000000000ULL) return 19;

etc.

Nim
  • 33,299
  • 2
  • 62
  • 101
0
// Beware. brain-compiled code ahead!
namespace {
  inline unsigned count_long_long(unsigned long long num)
    if (num < 10000000000ULL)          return 10;
    else if (num < 100000000000ULL)         return 11;
    else if (num < 1000000000000ULL)        return 12;
    else if (num < 10000000000000ULL)       return 13;
    else if (num < 100000000000000ULL)      return 14;
    else if (num < 1000000000000000ULL)     return 15;
    else if (num < 10000000000000000ULL)    return 16;
    else if (num < 100000000000000000ULL)   return 17;
    else if (num < 1000000000000000000ULL)  return 18;
    else if (num < 10000000000000000000ULL) return 19;
    else                                    return 20;
  }
  template <typename R>
  inline unsigned count_long_long(const R num) {return 20;}
}

template <typename R>
inline unsigned count(const R num)
{
  if(num < 10)                    return 1;
  else if (num < 100)             return 2;
  else if (num < 1000)            return 3;
  else if (num < 10000)           return 4;
  else if (num < 100000)          return 5;
  else if (num < 1000000)         return 6;
  else if (num < 10000000)        return 7;
  else if (num < 100000000)       return 8;
  else if (num < 1000000000)      return 9;
  else return count_long_long(num);
}
sbi
  • 219,715
  • 46
  • 258
  • 445
  • What if `R` is a `char` or `short`? – UncleBens Jan 06 '11 at 23:07
  • @UncleBens: Of course, then you would have to divide the `count()` functions further using the same scheme. If there are many types to consider, it would make sense to resort to template-meta programming. But it seems there's only 8, 16, 32, and 64bit, so dividing according to this size should be enough. – sbi Jan 07 '11 at 00:16
0

Perhaps generate the right amount of comparisons at compile-time?

#include <limits>

template <class T, int power>
struct pow10
{
    static const T value = 10 * pow10<T, power - 1>::value;
};

template <class T>
struct pow10<T, 0>
{
    static const T value = 1;
};

template <class T, int power_of_ten, bool recurse>
struct digit_counter
{
    unsigned count(T value) const
    {
        if (value < pow10<T, power_of_ten>::value) return power_of_ten;
        else return digit_counter<T, power_of_ten + 1, power_of_ten < std::numeric_limits<T>::digits10>().count(value);
    }
};

template <class T, int power_of_ten>
struct digit_counter<T, power_of_ten, false>
{
    unsigned count(T ) const
    {
        return   std::numeric_limits<T>::digits10 + 1;
    }
};

template <class T>
unsigned count(T value)
{
    return digit_counter<T, 1, (std::numeric_limits<T>::digits10 > 1)>().count(value);
}

When optimized, it should yield pretty much identical binary to the original.

UncleBens
  • 40,819
  • 6
  • 57
  • 90
0

Since it's an inline function, why don't you trust the compiler a little and make it

static inline unsigned count(const R unsigned long long);

which should then promote its argument of whatever integer type to unsigned long long (standards pedants pick me up on this, but I assume it would) and run the code as given, which gives the same result as it would in your templated case.

In practice the compiler should (test it if you're bothered) remove the unnecessary conditions if they spot the input to have restricted range, which is in any case what you're hoping for in the templated case. The additional cost over templating is one promotion from <num> to unsigned long long, assuming, again, that your compiler's an idiot.

ijw
  • 4,376
  • 3
  • 25
  • 29
0

I have an issue with your approach.

Your code is not optimized, because you do a linear search instead of a binary search.

A quick example, say that your number is between 0 and 65535 (inclusive) you should probably use the following algorithm:

if num < 1000:
  if num < 100:
     if num < 10: return 1
     return 2
  return 3

if num < 10000: return 4
return 5

This is a binary search, there is at most 3 comparisons in the worst case (numbers of 1 or 2 digits) and all the others are only 2 comparisons.

This implies that there is information in the size of your integer, that should not be discarded thoughtlessly.

namespace detail
{
  template <int Size> struct number_bits {};

  template <typename Unsigned>
  number_bits< sizeof(Unsigned)*8 > count_number_bits(Unsigned)
  {
    return number_bits< sizeof(Unsigned)*8 >();
  }

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<8>);

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<16>);

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<32>);

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<64>);

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<128>);

  template <typename Unsigned>
  unsigned number_digits_helper(Unsigned t, number_bits<256>);

} // namespace detail


template <typename Unsigned>
unsigned number_digits(Unsigned t)
{
  static_assert(!std::numeric_limits<Unsigned>::is_signed, "t is signed");
  return detail::number_digits_helper(t, detail::count_number_bits(t));
}

And now, you can fully optimize each of the routines per number of bits of the unsigned type (I know, there is no real reason there would be more than one unsigned type per size, would it ?)

For optimizing when the size is known, I recommend you have a look at Bit Twiddling Hacks which is all about micro optimizations.

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722