13

Is there a fairly standard C (Linux) function, or a code-efficient but good-performing approach, for comparing two integers of an arbitrary size?

I'm looking for something with the parameters int intcmp(const void *a, const void *b, size_t size) that works on integers a and b for any practical size size. (memcmp() would work (I think) if the architecture was big endian.)

The implementation I tend to use goes like this (with improvements from Efficient integer compare function) but it's not completely generic and has enough of a code overhead that I generally think twice before slotting it in.

int intcmp(const void *a, const void *b, size_t size) {

    #define CASE_SIZE_RETURN_A_B_CMP(_t) \
        case sizeof(_t): \
            return ((*(_t *)(a) > *(_t *)(b)) - (*(_t *)(a) < *(_t *)(b)))

    switch (size) {
    CASE_SIZE_RETURN_A_B_CMP(char);
    CASE_SIZE_RETURN_A_B_CMP(short);
    CASE_SIZE_RETURN_A_B_CMP(int);
    CASE_SIZE_RETURN_A_B_CMP(long long);
    }
    #undef CASE_SIZE_RETURN_A_B_CMP

    assert(0);
    return 0;
}
Community
  • 1
  • 1
antak
  • 19,481
  • 9
  • 72
  • 80
  • 4
    I doubt that there's a standard function for this. Note that your solution doesn't work if you need both signed and unsigned types. – Mark Ransom May 13 '13 at 05:43
  • 6
    As soon as you write of "integers of an arbitrary size", I think you're beyond the realm covered by "if the architecture was big endian", since that sort of low-level architectural consideration doesn't admit of integers of arbitrary size. Rather, you need to decide how you intend to represent (say) 1024-bit integers, and then write your function accordingly. – ruakh May 13 '13 at 05:46
  • 3
    Why not just use a simpler version of the macro inline where needed? – leppie May 13 '13 at 05:55
  • 1
    If your arbitrary size is limited to pre-defined datatypes (looks like that from your post), I would go with your approach, without macros, with the above macros you are not gaining anything, but making is more complicated to read. – Sudhee May 13 '13 at 06:58
  • 2
    If you need such a function in a C program, you *might* be doing it the wrong way, i.e. the code that would be calling that function might need to be redesigned. If you *really* need to do this, you should either consider using C++ and its templates or maybe C11's generic functions. If nothing else is possible, your solution is not so bad and is probably the best you can have (although I would probably not use a macro, despite code duplication). – Fabien May 13 '13 at 12:44
  • I know you acknowledged it's not completely generic. Specifically, you skipped `long` and you don't provide a way to distinguish between signed and unsigned types. – Keith Thompson May 13 '13 at 15:52
  • 1
    There's a `==` operator in C. It's "fairly standard". You forgot to tell us what's wrong with using it. – Cody Gray - on strike May 26 '13 at 06:16
  • @CodyGray: The problem with `==` is that it returns 1 or 0 but the requirement is to return -1 if the integer pointed at by `a` is less than the integer pointed at by `b`, or +1 for the other way around, or 0 if they are equal — like the comparators for `bsearch()` and `qsort()`. – Jonathan Leffler May 26 '13 at 06:20
  • @ruakh is right! You should specify what you mean for "integers of an arbitrary size". If you are looking for a very generic way, look at the straightforward Python implementation of integers/longs. – Markon May 26 '13 at 10:26
  • Moving comment from an answer removed: I'm designing a middleware API for an interface a bit like bsearch() and was considering the flexibility and relative buffer-overrun safety of providing the size_t argument verses pushing users towards rolling specific intcmpNs (or maybe they'll hack up a global variable for argument passing). Datatypes users deal with are often integers but sometimes can be fixed-length space-padded strings, so I'm leaning towards the third size_t argument so that they can slot in memcmp. – antak May 29 '13 at 03:40

4 Answers4

2

Static inline functions have the advantage of the arguments being evaluated only once (this is hard/impossible to do with macros) . This would allow function calls like int diff = cmp_all (p++, q++, sizeof *p); :

#include <stdlib.h>
#include <stdint.h>

static inline int cmp1(const int8_t *one, const int8_t *two)
{
if (*one < *two) return -1;
else if (*one > *two) return 1;
else return 0;
}

static inline int cmp2(const int16_t *one, const int16_t *two)
{
if (*one < *two) return -1;
else if (*one > *two) return 1;
else return 0;
}

static inline int cmp4(const int32_t *one, const int32_t *two)
{
if (*one < *two) return -1;
else if (*one > *two) return 1;
else return 0;
}

static inline int cmp8(const int64_t *one, const int64_t *two)
{
if (*one < *two) return -1;
else if (*one > *two) return 1;
else return 0;
}

int cmp_all(const void *one, const void *two, size_t size)
{
switch(size) {
case 1: return cmp1(one, two);
case 2: return cmp2(one, two);
case 4: return cmp4(one, two);
case 8: return cmp8(one, two);
default: return 0; /* that will teach them ... */
        }
}
wildplasser
  • 43,142
  • 8
  • 66
  • 109
1

If you really need to good performing comparation of integers of arbitrary sizes I recommend you look at The GNU Multiple Precision Arithmetic Library. That requires you to use it's special mpz_t type (which has the length included). Then you can use the function int mpz_cmp(mpz_t op1, mpz_t op2). Deciding on your own representation of big integers and implement it in way that is both fairly portable and efficient is not trivial.

If, on the other hand, you just need the standard integer sizes you mention I think your implementation is fine. But for even better portability you shouldn't make assumptions on the various integer sizes:

#include <stdint.h>

int intcmp(const void *a, const void *b, size_t size) {
    switch (size) {
    case 1: return (*(int8_t*)a > *(int8_t*)b) - (*(int8_t*)a < *(int8_t*)b)
    case 2: return (*(int16_t*)a > *(int16_t*)b) - (*(int16_t*)a < *(int16_t*)b)
    case 4: return (*(int32_t*)a > *(int32_t*)b) - (*(int32_t*)a < *(int32_t*)b)
    case 8: return (*(int64_t*)a > *(int64_t*)b) - (*(int64_t*)a < *(int64_t*)b)
    }

    assert(0);
    return 0;
}

Perhaps you'll find it better to create a separate function for each length you need instead of using the same for all? And finally, if efficiency is important, it is often less efficient to do arithmetics with char or short than with int. So try to avoid the cases where you need to call this function with char or short and use int instead.

Fabel
  • 1,711
  • 14
  • 36
  • The problem with that is that it invokes undefined behaviour when `a` is a large negative number and `b` is a large positive number, or vice versa. You run into signed arithmetic overflow, which is often benign in practice but is formally undefined behaviour and should be avoided at almost any cost. The names `sint8_t` et al are not defined in `` according to the C standard; the signed types as `int8_t` and the unsigned types are `uint8_t`. I sometimes use the `sintN_t` names, but I know that it is a non-standard extension. – Jonathan Leffler May 26 '13 at 06:58
  • Yes, you are correct. I've made the edits to reflect your comments. – Fabel May 26 '13 at 16:32
0

I think the following link will help. You make the comparison without using comparators keeping your code overhead down a bit. I have made use of the code associated with this link myself in the past.

-Good Hunting-

C program to compare integers without using logical operators?

Community
  • 1
  • 1
  • I would use a variation in production. Setting the print statements aside, I see reusability, and the function fits the task. – Peterson_913 May 13 '13 at 15:45
0

If the call site has the size available, I would prefer to use it as an index in a look-up table there, to just call the proper function right away.

unwind
  • 391,730
  • 64
  • 469
  • 606
  • It's an interesting idea, however I found http://stackoverflow.com/questions/2662442/c-function-pointers-vs-switch and http://stackoverflow.com/questions/2293655/if-switch-and-function-pointers-speed-comparison which suggest there might not be much benefit over just using `switch`. – antak May 26 '13 at 14:53