Detection of negative integers using bit operations

Question

One approach to check if a given integer is negative or not, could be this: (using bit operations)

int num_bits = sizeof(int) * 8; //assuming 8 bits per byte!
int sign_bit = given_int & (1 << (num_bits-1)); //sign_bit is either 1 or 0
if ( sign_bit )
{
     cout << "given integer is negative"<<endl;
}
else
{
     cout << "given integer is positive"<<endl;
}

The problem with this solution is that number of bits per byte couldn't be 8, it could be 9,10, 11 even 16 or 40 bits per byte. Byte doesn't necessarily mean 8 bits! Anyway, this problem can be easily fixed by writing,

//CHAR_BIT is defined in limits.h
int num_bits = sizeof(int) * CHAR_BIT; //no assumption.

It seems fine now. But is it really? Is this Standard conformant? What if the negative integer is not represented as 2's complement? What if it's representation in a binary numeration system that doesn't necessitate only negative integers to have 1 in it's most significant bit?

Can we write such code that will be both portable and standard conformant?

Related topics:
Size of Primitive data types
Why is a boolean 1 byte and not 1 bit of size?

@Chris : nothing wrong. just that I wanted to do it "using bit operations". — Nawaz, Jan 15 '11 at 06:31
@Nawaz - I'll be interested if you find something faster than the `<` operator. I'd imagine most compilers will implement the `<` operator as whatever you end up finding, though. But cheers. — Chris Lutz, Jan 15 '11 at 06:33
This is just wrong. The standard does not specify an integer representation. So you are assuming the top bit will help identify negative numbers. PS. Glad you found CHAR_BITS. — Martin York, Jan 15 '11 at 16:37
@Chris: I'm not claiming to be faster. I just wanted to do it "using bit operations" just out of curiosity. – Nawaz 10 hours ago — Nawaz, Jan 15 '11 at 16:58
@Martin : exactly, my point. the standard doesn't specify integer representation! — Nawaz, Jan 15 '11 at 17:00

score 4 · Answer 1 · edited May 23 '17 at 11:47

4

NOTE: C and C++ are different languages. Their respective standards have evolved somewhat independently and they have differing formal restrictions on numeric representation.

Can we write such code that will be both portable and standard conformant?

Assuming that you require a general method of identifying and interpreting a sign bit, I think the answer to your question is No.

Regarding C++: I don't think that the standard has an explicit requirement for the existence of a sign bit. Even if every implementation uses a sign bit, there is no guarantee that it is the first (high-order) bit as your code assumes. Also, the bit could have the opposite interpretation from what you are assuming (a "1" value for the sign bit could mean that the number is positive).

Regarding C99: The language does require a sign bit and requires that sign=1 indicates a negative number (although it might be "negative zero"). However, the language standard does not give you a general way of determining where the sign bit is.

The following code attempts to create a "sign_mask" in a generic way, but it is not absolutely guaranteed to work in C99 or C++. Reasons for failure include those mentioned above, but most interestingly, it could invoke a "trap representation" (such as a parity bit error)...

#ifdef __cplusplus
   #define INT_MAX std::numeric_limits<int>::max()
   #define UINT_MAX std::numeric_limits<unsigned int>::max() 
#endif

// assumes sign bit becomes high-order bit of unsigned
   int sign_mask = INT_MAX ^ UINT_MAX; 

// fallback in case unsigned type doesn't take advantage of the sign-bit
// This might invoke a "trap representation" on some platforms
   if (sign_mask==0) sign_mask = ~INT_MAX;

This Wikipedia article discusses some different ways that signed numbers can be represented in binary: Signed number representations

Here's an informative post about signed numbers in the C99 standard.

edited May 23 '17 at 11:47

Community

1
1

answered Jan 15 '11 at 07:19

Brent Bradburn

51,587
17
154
173

@nober : How could you say `~std::numeric_limits::max();` must return *sign mask*? – Nawaz Jan 15 '11 at 07:38
The result will have all bits set that are not used to represent an unsigned number -- in other words, just the sign bit will be set. This assumes that the max value is one less than a power of 2, which is usually the case, but it is probably not guaranteed by the standard. – Brent Bradburn Jan 15 '11 at 07:52
My point here is that this should solve your problem about the size of a byte -- but it doesn't solve any of the other hypothetical concerns that I mentioned. – Brent Bradburn Jan 15 '11 at 07:56
1

I'm not sure that `given_int & sign_mask` can't be a trap representation due to padding. – AProgrammer Jan 15 '11 at 08:41
1

C **does** have an explicit requirement for the existence of a sign bit. – R.. GitHub STOP HELPING ICE Jan 15 '11 at 13:10
Sign bit cannot have opposite interpretation either. C defines that the all-zero-bits pattern is numeric zero for all integer types, and only allows 3 choices of representation for signed values: sign/magnitude, ones complement, or twos complement. – R.. GitHub STOP HELPING ICE Jan 15 '11 at 18:25
@AProgrammer: Thanks for the comment about "trap representation". I wasn't previously aware of this terminology from the C99 standard. – Brent Bradburn Jan 15 '11 at 23:17
1

I wouldn't say that the C++ and C standard evolved independently as there are strong cross-influence. – AProgrammer Jan 16 '11 at 08:54

score 2 · Answer 2 · answered Jan 15 '11 at 06:27

2

Cast the integer to the corresponding unsigned type and then you don't have to worry about what signed representation is in use. The only remaining issue is that there may be padding bits. Here's a solution with no bit shifting and thus no dependence on width in bits matching the size in bits:

#define IS_NEG(x) ((unsigned_type)x & (unsigned_type)-1-(unsigned_type)-1/2)

answered Jan 15 '11 at 06:27

R.. GitHub STOP HELPING ICE

208,859
35
376
711

2

If you use `uintmax_t` in your casts you won't have to worry about the original type of `x`. – Chris Lutz Jan 15 '11 at 06:32
@Chris: quite true. Hopefully the optimizer will avoid actually expanding to a lot more bits, but I'm not sure... – R.. GitHub STOP HELPING ICE Jan 15 '11 at 06:37
@R : explanation of this solution? Why should it be portable and standard conformant? – Nawaz Jan 15 '11 at 06:42
1

Conversion to an unsigned type is defined as reduction modulo one plus the maximum value of the destination type, aka conversion to twos complement. This applies to the constant `-1`'s as well, giving a result with just the highest bit set after substraction. – R.. GitHub STOP HELPING ICE Jan 15 '11 at 06:58
1

@R.. - I didn't know that before, but I looked it up, and the standard does say that converting to an unsigned type ends up converting to the two's complement representation. Does that mean we can do `const int x = -1; if((unsigned)x == *(unsigned *)&x) { /* two's complement */ } else { /* other representation */ }` (ignoring the obvious part where `*(unsigned *)&x` is undefined behavior)? – Chris Lutz Jan 15 '11 at 07:55
3

The range of unsigned integer may be the same as the one of signed integers, in which case the macro will give the same result for -1 and INT_MAX. – AProgrammer Jan 15 '11 at 08:21
@AProgrammer, you are right I think. R's solution doesn't capture the case that signed and unsigned type have the same number of value bits. Here `(unsigned_type)x` will clash the sign bit which is outside the scope of the value bits of the unsigned type. – Jens Gustedt Jan 15 '11 at 08:35
@Chris Lutz: You can test for 2s complement just with `if (-1&2)` - both sign-magnitude and 1s complement will return 0 for that. – caf Jan 15 '11 at 10:49
@caf - Won't one's complement return 2 for that? -1 is `11111110` and 2 is `00000010`, returning `00000010` by my count. – Chris Lutz Jan 15 '11 at 10:54
@Chris Lutz: Err, yes - I originally had `if (-1&1 && -1&2)`, then changed it for some reason. `if ((-1&3) == 3)` should also work. – caf Jan 15 '11 at 11:13

score 0 · Answer 3 · answered Jan 15 '11 at 06:29

Use something like bit wrapping using circular shift so you can get the final bit under your thumb at the start of the bit string, and then bool neg = n & 1; it or something. Here's some code for bit wrapping:

template <typename T>
inline T rotate_left(T val, unsigned char shift=1)
{
    static const bits = sizeof(T) * CHAR_BIT;
    return (val >> (bits-shift)) | (val << shift);
}

template <typename T>
inline T rotate_right(T val, unsigned char shift=1)
{
    static const bits = sizeof(T) * CHAR_BIT;
    return (val << (bits-shift)) | (val >> shift);
}

// And now for some platform-dependant specializations...

#include <intrin.h>

template<>
inline unsigned char rotate_left(unsigned char val, unsigned char shift=1)
{
    return _rotl8(val, shift);
}

template<>
inline unsigned char rotate_right(unsigned char val, unsigned char shift=1)
{
    return _rotr8(val, shift);
}

template<>
inline unsigned int rotate_left(unsigned int val, unsigned char shift=1)
{
    return _rotl(val, shift);
}

template<>
inline unsigned int rotate_right(unsigned int val, unsigned char shift=1)
{
    return _rotr(val, shift);
}

template<>
inline unsigned long long rotate_left(unsigned long long val, unsigned char shift=1)
{
    return _rotl64(val, shift);
}

template<>
inline unsigned long long rotate_right(unsigned long long val, unsigned char shift=1)
{
    return _rotr64(val, shift);
}

I don't see any discussion in your reply. Is it standard conformant and portable? — Nawaz, Jan 15 '11 at 06:34
Seems like it would be. I don't see any issues regarding this. — Chris Dennett, Jan 15 '11 at 06:36
As noted above: rotate_left(n) and then boolean x = n & 1. If n == 1, it's negative. The template infers the type from the given value, and the default shift is 1. It works for all primitive types. — Chris Dennett, Jan 15 '11 at 07:13
Right shifting signed value is implementation defined. The most common definition will make your definition of rotate left not work -- but that is fixable. Sadly defining shift right for signed and magnitude to touch only the magnitude is an existing practice, and that will prevent fixing your method. Converting to unsigned will not work (unsigned may have only the same range as positive integer). And I've not tried to think about the effect of padding inside integers. — AProgrammer, Jan 15 '11 at 08:36

score 0 · Answer 4 · answered Jan 15 '11 at 08:53

Expanding on R.s method:

template <typename T> is_negative(T v) {
    boost::make_unsigned<T>::type u(v);
    return (u & (1 << std::numeric_limits<T>::digits - 1));
}

If you don't like the usage of boost (make_unsigned is in type_traits), just use your platform's biggest unsigned int type.

Detection of negative integers using bit operations

4 Answers4