Counting Minimum Number of Bytes Required to Represent a Signed Integer

Question

My task seems simple, I need to calculate the minimum number of bytes required to represent a variable integer (for example if the integer is 5, then I would like to return 1; if the integer is 300, I would like to return 2). Not referring to the data type int which is, as pointed out in comments, always just sizeof(int), I'm referring to a mathematical integer. And I almost have a solution. Here is my code:

int i;
cin >> i;
int length = 0;
while (i != 0) {
    i >>= 8;
    length++;
}

The problem is that this doesn’t work for any negative numbers (I have not been able to determine why) or some positive numbers where the most significant bit is a 0 (because the sign bit is the bit that makes it one bit larger)... Is there any hints or advice I can get in how to account for those cases?

I believe in c++ shifting is undefined on `int` if its negative: https://stackoverflow.com/questions/8415895/is-left-and-right-shifting-negative-integers-defined-behavior — kmdreko, Jan 25 '18 at 00:27
inspecting the int via a char array might be the better option — kmdreko, Jan 25 '18 at 00:30
@vu1p3n0x: I believe right shifting a negative is always defined. Left shift is another story. — Mooing Duck, Jan 25 '18 at 01:00
Start with the definition. How do you define "number of bytes required to store an integer"? Once you have a proper definition, the implementation should naturally follow. Anyway, my guess is, division would work better for this problem than bit shifting, since you are working with numbers and not bit patterns. — Igor Tandetnik, Jan 25 '18 at 01:10
The answer is "int length = sizeof(int);" but I doubt that's what you mean. Define the problem. — Jive Dadson, Jan 25 '18 at 01:24
Count how many bits the absolute value of the number requires. If it was negative, you need 1 more bit. From there you can figure out bytes. — Retired Ninja, Jan 25 '18 at 05:14
The problem as posed makes no sense. You can count the number of bits required to represent a number from a given range. — n. m. could be an AI, Jan 25 '18 at 05:56

score 1 · Answer 1 · answered Jan 25 '18 at 07:12

Stored as a single byte,
Positive numbers are in the range 0x00 to 0x7F
Negative numbers are in the range 0x80 to 0xFF

As 2-bytes,
Positive numbers are in the range 0x0000 to 0x7FFF
Negative numbers are in the range 0x8000 to 0xFFFF

As 4-bytes,
Positive numbers are in the range 0x00000000 to 0x7FFFFFFF
Negative numbers are in the range 0x80000000 to 0xFFFFFFFF

You can use a function like the following to get the minimum size:

int getmin(int64_t i)
{
    if(i == (int8_t)(i & 0xFF))
        return 1;
    if(i == (int16_t)(i & 0xFFFF))
        return 2;
    if(i == (int32_t)(i & 0xFFFFFFFF))
        return 4;
    return 8;
}

Then for example, when you see 0x80, translate it to -128. While 7F is translated to 127, and 0x801 should be translated to a positive number.

Note that this will be very difficult and rather pointless, it should be avoided. This doesn't accommodate storage of numbers in triple bytes, for that, you have to make up your own format.

Ryan Schaefer · Answer 2 · 2018-01-25T07:05:22.390

The range of signed numbers possible to store in x bytes in 2's complement is -2^(8*x-1) to 2^(8*x-1)-1. For example, 1 byte can store signed integers from -128 to 127. Your example would incorrectly calculate that only 1 byte is needed to represent 128 (if we are talking about signed numbers), as right shifting by 8 would equal zero, but that last byte is required to know that this is not a negative number.

For handling negatives, turning it into a positive number and subtracting one (because negative numbers can store an extra value) will allow you to right shift.

int i;
cin >> i;
unsigned bytes = 1;
unsigned max = 128;

if (i < 0) {
    i = ~i; //-1*i - 1
}

while(max <= i) {    
    i >>= 8;
    bytes++;
}
cout << bytes;

Another option is to use __builtin_clz() if you are using gcc. This returns the leading zeros, which you can then use to determine the minimum number of bytes.

You should right-shift `i` instead of left-shifting `max`, to avoid problems with reaching the upper limit of `max`. — Ben Voigt, Jan 25 '18 at 06:11

Counting Minimum Number of Bytes Required to Represent a Signed Integer

2 Answers2