-6

What kind of method does the compiler use to store numbers? One char is 0-255. Two chars side by side are 0-255255. In C++ one short is 2 bytes, with a range of 0-65535. Now, how does the compiler convert 255255 to 65535, and what happens to the `-` in unsigned numbers?

Luka
  • 1,761
  • 2
  • 19
  • 30

5 Answers

4

The maximum value you can store in n bits (when the lowest value is 0 and the values represented are a continuous range), is 2ⁿ − 1. For 8 bits, this gives 255. For 16 bits, this gives 65535.

Your mistake is thinking that you can just concatenate 255 with 255 to get the maximum value in two chars - this is definitely wrong. Instead, to get from the range of 8 bits, which is 256, to the range of 16 bits, you would do 256 × 256 = 65536. Since our first value is 0, the maximum value is 65535, once again.

Note that a char is only guaranteed to have at least 8 bits and a short at least 16 bits (and must be at least as large as a char).

Joseph Mansfield
  • 108,238
  • 20
  • 242
  • 324
  • `char` is _not_ guaranteed to have 8 bits. I've seen 9 bit `char`, and I've heard that some modern machines have 16 or 32 bit `char`. – James Kanze Jan 16 '14 at 11:34
  • @JamesKanze For some reason I thought `CHAR_BIT` must be exactly 8, even though its in the same "equal or greater" list as `USHRT_MAX`. Thanks. – Joseph Mansfield Jan 16 '14 at 11:37
  • There's also no guarantee (except for `unsigned char`) that all of the bits participate in the value. On at least one machine (still in production), `int` is 48 bits, but only 40 are used. – James Kanze Jan 16 '14 at 11:37
  • How do you think they implemented C and C++ on 36 bit machines. (I think the last 36 bit machine went out of production a few years ago, but they used to be quite common.) – James Kanze Jan 16 '14 at 11:38
  • @JamesKanze But there must be at least enough bits participating in the value to meet the minimum maximum value requirements. – Joseph Mansfield Jan 16 '14 at 11:38
  • Of course. The only case I actually know where not all of the bits participate is a 48 bit tagged architecture, where `int` and `double` have the same representation, with the exponent field being 0 implying `int`. (Overflow of integral arithmetic results in a value that will be treated as a float, even if it is assigned to an `int` variable.) – James Kanze Jan 16 '14 at 12:44
2

When using decimal system it's true that range of one digit is 0-9 and range of two digits is 0-99. When using hexadecimal system the same thing applies, but you have to do the math in hexadecimal system. Range of one hexadecimal digit is 0-Fh, and range of two hexadecimal digits (one byte) is 0-FFh. Range of two bytes is 0-FFFFh, and this translates to 0-65535 in decimal system.

Dialecticus
  • 16,400
  • 7
  • 43
  • 103
2

Decimal is a base-10 number system. This means that each successive digit from right-to-left represents an incremental power of 10. For example, 123 is 3 + (2*10) + (1*100). You may not think of it in these terms in day-to-day life, but that's how it works.

Now you take the same concept from decimal (base-10) to binary (base-2) and now each successive digit is a power of 2, rather than 10. So 1100 is 0 + (0*2) + (1*4) + (1*8).

Now let's take an 8-bit number (char); there are 8 digits in this number, so the maximum value is 255 (2^8 - 1), or another way, 11111111 == 1 + (1*2) + (1*4) + (1*8) + (1*16) + (1*32) + (1*64) + (1*128).

When there are another 8 bits available to make a 16-bit value, you just continue counting powers of 2; you don't just "stick" the two 255s together to make 255255. So the maximum value is 65535, or another way, 1111111111111111 == 1 + (1*2) + (1*4) + (1*8) + (1*16) + (1*32) + (1*64) + (1*128) + (1*256) + (1*512) + (1*1024) + (1*2048) + (1*4096) + (1*8192) + (1*16384) + (1*32768).

Simple
  • 13,992
  • 2
  • 47
  • 47
2

You have got the math wrong. Here's how it really is:

Since each bit can take only one of two states (1 and 0), n bits as a whole can represent 2^n different quantities. When dealing with integers, a standard short of 2 bytes can represent values from 0 to 2^n - 1 (n = 16, so 65535), which are mapped to decimal numbers for computational convenience.
When dealing with 2 characters, they are two separate entities (a string is an array of characters). There are many ways to read the two characters as a whole; if you read them as a string, they are just two separate characters side by side. Let me give you an example.
Remember, I will be using hexadecimal notation for simplicity!
If you have doubts mapping ASCII characters to hex, take a look at this table: ascii to hex

For simplicity, let us assume the characters stored in two adjacent positions are both A.
Now the hex code for A is 0x41 (binary 01000001), so the memory would look like

1st byte ....... 2nd byte
01000001 01000001

If you were to read this from memory as a string and print it out, the output would be
AA

If you were to read the whole 2 bytes as an integer, they would represent

0 * 2^15 + 1 * 2^14 + 0 * 2^13 + 0 * 2^12 + 0 * 2^11 + 0 * 2^10 + 0 * 2^9 + 1 * 2^8 + 0 * 2^7 + 1 * 2^6 + 0 * 2^5 + 0 * 2^4 + 0 * 2^3 + 0 * 2^2 + 0 * 2^1 + 1 * 2^0

= 16705

If unsigned integers were used, then the 2 bytes of data would be mapped to integers between 0 and 65535. But if the same bytes represented a signed value then, though the number of representable values remains the same, the biggest positive number that can be represented would be 32767: the values would lie between -32768 and 32767. This is because not all of the 16 bits can be used for the magnitude; the highest-order bit is left to determine the sign (1 represents negative and 0 represents positive).

You must also note that type conversion (two characters read as a single integer) might not always give you the desired results, especially when you narrow the range (for example, when a double-precision float is converted to an integer).



For more on that, see this answer: double to int
Hope this helps.

Community
  • 1
  • 1
krish
  • 104
  • 1
  • 13
1

It depends on the type: integral types must be stored as binary (or at least, appear so to a C++ program), so you have one bit per binary digit. With very few exceptions, all of the bits are significant (although this is not required, and there is at least one machine where there are extra bits in an int). On a typical machine, char will be 8 bits, and if it isn't signed, can store values in the range [0,2^8); in other words, between 0 and 255 inclusive. unsigned short will be 16 bits (range [0,2^16)), unsigned int 32 bits (range [0,2^32)) and unsigned long either 32 or 64 bits.

For the signed values, you'll have to use at least one of the bits for the sign, reducing the maximum positive value. The exact representation of negative values can vary, but in most machines, it will be 2's complement, so the ranges will be `signed char`: [-2^7, 2^7), etc.

If you're not familiar with base two, I'd suggest you learn it very quickly; it's fundamental to how all modern machines store numbers. You should find out about 2's complement as well: the usual human representation is called sign plus magnitude, and is very rare in computers.

James Kanze
  • 150,581
  • 18
  • 184
  • 329