3

I'm writing a game server, and this might be an easy question, but I just want some clarification.

Why is it that a byte (char or unsigned char) can hold up to a value of 255 (0xFF, which I believe is 2 bytes)? When I use sizeof(unsigned char) the compiler tells me it is 1 byte. Is it because (in ASCII) it is getting "converted" to a character?

Sorry for this poor explanation, I'm not really good at describing a question.

Guus Geurkink
  • `0xFF` is 2 digits, not 2 bytes. One byte is 0..2^8-1. – Linus Kleen May 27 '11 at 08:37
  • 0xFF is 2 nibbles... 1 byte... – forsvarir May 27 '11 at 08:38
  • You're going to want to read the information about data type size and range [here](http://stackoverflow.com/questions/271076/what-is-the-difference-between-an-int-and-a-long-in-c/271132#271132). `char` could be 8 bits or it could be 64 bits; in any case "byte" is defined to be "char", so a byte could be more than 8 bits. – GManNickG May 27 '11 at 08:40
  • A byte is not 8 bits. A byte is defined in the standard as the size of a `char`, however many bits that is. ISO (and other standards bodies) tend to use the term octet for an 8-bit value. See http://stackoverflow.com/questions/1535131/potential-problem-with-c-standard-mallocing-chars for more info. – paxdiablo May 27 '11 at 08:50

7 Answers

33

This touches on a bunch of subjects, including the historical meaning of a byte, the C definition of a char, and mathematics.

For starters, a byte has historically been a lot of things, but nowadays we nearly always mean an octet, which is 8 bits. As a play on words, there's also the nybble (or often nibble) which is half a byte (not called bite).

Mathematics tells us that with an ordered combination of 8 1-or-0 values, we get 2^8 = 256 combinations. Sometimes we use this unsigned, sometimes signed, but either way we want to have 0 in the range; so the unsigned range is 0..255. For the signed range, we have more options, of which two's complement is the most popular; in that case, we get one more negative value than positive, for a range of -128..+127.
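The arithmetic above can be checked at compile time. This is only a sketch: the helper names (`combinations`, `unsigned_max`, `signed_min`, `signed_max`) are made up for illustration, and the shift is only valid for bit counts from 1 to 63.

```cpp
#include <cstdint>

// Number of distinct values that fit in `bits` bits (hypothetical helper, valid for 1..63).
constexpr std::uint64_t combinations(unsigned bits) {
    return std::uint64_t{1} << bits;
}

// Unsigned range is 0 .. 2^bits - 1 (all bits set).
constexpr std::uint64_t unsigned_max(unsigned bits) {
    return combinations(bits) - 1;
}

// Two's complement range is -2^(bits-1) .. 2^(bits-1) - 1:
// one more negative value than positive.
constexpr std::int64_t signed_min(unsigned bits) {
    return -static_cast<std::int64_t>(combinations(bits - 1));
}

constexpr std::int64_t signed_max(unsigned bits) {
    return static_cast<std::int64_t>(combinations(bits - 1)) - 1;
}

static_assert(combinations(8) == 256, "8 bits give 256 combinations");
static_assert(unsigned_max(8) == 255, "unsigned octet: 0..255");
static_assert(signed_min(8) == -128 && signed_max(8) == 127, "signed octet: -128..127");
```

If the file compiles, the three `static_assert` lines have confirmed the ranges quoted above for an 8-bit value.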

C++ inherits char from C, where it is defined to have a sizeof of 1, to be the smallest addressable size (i.e. having distinct address values with &), and a minimal range of -128..127 or 0..255 depending on if it's signed or not. That boils down to requiring at least 8 bits, or one byte; exactly one byte if the machine supports it.

0xff is another way of writing 255. 0x is the C way of marking a hexadecimal constant, so each digit in it is 4 bits (for 16 possible digits), ergo the nibble. This translates to an unsigned octet with all bits set to 1.
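To make the digit-per-nibble relationship concrete, here is a small sketch that splits an octet into its two hexadecimal digits. The function names `high_nibble` and `low_nibble` are invented for this example, and `std::uint8_t` is assumed to exist (it does on any platform with 8-bit bytes).

```cpp
#include <cstdint>

// Upper four bits of an octet, i.e. the first hex digit.
constexpr std::uint8_t high_nibble(std::uint8_t value) {
    return static_cast<std::uint8_t>(value >> 4);
}

// Lower four bits of an octet, i.e. the second hex digit.
constexpr std::uint8_t low_nibble(std::uint8_t value) {
    return static_cast<std::uint8_t>(value & 0x0F);
}

static_assert(0xFF == 255, "hexadecimal FF and decimal 255 are the same value");
static_assert(high_nibble(0xFF) == 0xF && low_nibble(0xFF) == 0xF,
              "0xFF is two nibbles with every bit set");
static_assert(high_nibble(0xA7) == 0xA && low_nibble(0xA7) == 0x7,
              "each hex digit maps to exactly one nibble");
```

Two hex digits therefore describe 4 + 4 = 8 bits, which is one octet rather than two.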

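If specific size matters to your code, there is a header stdint.h (cstdint in C++) that defines integer types of exact width (such as uint8_t), minimum width (uint_least8_t), and fastest-at-least width (uint_fast8_t), for size or speed optimization.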
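As a rough illustration of those three families, the sketch below measures the width of a few of the fixed-width types. `bit_width_of` is a hypothetical helper, not something the header provides, and the exact-width types are optional in the standard, so this assumes a platform that has them.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: storage width of T in bits (sizeof counts chars, CHAR_BIT is bits per char).
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}

// Exact width: only defined where the platform has such a type.
static_assert(bit_width_of<std::uint8_t>() == 8, "uint8_t is exactly 8 bits");
static_assert(bit_width_of<std::uint16_t>() == 16, "uint16_t is exactly 16 bits");

// Minimum width: the smallest type with at least 8 bits.
static_assert(bit_width_of<std::uint_least8_t>() >= 8, "uint_least8_t has at least 8 bits");

// Fast: at least 8 bits, possibly wider if that is quicker on the target.
static_assert(bit_width_of<std::uint_fast8_t>() >= 8, "uint_fast8_t has at least 8 bits");
```

For data that crosses the network, as in a game server, the exact-width types are usually the ones to reach for, since both ends must agree on the layout.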

Incidentally, ASCII is a 7-bit character set. Machines with 7-bit bytes are unusual nowadays, and wider character sets like ISO 8859-1 and UTF-8 are popular.

Yann Vernier
  • I like how you pointed out the distinction that byte has meant a lot of things over time and context, but octet has always meant exactly 8 bits. However, in discussing C and C++ this statement seems to munge those distinctions a little bit: "That boils down to requiring at least 8 bits, or one byte; exactly one byte if the machine supports it." It is true that in C and C++ that sizeof(char) is always 1 and that the limit requirements on char mean CHAR_BIT >= 8. Essentially, in C and C++ a char IS a byte (not necessarily an octet!), but how many bits are in a byte is only partially constrained. – jschultz410 Jan 25 '23 at 18:07
8

0xFF can be stored in 8 bits, which is one byte.

sizeof(char) is defined to always yield 1, regardless of the actual size in bits of the underlying data type (see 5.3.3.1 of the current standard). The sizes of all other data types are expressed as multiples of the size of a char.
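A short sketch of what "relative to the size of a char" means in practice. `size_in_chars` is a made-up wrapper around `sizeof`, used here only to make the unit explicit.

```cpp
#include <cstddef>

// sizeof counts in units of char, so these hold on every conforming implementation.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(sizeof(signed char) == 1, "signedness does not change the size");
static_assert(sizeof(unsigned char) == 1, "signedness does not change the size");

// Hypothetical helper: how many chars an object of type T occupies.
template <typename T>
constexpr std::size_t size_in_chars() {
    return sizeof(T);
}

// An array of four ints occupies exactly four times the chars of one int.
static_assert(size_in_chars<int[4]>() == 4 * size_in_chars<int>(),
              "array elements are contiguous");
```

What `size_in_chars<int>()` actually yields is implementation-defined (4 is common), which is why nothing here asserts a specific value for it.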

Björn Pollex
5

When I use sizeof(unsigned char) the compiler tells me it is 1 byte.

The size of char (whether it is signed or unsigned) is always 1, as mandated by the C++ Standard.

Prasoon Saurav
4

The size of a char is always 1, but the number of bits in it can differ. C defines the macro CHAR_BIT, which holds the number of bits in a char. This means the maximum value an unsigned char can hold is pow(2, CHAR_BIT) - 1.

More info here: What is CHAR_BIT?
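The formula can be checked against the limits the implementation itself reports. The sketch below uses an integer shift instead of `pow`, which returns a floating-point value. `max_from_bits` is a hypothetical helper and assumes the bit count is smaller than the width of `unsigned long long`, which holds on typical platforms where CHAR_BIT is 8.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: 2^bits - 1, computed with integers.
// Only valid while `bits` is below the width of unsigned long long.
constexpr unsigned long long max_from_bits(unsigned bits) {
    return (1ULL << bits) - 1;
}

static_assert(CHAR_BIT >= 8, "the standard requires at least 8 bits per char");
static_assert(max_from_bits(CHAR_BIT) == UCHAR_MAX,
              "unsigned char uses every bit, so its maximum is 2^CHAR_BIT - 1");
static_assert(std::numeric_limits<unsigned char>::max() == UCHAR_MAX,
              "numeric_limits reports the same maximum as <climits>");
static_assert(std::numeric_limits<unsigned char>::digits == CHAR_BIT,
              "all bits of unsigned char are value bits");
```

On a machine with 8-bit chars this confirms the maximum of 255; on one with wider chars the same assertions would hold with a larger value.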

Yankes
  • An often overlooked edge case, important to be aware of. I lost a pedantic argument with a few people over this one. Importantly, cppreference refers to "CHAR_BIT" as the number of bits in a *byte* – Malachi Jan 26 '20 at 21:01
1

The size of char or unsigned char is 1 byte as per the standard.

Why different ranges if same size?

1 byte = 8 bits, which gives 2^8 possible values
2^8 = 256

Hence,
signed char range is from -128 to 127
unsigned char range is from 0 to 255

This is because in a signed char one of the bits is used to store the sign, whereas an unsigned char cannot be negative, so that bit is used to extend the positive range.
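One way to see this is to read the same 8-bit pattern both ways. The sketch assumes 8-bit chars and two's complement representation; `as_signed` and `as_unsigned` are hypothetical helpers written for this example.

```cpp
#include <cstdint>

// Read an 8-bit pattern as a two's complement signed value:
// patterns with the top bit set stand for value - 256.
constexpr int as_signed(std::uint8_t bits) {
    return bits < 128 ? bits : static_cast<int>(bits) - 256;
}

// Going the other way: conversion to an unsigned type is modulo 2^8.
constexpr std::uint8_t as_unsigned(int value) {
    return static_cast<std::uint8_t>(value);
}

static_assert(as_signed(0x7F) == 127, "top bit clear: same value either way");
static_assert(as_signed(0x80) == -128, "top bit set: the most negative value");
static_assert(as_signed(0xFF) == -1, "all bits set: 255 unsigned, -1 signed");
static_assert(as_unsigned(-1) == 0xFF, "and -1 maps back to the pattern 0xFF");
```

Both types use the same 256 bit patterns; only the interpretation of the patterns 0x80..0xFF differs.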

Alok Save
0

255 (0xFF) fits in one byte when represented as an unsigned char. You cannot represent 255 as an 8-bit signed char, whose maximum is 127.

geofftnz
0

1 byte is 8 bits so in case of

  • signed : (1 bit is used for the sign, leaving 7 bits, and 2^7 = 128) it holds from -128 to 127
  • unsigned : (2^8 = 256 values) it holds from 0 to 255
Ahmed Kotb