23

Why not 4 bits, or 16 bits?

I assume there are hardware-related reasons, and I'd like to know how 8 bits per byte became the standard.

Peter Cordes
aerin
  • Related on retrocomputing.SE: [What was the rationale behind 32-bit computer architectures?](https://retrocomputing.stackexchange.com/q/19657) / [Last computer not to use octets / 8-bit bytes](https://retrocomputing.stackexchange.com/q/7937) / [Did any computer use a 7-bit byte?](https://retrocomputing.stackexchange.com/q/15512) / [What was the rationale behind 36 bit computer architectures?](https://retrocomputing.stackexchange.com/q/11801). For some of them, packing characters into words (i.e. how to store strings) is a factor, e.g. 3x 6-bit characters in a machine with 18-bit words. – Peter Cordes Jul 28 '22 at 21:50
  • Also related: [What is the hex versus octal timeline?](https://retrocomputing.stackexchange.com/q/12101) - octal was useful on machines where the word size was a multiple of 3, like 18 or 36 bits. – Peter Cordes Jul 28 '22 at 21:53
  • Stack Overflow has a [tag:history] tag, but the usage guidance is *don't* use it: history of programming or computing questions are off topic. It's too late to migrate this now, but similar questions belong on http://retrocomputing.stackexchange.com/ – Peter Cordes Jul 28 '22 at 22:16

1 Answer

21

It's been a minute since I took computer organization, but the relevant Wikipedia article on 'Byte' gives some context.

The byte was originally the smallest number of bits that could hold a single character (I assume standard ASCII). We still use the ASCII standard, so 8 bits per character is still relevant. This sentence, for instance, is 41 bytes. That's easily countable and practical for our purposes.

If a byte had only 4 bits, there would be just 16 (2^4) possible characters, unless we used 2 bytes to represent a single character, which is computationally less efficient. If a byte had 16 bits, it could represent 65,536 (2^16) possible characters, but most of that space would be 'dead space' for a character set as small as ours, and byte-level instructions would move twice as many bits as they need to.
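
Here's a quick Python sketch (my own illustration; the sample sentence is the one from the previous paragraph) to sanity-check those numbers:

```python
# Number of distinct codes that n bits can represent is 2**n.
for bits in (4, 7, 8, 16):
    print(f"{bits} bits -> {2**bits} possible characters")
# 4 bits  -> 16     (not enough for letters plus digits)
# 7 bits  -> 128    (the original ASCII code space)
# 8 bits  -> 256    (ASCII with a spare high bit)
# 16 bits -> 65536  (mostly 'dead space' for a small character set)

# The example sentence really is 41 bytes at one byte per character.
sentence = "This sentence, for instance, is 41 bytes."
print(len(sentence.encode("ascii")))  # 41
```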

Additionally, a byte holds exactly 2 nibbles. Each nibble is 4 bits, the smallest number of bits that can encode a single decimal digit from 0 to 9 (10 different digits, since 2^3 = 8 < 10 <= 16 = 2^4).
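
As a small illustration of the nibble point, here's a sketch (the function names are mine) that packs two decimal digits into one byte, one digit per 4-bit nibble, the way binary-coded decimal (BCD) does:

```python
# Pack two decimal digits (0-9) into one byte: one digit per 4-bit nibble.
def pack_bcd(high_digit: int, low_digit: int) -> int:
    assert 0 <= high_digit <= 9 and 0 <= low_digit <= 9
    return (high_digit << 4) | low_digit

# Split the byte back into its high and low nibbles.
def unpack_bcd(value: int) -> tuple:
    return (value >> 4) & 0xF, value & 0xF

packed = pack_bcd(4, 2)
print(f"{packed:08b}")     # 01000010
print(unpack_bcd(packed))  # (4, 2)
```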

Bango
  • Correction, ASCII uses 7 bits. – Bango Mar 16 '17 at 19:10
  • Except "this sentence" isn't encoded in ASCII. It's encoded in UTF-8. ASCII has very limited and specialized usages. UTF-8 is an encoding for the Unicode character set. All text in HTML, XML, … is Unicode. See the HTTP response header for this page to see that the web server encoded it in UTF-8. (Hit F12, then F5, then select the request name 42842817.) If you consult the HTTP specification, you'll find that the HTTP headers are in fact ASCII. So we do use ASCII every day, but we hardly ever use it in new programs. – Tom Blodget Mar 17 '17 at 01:20
  • Is that why they call it UTF-8? Because it's Using The Full 8-bit byte? haha – Bango Mar 17 '17 at 02:53
  • No. It's called UTF-8 because the code unit is 8 bits. Each code unit provides some of the bits needed for the 21-bit Unicode codepoint. A codepoint requires 1 to 4 UTF-8 code units. Similarly for UTF-16 and UTF-32. However, by design, a codepoint would never need more than one UTF-32 code unit. – Tom Blodget Mar 17 '17 at 16:35
  • @Tom Blodget You are technically right that the encoding is UTF-8. But that's meaningless in this context because UTF-8 is a superset of ASCII. – Pradeep Gollakota Jun 13 '20 at 00:35
  • And something else: why not 7 or 9 bits? Because 8 is a power of 2. – Farshid Ahmadi Nov 11 '20 at 22:40
  • ASCII uses 7 bits. Why are 8 bits used instead of 7? – Amrit Prasad Mar 24 '21 at 13:11
  • @Hitman https://stackoverflow.com/questions/14690159/is-ascii-code-7-bit-or-8-bit – Bango Mar 28 '21 at 23:47
  • ASCII is a 7-bit code, representing 128 different characters. When an ASCII character is stored in a byte, the most significant bit is always zero. Sometimes the extra bit is used to indicate that the byte is not an ASCII character but a graphics symbol; however, this is not defined by ASCII. – Jerry An Jul 20 '21 at 09:23
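
To make the ASCII-versus-UTF-8 points from these comments concrete, here is a short Python sketch (the sample strings are arbitrary): ASCII text encodes to identical bytes under both encodings, each such byte has its most significant bit clear, and non-ASCII codepoints take 2 to 4 UTF-8 code units.

```python
# ASCII text is byte-for-byte identical in ASCII and UTF-8,
# and every one of those bytes has its most significant bit clear (< 0x80).
text = "plain ASCII text"
assert text.encode("ascii") == text.encode("utf-8")
assert all(byte < 0x80 for byte in text.encode("utf-8"))

# Non-ASCII codepoints need 2 to 4 UTF-8 code units (bytes) each.
for ch in ("é", "€", "😀"):
    print(ch, len(ch.encode("utf-8")), "bytes")
# é 2 bytes, € 3 bytes, 😀 4 bytes
```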