57

We have 8-bit, 16-bit, 32-bit and 64-bit hardware architectures and operating systems. But not, say, 42-bit or 69-bit ones.

Why? Is it something fundamental that makes 2^n bits a better choice, or is it just about compatibility with existing systems? (It's obviously convenient that a 64-bit register can hold two 32-bit pointers, or that a 32-bit data unit can hold 4 bytes.)

Joonas Pulakka
    As you can see from the answers, this situation is a relatively new occurrence. – San Jacinto Oct 22 '09 at 15:14
  • It isn't always a power of 2. [Exotic architectures the standards committees care about](https://stackoverflow.com/q/6971886/995714), [What platforms have something other than 8-bit char?](https://stackoverflow.com/q/2098149/995714) – phuclv Aug 29 '17 at 05:38

18 Answers

34

That's mostly a matter of tradition. It is not even always true. For example, floating-point units in processors (even contemporary ones) have 80-bit registers. And there's nothing that would force us to have 8-bit bytes instead of 13-bit bytes.

Sometimes there is mathematical reasoning behind it. For example, if you decide to have an N-bit byte and want to do integer multiplication, you need exactly 2N bits to store the result. Then you also want to add/subtract/multiply those 2N-bit integers, and now you need 2N-bit general-purpose registers for storing the addition/subtraction results and 4N-bit registers for storing the multiplication results.
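
As a minimal C sketch of that widening requirement, with 32-bit operands standing in for the "N-bit" unit (the specific types are purely illustrative):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* The product of two N-bit values needs up to 2N bits: here N = 32, and
       0xFFFFFFFF * 0xFFFFFFFF = 0xFFFFFFFE00000001, which overflows 32 bits. */
    uint32_t a = UINT32_MAX;
    uint32_t b = UINT32_MAX;
    uint64_t product = (uint64_t)a * b;   /* widen one operand so the multiply happens in 64 bits */
    printf("%llx\n", (unsigned long long)product);
    return 0;
}
```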

sharptooth
    Some of our products are based on a TI DSP that has 40 bit longs. http://focus.ti.com/docs/toolsw/folders/print/sprc122.html – Dolphin Oct 22 '09 at 14:24
26

http://en.wikipedia.org/wiki/Word_%28computer_architecture%29#Word_size_choice

Different amounts of memory are used to store data values with different degrees of precision. The commonly used sizes are usually a power of 2 multiple of the unit of address resolution (byte or word). Converting the index of an item in an array into the address of the item then requires only a shift operation rather than a multiplication. In some cases this relationship can also avoid the use of division operations. As a result, most modern computer designs have word sizes (and other operand sizes) that are a power of 2 times the size of a byte.
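
A small C sketch of that index-to-address point, assuming a hypothetical base address and an 8-byte element size for the power-of-two case (12 bytes for contrast):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uintptr_t base  = 0x1000;   /* hypothetical array base address, for illustration only */
    uintptr_t index = 7;

    /* Element size 8 = 2^3 bytes: the element address is just a shift plus an add. */
    uintptr_t addr_pow2 = base + (index << 3);    /* same as base + index * 8 */

    /* Element size 12 bytes (not a power of two): a genuine multiply is needed. */
    uintptr_t addr_odd = base + index * 12;

    printf("%lx %lx\n", (unsigned long)addr_pow2, (unsigned long)addr_odd);
    return 0;
}
```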

Voytek Jarnot
21

Partially, it's a matter of addressing. Having N bits of address allows you to address at most 2^N units of memory, and hardware designers prefer to make full use of that capability. So you can use 3 bits to address every position of an 8-bit bus, etc...
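
For instance, a minimal C sketch of the idea: 3 select bits distinguish exactly 2^3 = 8 positions, so an 8-bit bus wastes none of the available address states.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t bus = 0xB2;                       /* 1011 0010: an 8-bit value on the bus */

    /* 3 address bits enumerate all 2^3 = 8 bit positions, with no unused codes. */
    for (unsigned sel = 0; sel < 8; sel++) {  /* sel always fits in 3 bits */
        unsigned bit = (bus >> sel) & 1u;
        printf("select %u -> bit %u\n", sel, bit);
    }
    return 0;
}
```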

DVK
    This makes the most sense to me. Having a non-power-of-two number of bits would be a waste of address states. – nobody Oct 22 '09 at 13:09
7

The venerable PDP-10 was 36 bits.

Otávio Décio
    I fail to see how pointing out one specific architecture answers the question "why". – Bryan Oakley Oct 22 '09 at 12:33
    @Joonas at the time, the character size was 6 bits, hence 36 (6 * 6) made sense... This and other insight are covered in the wikipedia article Voytek Jarnot found for us, in this post. – mjv Oct 22 '09 at 12:35
    Most of the PDP's were strange numbers :- http://www.village.org/pdp11/faq.pages/WhatPDP.html It is relevant, because it disproves the initial premise. –  Oct 22 '09 at 12:59
  • The 1953 BESK had 40-bits. http://en.wikipedia.org/wiki/BESK http://www.pbase.com/jakobe/besk_remains – Jonas Elfström Oct 22 '09 at 13:02
  • @JonB perhaps not that strange, they're all multiples of 3, the number of bits in an octal number. – fvu Oct 22 '09 at 13:42
  • @fvu, Yes, I always thought bytes should be 9 bits so octal worked properly. –  Oct 22 '09 at 14:22
  • Actually, on the PDP-10, byte size was variable. A word was 36 bits, and any instructions that accessed bytes included a field to specify the byte size. They considered ASCII to be a 7-bit code then, so you could fit 5 ASCII bytes in a word with one bit left over. Some applications would then use that bit as a flag, for example, the BASIC interpreter used a word to hold the line number, with a maximum of five digits, and then the extra bit was set to 1 to indicate this was a line number. They also used a code they called SIXBIT with (surprise!) six bits per byte, and you could thus ... – Jay Oct 22 '09 at 17:02
  • ... fit six SIXBIT bytes into a word. You could work with binary-coded decimal fairly easily by using 4-bit bytes. Etc. All told I thought that was a pretty cool feature of the machine. – Jay Oct 22 '09 at 17:03
    Totally irrelevant to the present question but thinking of the PDP-10: When I first started using that computer we had 300-baud modems. Then one day we got 1200-baud modems. I remember being incredibly impressed with their speed. I commented to a friend, "Wow, this thing can print faster than you can read it!!" – Jay Oct 22 '09 at 17:05
  • @Jay - First time I got impressed by a modem it was a synchronous 9600 baud, huge, all wired circuit that we used to talk to an IBM mainframe... good times :) – Otávio Décio Oct 22 '09 at 17:32
  • @JonB That the PDP series had word sizes in multiples of 3, and that 3 bits is what is needed to hold an octal digit is no accident. Early PDP series computers had front panel switches for direct data entry. The switches were arranged in groups of three to make conversion from octal to switch settings easy (similarly the status lamps displayed bits in groups of threes). – Stephen C. Steel Sep 08 '10 at 22:39
7

Many (most?) early pre-microprocessor CPUs had some number of bits per word that was not a power of two.

In particular, Seymour Cray and his team built many highly influential machines with non-power-of-two word sizes and address sizes -- 12 bit, 48 bit, 60 bit, etc.

A surprisingly large number of early computers had 36-bit words, entirely due to the fact that humans have 10 fingers. The Wikipedia "36-bit" article has more details on the relationship between 10 fingers and 36 bits, and links to articles on many other historically important but no longer popular bit sizes, most of them not a power of two.

I speculate that

(a) 8 bit addressable memory became popular because it was slightly more convenient for storing 7-bit ASCII and 4 bit BCD, without either awkward packing or wasting multiple bits per character; and no other memory width had any great advantage.

(b) As Stephen C. Steel points out, that slight advantage is multiplied by economies of scale and market forces -- more 8-bit-wide memories are used, and so economies of scale make them slightly cheaper, leading to even more 8-bit-wide memories being used in new designs, etc.

(c) Wider bus widths made a CPU faster in theory, but putting the entire CPU on a single chip made it vastly cheaper and perhaps slightly faster than any previous multi-part CPU system of any bus width. At first there were barely enough transistors for a 4-bit CPU, then an 8-bit CPU. Later, there were barely enough transistors for a 16-bit CPU, to huge fanfare and a "16 bit" marketing campaign. Right around the time one would expect a 24-bit CPU ...

(d) the RISC revolution struck. The first two RISC chips were 32 bits, for whatever reason, and people had been conditioned to think that "more bits are better", so every manufacturer jumped on the 32 bit bandwagon. Also, IEEE 754-1985 was standardized with 32-bit and 64-bit floating point numbers. There were some 24 bit CPUs, but most people have never heard of them.

(e) For software compatibility reasons, manufacturers maintained the illusion of a 32-bit data bus even on processors with a 64-bit front-side bus (such as the Intel Pentium and the AMD K5, etc.) or on motherboards with a 4-bit-wide bus (LPC bus).

David Cary
6

Your memory system wants to be a byte multiple, which makes your cache want to be a byte multiple, which makes your whole system want to be a byte multiple.

Speaking as a HW designer, you generally want to design CPUs to byte boundaries of some kind, i.e. multiples of 8. Otherwise you either have to add a lot of awkward circuitry to a 49-bit system to make use of the bits left over modulo 8, or you end up ignoring the extra bits, in which case they are a waste; that is, unless you need the extra bit for instructions, which is never the case on 16-bit or wider systems.

SDGator
    That is just you thinking of 8-bit bytes as fundamental. They are not; systems using 18-, 24-, and 36-bit machine words used to be common *and* did not present any problems to the hardware designer. – dmckee --- ex-moderator kitten Oct 22 '09 at 14:05
  • I was referring to two different issues. As long as you have enough bits to cover your instruction set, or machine words, you are fine. Those don't need to be byte multiples. After you've satisfied that requirement, then you need to worry about memory addressing. Usually you access memory in bytes, dwords or owords. If you have a non-byte multiple architecture, you'll need some kind of translator to access memory and caches to grab the extra bits, and the addressing math gets weird. I guess my argument still comes down to convention since you can always define a byte+x addressing scheme. – SDGator Oct 22 '09 at 16:58
  • No. Machines that use not-divisible-by-eight-bits words *don't* and *never have* accessed memory in eight bit bytes. The fact that it is only easy to buy memory that accesses in eight bit bytes is a consequence, not a cause. There is nothing fundamental about eight bit bytes. Nothing. – dmckee --- ex-moderator kitten Oct 22 '09 at 19:39
  • You're right...there's nothing fundamental about 8-bit bytes. You can design anything you want. But there's no fundamental reason a commercial company is going to spend the $$'s to bring a product to market that can't talk normally to peripherals, memory, etc. Its pretty much convention now, and there's no sound technical reason to mess with it. Little endian versus big endian is bad enough. – SDGator Oct 23 '09 at 04:33
4

At one time, computer word lengths tended to be a multiple of 6 bits, because computers typically used 6-bit character sets, without support for lower-case letters.

IBM made a high-performance computer, the STRETCH, for Los Alamos, which had a 64-bit word. It had the unusual feature that individual bits in the computer's memory could be directly addressed, which forced the word length to be a power of two. It also had a more extended character set, which allowed mathematical symbols (in addition to lower case) to be included; they were used in a special higher-level language named COLASL.

When IBM came out with the very popular System/360 mainframe, even though it did not have bit addressing, it kept the eight-bit byte, primarily to allow efficient storage of packed decimal quantities at four bits to the decimal digit. Because that machine was so popular, it was very influential, and the PDP-11 computer from DEC was designed with a 16-bit word and 8-bit characters. The PDP-11 was also the first true little-endian machine, and it was also very popular and influential.

But it isn't just because of following fashion. 8-bit characters allow lower-case text, and as computers became cheaper, being able to easily use them for word processing was valued. And just as the STRETCH needed to have a word that had a power of two size in bits to allow bits to be easily addressed, today's computers needed to have a word that was a power-of-two multiple of 8 (which happens to be two to the third power itself) to allow characters to be easily addressed.
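
A short C sketch of that last point, using a 32-bit word (four 8-bit characters) purely as an illustrative assumption: both locating the word that holds a given byte address and pulling a character out of a word reduce to shifts and masks.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t word = 0x44434241;    /* 'A','B','C','D' packed into one 32-bit word, low byte first */
    uint32_t byte_addr = 1002;     /* an arbitrary byte address */

    /* With 4 = 2^2 bytes per word, divide and modulo become a shift and a mask. */
    uint32_t word_index   = byte_addr >> 2;   /* byte_addr / 4 */
    uint32_t byte_in_word = byte_addr & 3u;   /* byte_addr % 4 */

    /* Extracting the k-th character is likewise a shift and a mask. */
    unsigned k = 2;
    int ch = (int)((word >> (k * 8)) & 0xFFu);

    printf("word %u, byte %u, char %c\n",
           (unsigned)word_index, (unsigned)byte_in_word, ch);
    return 0;
}
```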

If we still used 6 bit characters, computers would tend to have 24, 48, or 96 bit words.

3

As others have pointed out, in the early days, things weren't so clear cut: words came in all sorts of oddball sizes.

But the push to standardize on 8-bit bytes was also driven by memory chip technology. In the early days, many memory chips were organized as 1 bit per address. Memory for n-bit words was constructed by using memory chips in groups of n (with the corresponding address lines tied together, and each chip's single data bit contributing one bit of the n-bit word).

As memory chip densities got higher, manufacturers packed multiple chips in a single package. Because the most popular word sizes in use were multiples of 8 bits, 8-bit memory was particularly popular: this meant it was also the cheapest. As more and more architectures jumped onto the 8 bit byte bandwagon, the price premium for memory chips that didn't use 8 bit bytes got bigger and bigger. Similar arguments account for moves from 8->16, 16->32, 32->64.

You can still design a system with 24-bit memory, but that memory will probably be much more expensive than a similar design using 32-bit memory. Unless there is a really good reason to stick at 24 bits, most designers would opt for 32 bits when it's both cheaper and more capable.

Stephen C. Steel
1

Related, but possibly not the reason, I heard that the convention of 8 bits in a byte is because it's how IBM rigged up the IBM System/360 architecture.

Joel
    Really, it comes down to how easy the conversion from binary to hex is, and the smallest useful microcontroller size. A nibble (4 bits) converts very easily to a single hex digit (0-F). But that only gives you 16 instructions. A byte gives you 256 possible instructions while still being easy to convert to hex in your head. – SDGator Oct 22 '09 at 12:36
  • A nibble! Not heard that before. – Joel Oct 22 '09 at 12:56
  • @SDGator: on the old 18, 24, and 36 bit architectures, people used octal instead of hex because *that* fit evenly (which is why c supports decimal, hex, and octal integer expressions). You are mistaking convention for something fundamental. – dmckee --- ex-moderator kitten Oct 22 '09 at 14:06
    My guess is that it is due to binary coded decimal (BCD), i.e. two decimal digits in a byte. Bean counters love decimal numbers, it avoids rounding problems for money. – starblue Oct 22 '09 at 17:13
  • @starblue: There may be something to that notion. – dmckee --- ex-moderator kitten Oct 22 '09 at 19:39
1

A common reason is that you can number your bits in binary. This comes in useful in quite a few situations, for instance in bit-shift or rotate operations. You can rotate a 16-bit value over 0 to 15 bits. An attempt to rotate over 16 bits is also trivial: it's equivalent to a rotation over 0 bits. And a rotation over 1027 bits is equal to a rotation over 3 bits. In general, a rotation of a register of width W over N bits equals a rotation over N modulo W, and the operation "modulo W" is trivial when W is a power of 2.
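
A small C sketch of that observation for a 16-bit register, where "modulo 16" collapses to a bitwise AND with 15 (the helper name rotl16 is just for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* Rotate a 16-bit value left by n bits. Because the width is a power of two,
   reducing n modulo 16 is a single AND with 15 rather than a division. */
static uint16_t rotl16(uint16_t x, unsigned n) {
    n &= 15u;                                  /* 1027 & 15 == 3, so 1027 behaves like 3 */
    if (n == 0) return x;                      /* avoid an out-of-range shift by 16 below */
    return (uint16_t)((x << n) | (x >> (16u - n)));
}

int main(void) {
    printf("%#x\n", rotl16(0x1234, 1027));     /* same result as rotating by 3 bits */
    printf("%#x\n", rotl16(0x1234, 3));
    return 0;
}
```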

MSalters
1

The 80186, 8086, 8088 and "Real Mode" on 80286 and later processors used a 20-bit segmented memory addressing system. The 80286 had 24 native address lines and then the 386 and later had either 32 or 64.
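
For reference, a tiny C sketch of how that 20-bit real-mode address is formed from two 16-bit registers (the example values are the classic reset vector):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Real mode: physical address = segment * 16 + offset, giving a 20-bit result. */
    uint16_t segment = 0xF000;
    uint16_t offset  = 0xFFF0;
    uint32_t physical = ((uint32_t)segment << 4) + offset;
    printf("%05X\n", (unsigned)physical);      /* prints FFFF0, a 20-bit address */
    return 0;
}
```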

Jesse C. Slicer
    friggin' near and far pointers. what a gross way to manage memory. – San Jacinto Oct 22 '09 at 15:12
  • The near/far thing stunk, but given the available hardware designs and historical constraints, choices were limited. – DaveE Oct 22 '09 at 21:31
  • The fact that Intel wanted backward compatibility AT ALL COSTS was, I think, too strict of a constraint. This is what allowed Motorola and Power PC to swoop in with superior, yet incompatible, designs. Mac only switched to Intel once it had an architecture/instruction set that it deemed robust enough to base their computer on. Now, mind you, this is from a technical perspective. From a business perspective, I think they made the right moves to keep their market share significant. – Jesse C. Slicer Oct 23 '09 at 13:54
  • @JesseC.Slicer Source for Apple switching because Intel had a sufficiently robust instruction set? I was under the impression that they had no choice, IBM wasn't really going anywhere with PowerPC in the desktop/laptop area (hence no Powerbook G5), and x86 was the only other architecture that came in appropriate chips. The first few x86 Macs were still 32-bit, so they didn't have any of the 64-bit ISA improvements. – 8bittree Aug 11 '16 at 20:40
1

Another counterexample: the PIC16C8X series microcontrollers have a 14-bit-wide instruction set.

pjc50
  • You beat me to it by seconds! It's worth mentioning that this is a Harvard-architecture processor and the 14-bit words are for instructions, whereas data memory is a standard 8-bit byte. – rmeador Oct 22 '09 at 15:24
1

The byte is related to the encoding of characters, mostly of the Western world, hence 8 bits. The word is not related to encoding; it is related to the width of the address, hence it has varied from 4 to 80 bits, etc.

  • As this is a popular question, perhaps it would be relevant for you to review [how to write a good answer](http://stackoverflow.com/help/how-to-answer). Please add some references and broaden your explanation to the point where it is superior to the existing answers. – Quintin Balsdon Nov 18 '16 at 14:42
  • Western languages are covered with 8 bits (say ISO 8859-1 to -15 or so). Even CJK is encoded with two 8-bit units, i.e., two bytes per character, for encoding (ISO 2022). Whereas the width of a word is referred to as a number of bytes for convenience: UTF-16 and UTF-32 are 16 and 32 bits, termed 2 bytes and 4 bytes. It is all for convenience of understanding, since the byte has become familiar through encoding. – user7178611 Nov 18 '16 at 15:28
0

My trusty old HP 32S calculator was 12-bit.

ndim
0

Because the space reserved for the address is always a fixed number of bits. Once you have defined that fixed address (or pointer) size, you want to make the best of it, so you use all its values up to the highest number it can store. And the number of distinct values you can get from a fixed number of bits (each either 0 or 1) is always a power of two.

0

Maybe you can find something out here: Binary_numeral_system

Shaoshing
0

The ICL 1900 series were all 24-bit (words). Bet there aren't a lot of people who remember these. Do you?

Frank
0

We do have them; just look at PIC microcontrollers.

Szundi