16

This is with reference to the following text from C++ Primer Plus by Stephen Prata:

A byte means a 8-bit unit of memory in the sense of unit of measurement that describes the amount of memory in a computer. However, C++ defines byte differently. The C++ byte consists of at least enough adjacent bits to accommodate the basic character set for the implementation.

Can you please explain: if a C++ compiler has a 16-bit byte whereas the system has an 8-bit byte, how will the program run on such a system?

  • 14
    `A byte means a 8-bit unit of memory` no, it does not (though it IS 8 bit on almost all platforms), –  Nov 06 '15 at 15:35
  • "**[intro.memory]/1** The fundamental storage unit in the C++ memory model is the *byte*. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the *low-order bit*; the most significant bit is called the *high-order bit*. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address." – Igor Tandetnik Nov 06 '15 at 15:39
  • 3
    It would be nice if someone could come up with an example of an actual C++ implementation on a system with `CHAR_BIT != 8`. – Felix Dombek Nov 06 '15 at 15:44
  • 3
    @FelixDombek http://stackoverflow.com/questions/2098149/what-platforms-have-something-other-than-8-bit-char – Igor Tandetnik Nov 06 '15 at 15:46
  • 6
    It's worth noting that POSIX mandates 8-bit byte. – el.pescado - нет войне Nov 06 '15 at 16:03
  • 4
    Commentary: it's also worth noting that C++ Primer Plus is generally considered to be written by someone who doesn't know what he's talking about. (e.g. https://www.quora.com/What-is-the-difference-between-C++-Primer-Plus-and-C++-Primer-for-beginners) – Lightness Races in Orbit Nov 06 '15 at 16:29
  • 2
    @FelixDombek: Texas Instruments' C55x series of fixed-point digital signal processors have word (16-bit) addressing; they can't address with any smaller granularity. Accordingly, a `char` is 16 bits in size. See [page 7-19 of its C compiler user guide](http://www.ti.com/lit/ug/spru281f/spru281f.pdf). – Jason R Nov 06 '15 at 18:14
  • @FelixDombek: Why? C++ is an abstraction over physical machines. That is its purpose. – Lightness Races in Orbit Nov 06 '15 at 21:35
  • @JasonR I'm getting flashbacks of when I had to process 8-bit byte streams on that horrible platform. – Emile Cormier Nov 06 '15 at 22:59
  • @LightnessRacesinOrbit It is interesting to know if this particular abstraction was ever used. The C55x seems to have a C++ compiler, so it has – but strangely, a `long long` is 40 bits according to the manual, I wonder if `sizeof` returns 2.5 there? Seems to be deliberately non-standard anyway. – Felix Dombek Nov 07 '15 at 22:39

3 Answers

16

What the author wants to say about the size of a byte is that, quoting from Wikipedia:

The popularity of major commercial computing architectures has aided in the ubiquitous acceptance of the 8-bit size.

On the other hand, the unit of memory in C++ is the built-in type char. On some implementations a char is not an 8-bit chunk of memory; even so, every sizeof(T) in your C++ program is expressed in multiples of sizeof(char), which is 1 by definition.

The number of bits in a byte for a particular implementation is given by the macro CHAR_BIT, defined in the standard header <climits>. A char is guaranteed to be at least 8 bits wide.

Finally, this is the definition of byte given by the C++ Standard (§1.7, intro.memory) :

The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.

Deduplicator
Paolo M
  • 12
    `in your C++ program every sizeof(T) will be expressed in multiples of sizeof(char)` ... unsurprisingly, since `sizeof(char)==1`, by definition. Any integral number, including `sizeof(T)`, is necessarily a multiple of 1, so the property as stated is not particularly interesting. – Igor Tandetnik Nov 06 '15 at 15:40
  • 3
    @IgorTandetnik You're right, not a big deal. What I actually wanted to point out was that `char` is the unit of measure for the sizes of all other types. – Paolo M Nov 06 '15 at 16:17
6

A byte means a 8-bit unit of memory.

That is incorrect.

However, C++ defines byte differently.

That is also incorrect.

In both C++ terminology and general parlance, a byte is the smallest addressable unit of memory. An 8-bit byte is known as an octet.

Can you please explain: if a C++ compiler has a 16-bit byte whereas the system has an 8-bit byte, how will the program run on such a system?

It won't. If you compile a program for an architecture whose bytes are 16-bit, it will not run on a computer with an architecture whose bytes are 8-bit.

You have to compile for the processor you're using.

Lightness Races in Orbit
  • 2
    That's why the FTP specification talks about octets instead of bytes - when it was being written, not all architectures had standardized on an 8 bit byte. – sashoalm Nov 06 '15 at 15:48
  • 1
    @sashoalm: And they still haven't! – Lightness Races in Orbit Nov 06 '15 at 15:49
  • 3
    I regularly work with modern processors with 16 and 32 bit byte sizes. – Graznarak Nov 06 '15 at 16:19
  • @Graznarak - I think you are talking about words. A byte is the smallest addressable unit of memory. A word is the group of bytes on which the machine instructions act. The instructions on a 16-bit computer work most efficiently on pairs of bytes. The instructions on a 32-bit computer work most efficiently on two pairs, or four bytes. A word used to correspond to the width of the data bus hardware. – Bob Wakefield Nov 06 '15 at 19:42
  • 2
    @BobWakefield I don't know, but I think Graznarak is talking about processors, where the smallest addressable unit really is 16 or 32 bits. Probably DSPs. – hyde Nov 06 '15 at 21:26
  • 1
    Yes, I am dealing with DSPs. – Graznarak Nov 06 '15 at 21:51
3

There used to be machines that had either a variable byte size or a byte size smaller than 8 bits. The spec leaves the size open to the implementation on the given hardware.

The DEC PDP-10 had a 36-bit word size, and you could specify the size of a byte (usually five 7-bit bytes to the word...)

http://pdp10.nocrew.org/docs/instruction-set/Byte.html

Dav3xor