
I'm working on a small networking application that uses byte arrays. Traditionally these would be declared with something like `char buf[] = ...`.

This seems to be how it is (still?) done in most tutorials, but it can obscure what is actually happening, for example when you try to print such an array and forget that not every `char` is a printable character.

Some people have suggested that you should stop using `char` altogether and instead use the modern `uint8_t`. I find that very appealing, mostly on the principle that explicit is better than implicit.

So, is there something wrong with declaring these kinds of arrays as `uint8_t buf[] = ...`?
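For concreteness, here is a rough sketch of the two styles I mean (the byte values are just made up for illustration):

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Traditional style: raw bytes stored in a char array. */
    char cbuf[] = { 0x48, 0x00, 0x7f, 0x0a };

    /* The alternative: the same bytes, declared explicitly as unsigned 8-bit. */
    uint8_t ubuf[] = { 0x48, 0x00, 0x7f, 0x0a };

    /* This is the trap I mean: %s treats the bytes as text and stops
       at the first 0x00 byte, so only "H" gets printed. */
    printf("%s\n", cbuf);

    /* Printing byte by byte shows the real contents of either buffer. */
    for (size_t i = 0; i < sizeof ubuf; i++)
        printf("%02x ", (unsigned)ubuf[i]);
    printf("\n");

    return 0;
}
```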

gmolau
  • To me it seems nothing is wrong with either. A `char` is defined as 8 bits (1 byte) on all platforms, just like `uint8_t`. The only difference/problem between unsigned and signed is the sign extension when casting or converting to `int`. – Paul Ogilvie Dec 13 '17 at 16:21
  • @PaulOgilvie - That (`char` is 8 bits) is not guaranteed. Overwhelmingly common, yes, but not guaranteed. – Oliver Charlesworth Dec 13 '17 at 16:24
  • If `char` is not guaranteed to be 8 bits (thanks @OliverCharlesworth), then `char` must be used where characters are required and `uint8_t` where bytes are required (the latter e.g. on a network interface, the former e.g. in character processing). – Paul Ogilvie Dec 13 '17 at 16:28
  • @OliverCharlesworth: But `char` is the "smallest addressable unit" and has at least 8 bits. So *if* there is a `uint8_t` then both have the same size. (Btw, POSIX requires that `CHAR_BIT == 8`.) – Martin R Dec 13 '17 at 16:28
  • If network code depends on `char`'s signedness (or size, or padding), it is bad code anyway. IMHO. – wildplasser Dec 13 '17 at 16:31
  • @wildplasser network protocols are defined in terms of octets; I think it is reasonable for code to depend on `char` being 8-bit – M.M Dec 13 '17 at 16:38
  • Consider looking at: https://stackoverflow.com/questions/9727465/will-a-char-always-always-always-have-8-bits – babon Dec 13 '17 at 16:38
  • @M.M: exactly. But the wording should be: `char` being *at least* 8 bits wide. – wildplasser Dec 13 '17 at 16:40
  • Ok, so I guess I should be reading more comprehensive tutorials then :) Thanks everyone! – gmolau Dec 13 '17 at 16:50
  • `uint8_t` is an optional type. I don't think it can exist if `unsigned char` is not exactly 8 bits wide, because `unsigned char` is the smallest addressable unit, and both `uint8_t` (if it exists) and `unsigned char` have no padding bits. – Ian Abbott Dec 13 '17 at 18:11

1 Answer


Starting with the introduction of the `stdint.h` header in C99, there is no good reason to continue using the `char` type to represent small numbers.

In addition to documenting your intentions better, `uint8_t` gives you one more important advantage: it guarantees that the byte is treated as unsigned. When you use `char` you cannot assume whether it is signed or unsigned, because that is implementation-defined.
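A rough sketch of the kind of surprise this avoids; the exact output of the plain-`char` half depends on the platform, since whether `char` is signed is implementation-defined:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    char    c = (char)0xFF;  /* holds -1 or 255, depending on the platform */
    uint8_t u = 0xFF;        /* always holds 255 */

    /* In both printf calls the value is promoted to int first. */
    printf("c as int: %d\n", (int)c);  /* -1 where char is signed (e.g. x86 Linux), 255 elsewhere */
    printf("u as int: %d\n", (int)u);  /* 255 everywhere */

    /* Comparisons against byte constants go wrong the same way. */
    if (c == 0xFF)
        printf("c == 0xFF\n");
    else
        printf("c != 0xFF\n");         /* what you typically get on x86 */

    return 0;
}
```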

As far as inadvertent printing of the buffer goes, using `uint8_t` is not going to guarantee any protection, because on many platforms it is simply a typedef for `unsigned char`.
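So if that is the mistake you want to guard against, the habit that actually helps is printing the buffer byte by byte instead of handing it to `%s`. A minimal sketch (the `dump_bytes` helper is just something made up for illustration):

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Print any byte buffer as hex, no matter whether it was declared
   as char[], unsigned char[] or uint8_t[]. */
static void dump_bytes(const void *buf, size_t len)
{
    const unsigned char *p = buf;  /* unsigned char may alias any object */
    for (size_t i = 0; i < len; i++)
        printf("%02x ", (unsigned)p[i]);
    putchar('\n');
}

int main(void)
{
    uint8_t buf[] = { 0x01, 0x00, 0xde, 0xad };

    /* printf("%s", buf) would still compile (at most with a pointer-
       signedness warning) and would still stop at the 0x00 byte,
       precisely because uint8_t is usually just unsigned char. */
    dump_bytes(buf, sizeof buf);
    return 0;
}
```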

Sergey Kalinichenko
  • `uint8_t` does have the theoretical problem that it might not be a typedef for a character type, and therefore cannot be used for aliasing other types ... I expect that to be fixed at some point in the future though – M.M Dec 13 '17 at 16:38