18

I know both are different types (signed char and char); however, my company coding guidelines specify using int8_t instead of char.

So I want to know why I have to use int8_t instead of the char type. Are there any best practices for using int8_t?

brian beuning
sokid
  • 14
    If I remember correctly, `char` is not guaranteed to be exactly eight bits, it just happens to be that on most modern systems. – Some programmer dude Jul 19 '13 at 10:47
  • 1
    int8_t is platform independent, see stdint.h file in your compiler. – Ishmeet Jul 19 '13 at 10:49
  • 5
    @JoachimPileborg the guarantee for `char` is that it is one byte long. Thus the number of bits in a `char` depends on the number of bits in a byte on the particular system. – triclosan Jul 19 '13 at 10:50
  • 1
    the use of `int8_t` guarantees a variable of size eight bits for any platform. – suspectus Jul 19 '13 at 10:52
  • 4
    @suspectus: Pedantically, for any platform *on which it's defined*. There are platforms that don't support 8-bit types. – Mike Seymour Jul 19 '13 at 10:59
  • 3
    Shouldn't you be asking other people at your company *what the rationale is* behind those guidelines? – jalf Jul 19 '13 at 11:01
  • 1
  • `char` is the platform's memory and I/O type. `int8_t` is an arithmetic, 8-bit signed integer. Those are often similar or even the same, but the *intention* is different. – Kerrek SB Jul 19 '13 at 11:12
  • `int8_t` also guarantees 2's complement. – James Kanze Jul 19 '13 at 11:31
  • @jalf Yes. It sounds very dubious (but it could be justified in some embedded systems by the need to match external hardware to which the system interfaces). – James Kanze Jul 19 '13 at 11:44
  • Sokid: if I use `int8_t` then I prefer the `PRId8` format string, and likewise `PRId16`, `PRId32` for the wider types (see the sketch after these comments). – Grijesh Chauhan Jul 19 '13 at 11:48
  • FWIW: the "usual" convention that I've seen (in many shops) is `char` for character data only, `signed char` when you need to store small integral values, and space is critical, and `unsigned char` for raw memory and all sorts of bit bashing types of things. (Arguably, `unsigned char` would be better for UTF-8 as well: you don't want any negative values, and when you access at the byte level, you'll typically be using some bit manipulations. But of course, `std::string` is `char`, not `unsigned char`, so you're stuck.) – James Kanze Jul 19 '13 at 11:48
  • @JamesKanze that's not quite what I meant. Just that he might as well go straight to the source. They instituted this rule, and presumably they had reasons. To learn what those reasons were, why not just ask them? – jalf Jul 19 '13 at 12:22
  • @jalf I agree (and with Mats Petersson's final comment as well). The first people to ask about the rule are those who instigated it. It's a bad rule _in the context of application development on a general purpose system_. But not everyone works in such a context, and before asking here, he really should find out why they have such a rule, from the people who made the decision. – James Kanze Jul 19 '13 at 12:57
  • If `CHAR_BIT != 8` (i.e., the types `char`, `unsigned char`, and `signed char` are wider than 8 bits) then `int8_t` will not exist. Another point: `int8_t` may be defined either as `signed char` or as plain `char` (which are distinct types even if plain `char` is signed). – Keith Thompson Jun 03 '21 at 03:17
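
Regarding the format-string comment above, here is a minimal sketch (assuming a C99 environment with <inttypes.h>; variable names are illustrative only) of how the fixed-width types pair with the PRI* macros:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    int8_t   s = -5;    /* exactly 8 bits, signed (if the type exists) */
    uint16_t u = 500;   /* exactly 16 bits, unsigned */

    /* The PRId8 / PRIu16 macros expand to the printf length modifier
       that matches the fixed-width type on this platform. */
    printf("s = %" PRId8 ", u = %" PRIu16 "\n", s, u);
    return 0;
}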

4 Answers

20

The use of int8_t is perfectly good for some circumstances - specifically when the type is used for calculations where a signed 8-bit value is required, or for calculations involving strictly sized data [e.g. data defined by external requirements to be exactly 8 bits in the result]. (I used pixel colour levels in a comment above, but that really would be uint8_t, as negative pixel colours usually don't exist - except perhaps in a YUV-type colourspace.)
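
As a rough sketch of that kind of use (the struct and field names here are made up purely for illustration):

#include <stdint.h>

/* A record whose layout is fixed by an external requirement:
   each field must be exactly 8 bits wide. */
struct sample {
    int8_t  temperature_delta;   /* signed: may go below zero */
    uint8_t pixel_level;         /* unsigned: colour level 0..255 */
};

/* Arithmetic on the strictly sized value; the division happens in
   int (after the usual promotion) and is then narrowed back to 8 bits. */
static int8_t halve_delta(int8_t delta)
{
    return (int8_t)(delta / 2);
}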

The type int8_t should NOT be used as a replacement for char in strings. This can lead to compiler errors (or warnings, but we don't really want to have to deal with warnings from the compiler either). For example:

int8_t *x = "Hello, World!\n";

printf(x);

may well compile fine on compiler A, but give errors or warnings for mixing signed and unsigned char values on compiler B - or fail entirely if int8_t isn't defined in terms of a char type at all. That's just like expecting

int *ptr = "Foo";

to compile in a modern compiler...
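
For contrast, a minimal version that sticks to the usual string conventions (and avoids passing the text directly as a format string):

#include <stdio.h>

int main(void)
{
    /* String literals are char data; keep the pointer as char and
       print through an explicit format string. */
    const char *x = "Hello, World!\n";
    printf("%s", x);
    return 0;
}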

In other words, int8_t SHOULD be used instead of char if you are using 8-bit data for calculations. It is incorrect to wholesale replace all char with int8_t, as they are far from guaranteed to be the same.

If there is a need to use char for string/text/etc, and for some reason char is too vague (it can be signed or unsigned, etc), then using typedef char mychar; or something like that should be used. (It's probably possible to find a better name than mychar!)
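
A quick sketch of that typedef approach (the alias name is just the placeholder from above):

#include <stdio.h>
#include <string.h>

/* Alias that says "this is textual data", distinct in intent (though
   not in type) from int8_t used for arithmetic. */
typedef char mychar;

int main(void)
{
    mychar greeting[] = "Hello, World!";

    /* Because mychar is plain char, the string functions and printf
       still accept it without casts or warnings. */
    printf("%s (%zu chars)\n", greeting, strlen(greeting));
    return 0;
}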

Edit: I should point out that whether you agree with this or not, I think it would be rather foolish to simply walk up to whoever is in charge of this "principle" at the company, point at a post on SO and say "I think you're wrong". Try to understand what the motivation is. There may be more to it than meets the eye.

Mats Petersson
  • 1
    I'll take issue with your second sentence. First, _all_ calculations will take place in `int` or larger, because of integral promotion. So you doubtlessly mean when you have to store small signed values in the least space possible. And for this, `signed char` is probably preferable. About the only time you should see `int8_t` in code is when the type has to match some external protocol or hardware. (And pixels should probably be `uint24_t`, with 8 bits per color, but that doesn't usually exist.) – James Kanze Jul 19 '13 at 11:41
  • 1
    And later: if `char` is 16 bits, then `int8_t` will not exist. It must be an addressable type (not a bit field), and `char` is required to be the smallest addressable type (supported in that C/C++ implementation, of course: some machines, such as VAX, supported addressing individual bits). – James Kanze Jul 19 '13 at 11:43
  • Very nice description for my question, thank you Mats Peterson. – sokid Jul 19 '13 at 11:47
  • @JamesKanze Ok, I have amended for your comments. I do think that the essence of my message was quite clear tho': `char` is not a replacement for `int8_t` and vice versa - they should not be used interchangeably. – Mats Petersson Jul 19 '13 at 11:47
  • Thanks for your answer, I understand. My preferred name for "mychar" is "ascii_t" - is that a good name? – sokid Jul 19 '13 at 11:53
  • `ascii_t` may be a better choice - although on an old IBM Mainframe, it would probably be accurate with `ebcdic_t` instead - are you going to have an `#ifdef` around every one of them? ;) – Mats Petersson Jul 19 '13 at 11:57
  • To be horribly pedantic, names ending in `_t` are reserved by POSIX, and shouldn’t be used for user-defined types. I can count the people who adhere to this guideline on one hand, however. – Stephen Canon Jul 19 '13 at 12:03
  • I'd like to add that a "naked" char IS NOT guaranteed to be signed. The standard allows compilers to compile it as either signed or unsigned. IF you want an 8 bit signed integer value, you MUST use int8_t. char should be used if you want an ASCII character. – Luciano Jul 16 '15 at 16:43
14

They simply make different guarantees:

char is guaranteed to exist, to be at least 8 bits wide, and to be able to represent at least all integers between -127 and 127 inclusive (if signed) or between 0 and 255 inclusive (if unsigned).

int8_t is not guaranteed to exist (and yes, there are platforms on which it doesn’t), but if it exists it is guaranteed to be an 8-bit two’s-complement signed integer type with no padding bits; thus it is capable of representing all integers between -128 and 127, and nothing else.

When should you use which? When the guarantees made by the type line up with your requirements. It is worth noting, however, that large portions of the standard library require char * arguments, so avoiding char entirely seems short-sighted unless there’s a deliberate decision being made to avoid usage of those library functions.
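
A small sketch contrasting the two sets of guarantees at runtime (assuming a hosted C99 implementation; the int8_t part only compiles in where the type's macros exist):

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* char always exists; its width and signedness vary per platform. */
    printf("char: %d bits, range %d..%d\n", CHAR_BIT, CHAR_MIN, CHAR_MAX);

#ifdef INT8_MAX
    /* int8_t exists only where an 8-bit two's-complement type with no
       padding bits is available; its range is then exactly -128..127. */
    printf("int8_t: range %d..%d\n", INT8_MIN, INT8_MAX);
#else
    printf("int8_t is not available on this platform\n");
#endif
    return 0;
}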

Stephen Canon
  • Who guarantees two's complement? Reference please? – Kerrek SB Jul 19 '13 at 11:11
  • 7
    7.20.1.1 Exact-width integer types, paragraph 1: “The typedef name intN_t designates a signed integer type with width N, no padding bits, **and a two’s complement representation**.” (Emphasis mine). – Stephen Canon Jul 19 '13 at 11:12
  • Is it possible for a language/platform to have one byte != 8 bits? – Grijesh Chauhan Jul 19 '13 at 11:13
  • 1
    @GrijeshChauhan: Yes, in C a “byte” is defined to be “the size of a `char`”, and a `char` is defined to be the smallest addressable unit of memory. There exist architectures (mostly DSPs) on which the smallest addressable unit of memory is 16, 24, or even 32 bits. – Stephen Canon Jul 19 '13 at 11:15
  • This type probably doesn't exist on a platform where a byte is not 8 bits. Hence it is "not guaranteed to exist" - but if it does, it works in the way described, rather than in the less strictly defined way that `char` or `int` or `long` works in C and C++. (A compile-time guard for the non-8-bit case is sketched after these comments.) – Mats Petersson Jul 19 '13 at 11:16
  • @MatsPetersson Just found a link, I like to share with you: [System where 1 byte != 8 bit?](http://stackoverflow.com/questions/5516044/system-where-1-byte-8-bit) – Grijesh Chauhan Jul 19 '13 at 11:17
  • So you mean my company guidelines are short-sighted? :-( But in some cases you can't guarantee that char is always 8 bits, so that may be a reason they don't want to use the char type - to avoid the confusion. – sokid Jul 19 '13 at 11:19
  • stdint.h has typedefs for all fundamental types; why is there none for 'char'? – sokid Jul 19 '13 at 11:22
  • 1
    If your company is telling you to use `int8_t` for something that is in fact a `char` value for example in a string, yes. If you are using `int8_t` because you want a small integer, to perform calculations (e.g. pixel colour values), then `int8_t` is very much the right thing to do. Note that although it may well compile fine to use `int8_t *x = "Hello, World!\n"; printf(x);` on your platform, it may not work on another. – Mats Petersson Jul 19 '13 at 11:22
  • And `<stdint.h>` does not define ALL types. It defines standard types that aren't part of the compiler itself (such as `char`, `int`, `long`, `float`, etc). – Mats Petersson Jul 19 '13 at 11:24
  • @MatsPetersson: Please do, I’ve already written all I care to on the subject. It’s still too early here. – Stephen Canon Jul 19 '13 at 11:25
  • 1
    @GrijeshChauhan There is at least one relatively modern platform where bytes are 9 bits (and `signed char` is 1's complement); it _has_ a C++ compiler. And I've heard that on some embedded platforms, bytes are 16 or 32 bits. – James Kanze Jul 19 '13 at 11:33
  • @JamesKanze Thanks! When I first read about it I was super surprised! But it's true. :) – Grijesh Chauhan Jul 19 '13 at 11:35
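
A small sketch of how code that genuinely assumes 8-bit bytes can make the non-8-bit platforms discussed above fail loudly at compile time:

#include <limits.h>

/* Refuse to build on platforms where a byte is wider than 8 bits
   (e.g. some DSPs); such platforms typically do not provide int8_t. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif
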
4

int8_t is only appropriate for code that requires a signed integer type that is exactly 8 bits wide and that should not compile if there is no such type. Such requirements are far rarer than the number of questions about int8_t and its brethren indicates. Most size requirements are merely that the type have at least a particular number of bits. signed char works just fine if you need at least 8 bits; int_least8_t also works.
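
A minimal sketch of those "at least 8 bits" alternatives (variable names are illustrative only):

#include <stdint.h>

/* These "at least 8 bits" types always exist in C99, unlike the
   exact-width int8_t, which is optional. */
int_least8_t smallest_storage = -100;  /* smallest type with >= 8 bits */
int_fast8_t  fast_counter     = 0;     /* "fastest" type with >= 8 bits */
signed char  also_fine        = -100;  /* also guaranteed at least 8 bits */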

Pete Becker
1

int8_t is specified by the C99 standard to be exactly eight bits wide, and fits in with the other C99 guaranteed-width types. You should use it in new code where you want an exactly 8-bit signed integer. (Take a look at int_least8_t and int_fast8_t too, though.)

char is still preferred as the element type for single-byte character strings, just as wchar_t should be preferred as the element type for wide character strings.
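
A tiny sketch of that pairing (narrow strings as char, wide strings as wchar_t; the literals are illustrative only):

#include <stdio.h>
#include <wchar.h>

int main(void)
{
    const char    *narrow = "bytes";   /* element type: char */
    const wchar_t *wide   = L"wide";   /* element type: wchar_t */

    /* %s expects a char string, %ls a wchar_t string. */
    printf("%s %ls\n", narrow, wide);
    return 0;
}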

Sneftel