11
  1. chars in 'C' are used to represent characters.
  2. Numbers representing characters in all code pages are always positive.

What is the use of having signed characters? Are negative values in chars used only as integral values in an integral data type smaller than int and short? Do they have no other interpretation (the way positive values in chars represent characters)?

Abhijith Madhav

8 Answers

12

chars in 'C' are used to represent characters.

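Not always. chars are also used to represent bytes: char (with its signed and unsigned variants) is the only type in C whose size is fixed by the standard, since sizeof(char) is 1 by definition.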
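To illustrate the "char as byte" use, here is a minimal sketch that walks the bytes of an arbitrary object through an `unsigned char` pointer. The helper names `dump_bytes` and `count_nonzero_bytes` are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Print the object representation of any object, one byte at a time.
   Reading an object through unsigned char * is always permitted. */
void dump_bytes(const void *obj, size_t size) {
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++)
        printf("%02x ", (unsigned)bytes[i]);
    putchar('\n');
}

/* Count how many bytes of an object are nonzero (illustrative helper). */
size_t count_nonzero_bytes(const void *obj, size_t size) {
    const unsigned char *bytes = obj;
    size_t n = 0;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            n++;
    return n;
}
```

Calling `dump_bytes(&x, sizeof x)` on an `int x` prints `sizeof(int)` bytes; the order in which they appear depends on the platform's endianness.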

Drew Dormann
Martin Beckett
5

Only characters of the basic execution character set are guaranteed to be nonnegative (C99, 6.2.5 §3):

An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

You also have to distinguish between the 'plain' char type and the types signed char and unsigned char: signed char and unsigned char are ordinary integer types for which the following holds (C99, 6.2.5 §5):

An object declared as type signed char occupies the same amount of storage as a ‘‘plain’’ char object.

Christoph
  • Further, if `char` is signed on your platform and you read a character with a codepoint greater than `CHAR_MAX` (say a character like æ in ISO-8859-1, which has codepoint `0xE6`), you're quite likely to get a negative char value. – caf Dec 21 '09 at 23:10
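The effect described in the comment above can be sketched as follows. The function names are hypothetical, and the value `-26` assumes an 8-bit, two's complement `signed char`; converting `0xE6` to `signed char` is implementation-defined, so treat this as what typically happens rather than a guarantee.

```c
/* Widen a byte held in a signed char to int: sign extension keeps it negative. */
int widen_signed(signed char c) { return c; }

/* Widen via unsigned char first: the result is always in 0..UCHAR_MAX. */
int widen_unsigned(signed char c) { return (unsigned char)c; }
```

This is also why the `<ctype.h>` functions such as `isalpha` should be passed `(unsigned char)c` rather than a plain `char` that might be negative.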
4

Numbers representing characters in all code pages are always positive.

Erm... wrong!?

From the C99 standard, emphasis mine:

If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative.

It is not guaranteed that all valid characters of all code pages are positive. Whether char is signed or unsigned is implementation-defined!
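One way to find out what a given implementation chose is to look at `CHAR_MIN` from `<limits.h>`: it is `0` when plain `char` is unsigned and equal to `SCHAR_MIN` when it is signed. A small sketch (the function name is made up for illustration):

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 otherwise. */
int plain_char_is_signed(void) { return CHAR_MIN < 0; }
```

Many compilers also let you override the default (for example GCC's `-funsigned-char`), so portable code should not rely on either answer.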

DevSolar
2

From Jack Klein's Home Page:

Signed char can hold all values in the range of SCHAR_MIN to SCHAR_MAX, defined in limits.h. SCHAR_MIN must be -127 or less (more negative), and SCHAR_MAX must be 127 or greater. Note that many compilers for processors which use a 2's complement representation support SCHAR_MIN of -128, but this is not required by the standards.
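The minimum limits quoted above can be written down as compile-time checks. This is only a sketch using C11's `_Static_assert`; on a conforming implementation it should compile without any diagnostics.

```c
#include <limits.h>

/* Minimum ranges the C standard requires of every implementation. */
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
_Static_assert(SCHAR_MIN <= -127, "signed char reaches at least -127");
_Static_assert(SCHAR_MAX >= 127, "signed char reaches at least 127");
_Static_assert(UCHAR_MAX >= 255, "unsigned char reaches at least 255");
```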

From what I can tell, there's no official "meaning" of signed char. However, one thing to be aware of is that all the normal ASCII characters fall in the 0-127 range. Therefore, you can use the signed char type to restrict legal values to the 0-127 range, and define anything less than 0 as an error.

For example, if I had a function that searches some ASCII text and returns the most frequently occurring character, perhaps I might define a negative return value to mean that there are two or more characters tied for most frequent. This isn't necessarily a good way to do things, it's just an example off the top of my head.
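A rough sketch of that idea might look like the following. The function `most_frequent_ascii` and its exact error conventions are hypothetical: here `-1` is returned for a tie, for empty input, and for any byte outside the 0-127 range.

```c
#include <stddef.h>

/* Hypothetical helper: returns the most frequent ASCII character in s,
   or -1 if s is empty, contains a non-ASCII byte, or has a tie for first place. */
signed char most_frequent_ascii(const char *s) {
    size_t counts[128] = {0};
    for (; *s; s++) {
        unsigned char u = (unsigned char)*s;
        if (u > 127)
            return -1;              /* not ASCII */
        counts[u]++;
    }
    int best = -1;
    size_t best_count = 0;
    int tied = 0;
    for (int c = 0; c < 128; c++) {
        if (counts[c] > best_count) {
            best = c;               /* new unique leader */
            best_count = counts[c];
            tied = 0;
        } else if (best_count > 0 && counts[c] == best_count) {
            tied = 1;               /* another character matches the leader */
        }
    }
    return (signed char)((best < 0 || tied) ? -1 : best);
}
```

With this sketch, `most_frequent_ascii("hello")` yields `'l'`, while `most_frequent_ascii("ab")` yields `-1` because of the tie.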

jakeboxer
  • 2
    Very simple: there are three `char` types: `unsigned char`, `signed char` and `char`. The first two are explicit and are used for manipulating the smallest numeric data type. Whether plain `char` is signed or unsigned, however, is implementation-defined. In summary, when sign is significant, add the qualifier. – Thomas Matthews Dec 21 '09 at 17:57
2

Just beware of using plain chars as array indexes.

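    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        char buf[10000];
        unsigned charcount[UCHAR_MAX + 1] = {0}; /* one slot per unsigned char value */
        if (fgets(buf, sizeof buf, stdin) == NULL)
            return 1;
        for (char *p = buf; *p; p++) {
            /* charcount[*p]++;  BOOM if char is signed and *p < 0 */
            charcount[(unsigned char)*p]++; /* safe: index is always 0..UCHAR_MAX */
        }
        return 0;
    }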
pmg
1

It's worth noting that char is a distinct type from both signed char and unsigned char.
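One way to see this, assuming a C11 compiler, is a `_Generic` selection that lists all three types. The association list would be rejected if any two of them were the same type. The names `CHAR_KIND` and `KIND_*` are invented for this sketch.

```c
enum char_kind { KIND_PLAIN, KIND_SIGNED, KIND_UNSIGNED };

/* C11: _Generic picks a branch by the static type of its argument.
   Listing all three types is only legal because they are distinct. */
#define CHAR_KIND(x) _Generic((x), \
    char: KIND_PLAIN, \
    signed char: KIND_SIGNED, \
    unsigned char: KIND_UNSIGNED)
```

Note that the macro needs an operand of character type; a literal such as `'a'` has type `int` in C and would match none of the branches.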

0

In C and C++ chars can be signed or unsigned. A char variable can be used to hold a small integer value. This is useful for several reasons:

  • On small machines, e.g. an 8-bit micro, it might allow more efficient access and manipulation.
  • If you want a large array of small values, say 100K of them, you can save a lot of memory by using an array of chars rather than, e.g., ints.

In C, a character literal is an integer constant of type int. '0' is equal to 48 in ASCII-based character sets.
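Both points can be sketched in a few lines. The element count `N` and the function names are assumptions made for illustration; the actual saving depends on `sizeof(int)` on your platform.

```c
#include <stddef.h>

enum { N = 100000 };  /* hypothetical element count, the "100K" mentioned above */

/* Bytes needed to store N small values as signed char vs. as int. */
size_t bytes_as_char(void) { return N * sizeof(signed char); }
size_t bytes_as_int(void)  { return N * sizeof(int); }

/* Convert a decimal digit character to its numeric value.
   The C standard guarantees '0'..'9' are contiguous, so this works
   even on character sets where '0' is not 48. */
int digit_value(char c) { return c - '0'; }
```

Because a character constant has type `int` in C, `sizeof('0')` equals `sizeof(int)` there, not 1 (C++ differs on this point).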

Richard Pennington
0

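In C, a char (including signed char and unsigned char) is used to store a byte, which the C standard defines as an addressable unit of storage large enough to hold any member of the basic character set, and which is at least 8 bits wide (CHAR_BIT >= 8).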

Having signed and unsigned bytes is as useful as having larger integers. If you're storing a very large number of small numbers (0..255 for unsigned, -127..127 for signed[1]) in an array, you may prefer to use bytes for them rather than, say, short ints, to save space.

Historically, a byte and a text character were pretty much the same thing. Then someone realized there are more languages than English. These days, text is much more complicated, but it is too late to change the name of the char type in C.

[1] -128..127 for machines with two's complement representation for negative numbers, but the C standard does not guarantee that.

  • Actually, the term "byte" is defined nowhere. In C/C++ chars are sequences of bits, and so is any object. But `int` is "the natural size suggested by the architecture of the execution environment" (C++11 §3.9.1/2). So the standard may define the term machine-word, but not machine-byte. `char` is not even the smallest addressable memory unit. For example, to define a 4-bit-char use `struct char4 { unsigned int c : 4; }` – Andreas Spindler Oct 19 '12 at 08:25