8

I am trying to execute the below program.

#include "stdio.h"
#include "string.h" 

void main()
{ 
    char c='\08'; 
    printf("%d",c); 
} 

I'm getting 56 as the output. For any number other than 8, the output is the number itself, but for 8 the answer is 56.

Can somebody explain ?

4 Answers

18

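An escape sequence made of a backslash followed by digits represents a character by its octal value. Octal is the base-8 number system and uses only the digits 0 to 7, and an octal escape holds at most three of them. So '\08' is not a single octal escape, because 8 ∉ [0, 7]: the escape ends after \0 and the 8 is a separate character. The constant therefore contains two characters, and its value is implementation-defined.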

Your compiler probably treats '\08' as two characters, '\0' followed by '8', and combines them so that the result is the value of '8'. In the ASCII table, the decimal value of '8' is 56, which is the output you see.


Thanks to @DarkDust, @GrijeshChauhan and @EricPostpischil.

Maroun
  • 2
    GCC and clang both complain that it's a multi-character constant, so it seems they interpret it character '\0' + '8'. And indeed, '8' == ASCII decimal 56. – DarkDust Aug 12 '13 at 09:45
  • gcc 4.6.3 does not complain. – Oswald Aug 12 '13 at 09:50
  • @Oswald: try `-Wmultichar` (which on my machine seems to be on by default). – DarkDust Aug 12 '13 at 09:54
  • @DarkDust Silly me tested with `'\07'` (no warning); `'\08'` gives a warning. – Oswald Aug 12 '13 at 09:58
  • 2
    Please, whoever downvoted, please explain why so I can learn from my mistakes, this question is really interesting and I really want to know if I'm right/wrong. – Maroun Aug 12 '13 at 10:36
  • http://stackoverflow.com/questions/14264458/strlen-the-length-of-the-string-is-sometimes-increased-by-1/14264498#14264498 – Grijesh Chauhan Aug 12 '13 at 13:03
  • 1
    *`Characters that begins with \ represents octal number `* is a bit wrong; it should be `\0`, you forgot the `0` – Grijesh Chauhan Aug 12 '13 at 13:10
  • further improve `undefined behavior`---->`implementation-defined` – Grijesh Chauhan Aug 12 '13 at 13:57
  • 3
    @MarounMaroun: This is not undefined behavior; it is implementation-defined behavior. Per C 2011 (N1570) 6.4.4.4 1, a character-constant may be a sequence of characters. In `'\08'`, `\0` is recognized as one character and `8` as another. This is called a multibyte character (not a multi-character constant). Per 6.4.4.4 2, it is mapped in an implementation-defined way to a member of the execution character set. – Eric Postpischil Aug 12 '13 at 14:01
  • @EricPostpischil Suppose one compiler supports multi-character constant then is `'abc'` possible? and what can be proper use of multi-character constant? – Grijesh Chauhan Aug 12 '13 at 16:31
  • 2
    @GrijeshChauhan: These are not multi-character constants. They are multibyte characters. A C implementation may support character sets larger than the basic character set. They might include accented characters or letters from different languages or symbols. Multibyte characters are one way of writing source code that represents those characters. – Eric Postpischil Aug 12 '13 at 17:20
8

The value '\08' is considered to be a multi-character constant, consisting of \0 (which evaluates to the number 0) and the ASCII character 8 (which evaluates to decimal 56). How it's interpreted is implementation-defined. The C99 standard (6.4.4.4, paragraph 10) says:

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

So if you assigned '\08' to something bigger than a char, like an int or a long, the whole value would fit. Because you assign it to a char, the value is converted to char and part of it may be "chopped off". Which part survives probably also depends on the implementation and machine. In your case you happen to get the value of the 8 (the ASCII character, which evaluates to the number 56).

Both GCC and Clang do warn about this problem with "warning: multi-character character constant".

DarkDust
  • 1
    For the record: Visual C++ 2010 warns with "warning C4125: decimal digit terminates octal escape sequence", but you must set /W4 (warning level) option, where /W3 is the default. – Wacek Aug 12 '13 at 10:06
4

A backslash followed by octal digits, as in \0, is an octal escape sequence in C/C++. Octal digits run from 0 to 7, so the 8 cannot be part of the escape, and '\08' is a multi-character constant consisting of \0 followed by 8. The compiler interprets it as \0 + '8', which gives '8', whose ASCII value is 56. That's why you are getting 56 as output.

Grijesh Chauhan
Umer Farooq
4

As other answers have said, escapes of this kind represent characters in octal (base 8). This means that you have to write '\010' for 8, '\011' for 9, etc.

There are other ways to write your assignment:

char c = 8;
char c = '\x8'; // hexadecimal (base 16) numbers 
Roddy
tohava