26

I'm reading *The C Programming Language* by Kernighan and Ritchie (second edition), and there's one example where I'm having trouble understanding how things work.

#include <stdio.h>

#define MAXLINE 1000

int getline(char line[], int maxline);
void copy(char to[], char from[]);

int main(int argc, char *argv[])
{
    int len;

    int max;
    char line[MAXLINE];
    char longest[MAXLINE];

    max = 0;
    while((len = getline(line, MAXLINE)) > 1)
    {
        if(len > max)
        {
            max = len;
            copy(longest, line);
        }
    }
    if(max > 0)
        printf("%s", longest);

    getchar();
    getchar();
    return 0;   
}

int getline(char s[], int lim)
{
    int c, i;

    for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if(c == '\n')
    {
        s[i] = c;
        ++i;     
    }
    s[i] = '\0';

    return i;
}

void copy(char to[], char from[])
{
    int i;

    i = 0;
    while((to[i] = from[i]) != '\0')
        ++i;
}

On the line `for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)`, where it says `c = getchar()`, how can an integer hold characters typed at the command line? I understand it for integers, but how are the characters I type being stored?

Thanks in advance

Rhexis
  • See also [what happens if you use `char c = getchar()` instead of `int`](http://stackoverflow.com/questions/35356322/difference-between-int-and-char-in-getchar-and-putchar) – Antti Haapala -- Слава Україні Feb 13 '16 at 09:17
  • Possible duplicate of [Why must the variable used to hold getchar's return value be declared as int?](https://stackoverflow.com/questions/18013167/why-must-the-variable-used-to-hold-getchars-return-value-be-declared-as-int) – phuclv Feb 22 '19 at 02:00
  • *The general rule is to keep the question with the best collection of answers*. The time difference isn't relevant [How should duplicate questions be handled?](https://meta.stackexchange.com/q/10841/230282) – phuclv Feb 22 '19 at 02:48

6 Answers

37

Unlike some other languages you may have used, chars in C are integers. char is just another integer type, usually 8 bits and smaller than int, but still an integer type.

So, you don't need ord() and chr() functions that exist in other languages you may have used. In C you can convert between char and other integer types using a cast, or just by assigning.

Unless EOF occurs, getchar() is defined to return "an unsigned char converted to an int" (same as fgetc), so if it helps you can imagine that it reads some char, c, then returns (int)(unsigned char)c.

You can convert this back to an unsigned char just by a cast or assignment, and if you're willing to take a slight loss of theoretical portability, you can convert it to a char with a cast or by assigning it to a char.
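
For instance, a minimal sketch of that (my own illustration, not from the book or the answer); the printed value 65 assumes an ASCII system:

#include <stdio.h>

int main(void)
{
    char letter = 'A';    /* a character literal actually has type int */
    int  code   = letter; /* plain assignment instead of ord() */
    char back   = code;   /* plain assignment instead of chr() */

    printf("%c has value %d\n", back, code); /* prints "A has value 65" on ASCII systems */
    return 0;
}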

Steve Jessop
  • So in C, chars and integers are the "same", so to speak? And something like `myInt = myChar;` works because of their ASCII values? – Rhexis Aug 19 '11 at 09:31
  • @Flyphe: pretty much, yes. As far as C is concerned, a character is its numeric value. In fact, character literals in C, like `'a'`, have type `int` rather than type `char`. The numeric value doesn't strictly *have* to be ASCII, C implementations are actually allowed to use another encoding like EBCDIC, but it's pretty unlikely you'll ever encounter that. – Steve Jessop Aug 19 '11 at 09:44
  • Note that the char type is a type of its own: the smallest possible integer type, usually 1 byte wide. So it is not only used for storing ASCII letters, it is also commonly used when working with small numbers, 0-255 (unsigned) or -128 to 127 (signed), to save memory. Had you used an int, you would need 2 or 4 bytes instead of 1. – Lundin Aug 19 '11 at 11:59
11

The getchar() function returns an integer which is the representation of the character entered. If you enter the character A, you will get 'A', i.e. 0x41, returned (converted to an int, and assuming you're on an ASCII system, of course).

The reason it returns an int rather than a char is because it needs to be able to store any character plus the EOF indicator where the input stream is closed.
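
A minimal sketch (just an illustration, not the book's example) of the usual idiom this implies: keep getchar()'s result in an int, compare it against EOF, and only then treat it as a character:

#include <stdio.h>

int main(void)
{
    int c; /* int, not char, so EOF can be distinguished from every character */

    while ((c = getchar()) != EOF) /* read until the input stream is closed */
        putchar(c);                /* echo each character back out */

    return 0;
}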

And, for what it's worth, that's not really a good book for beginners to start with. It's from the days when efficiency mattered more than readability and maintainability.

While it shows how clever the likes of K&R were, you should probably be looking at something more ... newbie-friendly.

In any case, the last edition of it covered C89 and quite a lot has changed since then. We've been through C99 and now have C11 and the book hasn't been updated to reflect either of them, so it's horribly out of date.

paxdiablo
  • But then how come, near the end, it prints out the characters typed? – Rhexis Aug 19 '11 at 09:22
  • Ohh, OK. So `s[i] = c;` in the for loop is grabbing each character and storing it in `s`, while still making sure it's not EOF or `\n`. – Rhexis Aug 19 '11 at 09:26
  • Can you recommend a modern C book that's actually any good, in the sense of having something approximating the rigour that we've come to take for granted from the likes of K&R, or from any of 20 or 30 people on SO that frequently answer C questions more correctly than run-of-the-mill textbooks seem to manage? – Steve Jessop Aug 19 '11 at 09:39
  • Recommend a book? No. Seriously, it's been decades since I used one for C. Using SO for specific questions is okay but not so good for structured, serialised learning of a language so I can't really help out there. I know I wouldn't use a book that had `for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)` in it though. Well, at least not for beginners, unless they were someone I really disliked :-) – paxdiablo Aug 19 '11 at 09:49
  • I guess maybe there aren't enough beginners learning C for it to be worth writing a book for them, or anyway not a book that will turn them into pedantic language-lawyers as opposed to passing their university course with non-portable code then forgetting C until they encounter it in the wild. Maybe you either learn Java first, so that you're comfortable with the syntax when you come to learn the C computation model, or else learn an assembler first so that you're comfortable with the computation model when you come to learn the C syntax. Meh. – Steve Jessop Aug 19 '11 at 10:06
  • I mean, I'm not particularly keen on that line either but if you know Java or C# reasonably well, it's not impenetrable even if you're a total C newbie. – Steve Jessop Aug 19 '11 at 10:10
  • There is nothing wrong with the book I'm using; Dennis Ritchie created the C language. Following along with the content is reasonably straightforward. I'm sure over time I will get it all and understand it all. Honestly though, the book is fine for beginners. – Rhexis Aug 19 '11 at 10:14
  • I disagree: the book is _not_ fine for beginners; you should be learning how to write readable, maintainable code, not unnecessarily complex one-liners like that example. A decent compiler will give you the same underlying machine code whether you feed it that monstrosity or the equivalent five lines of decent source code. Let me know how things go at your first code review if you produce something like that :-) – paxdiablo Aug 19 '11 at 12:02
  • I was about to post the very same thing. K&R is very bad for beginners; it should not be used for anything beyond being a reference for quick & dirty syntax lookup, or nostalgia. Learning programming != learning programming language syntax. – Lundin Aug 19 '11 at 13:44
  • Efficiency matters today just as much as it did back then. Most of the high performance computing tasks need to leverage as much efficient computation as possible given the scale of data and expense of performing the computational tasks. But anyhow, that's just my view of the matter. – xbsd Feb 04 '14 at 13:02
  • @xbsd, there are _areas_ where efficiency does still matter but the _major_ problems with most code nowadays is maintainability rather than speed. If I had to maintain the code, I'd rather have it readable even if that meant code that ran 3% slower. Today's insanely optimising compilers aren't the "dumb" beasts of yesteryear :-) – paxdiablo Feb 04 '14 at 13:13
5

The C char type is typically 8 bits, which means it can store one of two ranges of integers, depending on whether it is signed or unsigned (and the C standard does not dictate which it is if you do not specify it): either -128 to 127 or 0 to 255 (256 distinct values either way). getchar() returns int, which will be at least 16 bits (usually 32 bits on modern machines). This means that it can store the whole range of char, as well as more values.

The reason why the return type is int is because the special value EOF is returned when the end of the input stream is reached. If the return type were char, then there would be no way to signal that the end of the stream was encountered (unless it took a pointer to a variable where this condition was recorded).
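
As a small illustration of that point (mine, not part of the answer), consider squeezing the result into a char; the exact behaviour depends on whether plain char is signed or unsigned on your platform:

#include <stdio.h>

int main(void)
{
    int  as_int  = EOF;       /* always negative, typically -1 */
    char as_char = (char)EOF; /* value now depends on char's signedness */

    /* If char is unsigned, as_char is 255 and the comparison below is false,
       so a loop testing a char against EOF would never terminate.
       If char is signed, a genuine input byte of 0xFF also compares equal
       to EOF, so such a loop would stop too early. */
    printf("as_int == EOF: %d, as_char == EOF: %d\n",
           as_int == EOF, as_char == EOF);
    return 0;
}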

klutt
cdhowie
  • The `C` char type is not guaranteed to be signed, and it's not all that unusual for it to be unsigned. For example, that's the default on ARM with gcc and other compilers. – Steve Jessop Aug 19 '11 at 09:32
  • It's not guaranteed to be unsigned either, from my understanding. – cdhowie Aug 19 '11 at 09:33
  • Correct, it's not guaranteed to be unsigned either. It is guaranteed to have the same representation as one of `signed char` and `unsigned char`, so its range is the same as one of those. You just can't say for sure that it's -128 to 127. You can't say for sure it's 8 bits, either, but it's very fashionable for it to be 8; the exceptions are ancient 9-bit mainframes and some DSP chips that have 16- or 32-bit chars. – Steve Jessop Aug 19 '11 at 09:33
  • You don't need to send a pointer, there's feof() and ferror(), but unfortunately no fast macro versions of them. Expanding characters to int only to be able to squeeze in a special EOF return value seems like a really bad optimization choice, making C more complicated and restricted. Like NULL and '\0'-terminated strings. C++ fixed the EOF/char/int mistake: you test the stream, not the "character". – potrzebie Jun 09 '14 at 15:04
  • The C `char` type is not guaranteed to be exactly 8 bits. It is guaranteed to be *at least* 8 bits, but `CHAR_BIT` may not be exactly 8. – Govind Parmar Feb 22 '19 at 17:47
0

Now let's play a game of logic.

char is also an integer type, with a smaller range than int: typically 8 bits, that is, 1 byte. As we all know, integer types come in signed and unsigned flavours. For an 8-bit char, the signed range is -128 to 127 and the unsigned range is 0 to 255. Now we know the type and "capability" of signed and unsigned char.

We humans understand characters, while the computer recognizes only binary sequences. Thus every programming language must provide a model for converting between characters and binary values. ASCII is the standard mapping used by C and many other programming languages; it encodes the basic characters such as 0-9, a-z and A-Z, as well as the usual special ones, and getchar() hands you each input byte as a value in the range 0 to 255.

You may wonder why unsigned char isn't then the obvious choice. However, the program also has to know when to stop reading. The simplest way is to reserve a special value, and a negative one is a good choice, since the non-negative byte values may all be taken by real characters. C implementations typically use -1, which is better known by its macro name, EOF.

Now we've got the point: signed char cannot hold every possible byte value, while unsigned char leaves no room for the termination value. We need a larger range to cover both, and that is the int type. Savvy?

Thanks to @cdhowie for his answer; it actually enlightened me.
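
If it helps, here is a tiny sketch (my addition, not part of the original answer) that prints the ranges involved; the exact numbers are implementation-defined, but on a typical 8-bit-char system you would see -128..127, 0..255, and a negative EOF:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("signed char:   %d .. %d\n", SCHAR_MIN, SCHAR_MAX); /* e.g. -128 .. 127 */
    printf("unsigned char: 0 .. %u\n", (unsigned)UCHAR_MAX);   /* e.g. 0 .. 255 */
    printf("EOF:           %d\n", EOF);                        /* negative, typically -1 */
    return 0;
}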

Rick
-1

Every character (including digits) entered on the command line is read as a character, and every character has an integer value based on its ASCII code: http://www.asciitable.com/
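
For example, a small sketch (my illustration, not part of the answer) that echoes each character you type together with its numeric code; the specific codes printed assume an ASCII system:

#include <stdio.h>

int main(void)
{
    int c;

    while ((c = getchar()) != EOF && c != '\n')
        printf("'%c' has code %d\n", c, c); /* e.g. "'A' has code 65" on ASCII systems */

    return 0;
}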

Matt
  • C standard doesn't actually guarantee that ASCII is the character set used by the implementation, although you'd go a long way, possibly to a museum, to find a C implementation where it isn't. – Steve Jessop Aug 19 '11 at 09:30
  • Yes but for simplicity I figured that was assumed – Matt Aug 19 '11 at 09:33
  • @Steve: err, we work on machines everyday that use EBCDIC. In fact, I'd warrant that every single one of your bank transactions ends up at such a machine. The venerable System z mainframes, still running the finances of the planet after so many decades :-) – paxdiablo Aug 19 '11 at 09:41
  • @paxdiablo: excellent, didn't realise. Hopefully, they don't let people just walk in the door and start programming those specific ones running the banking system. My point is just that you have to make an effort to get away from ASCII, it probably won't happen by accident while you're not looking. – Steve Jessop Aug 19 '11 at 09:53
-3

Your question has already been answered, but I'll just add one more thing.

Since you declare the variable `c` as an `int`: if the characters you read are the digits 0 to 9, they have ASCII values 48-57, so you can convert one to its numeric value by adding one more line to the code:

c = c - 48;
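
A slightly more portable variant (my note, not part of the original answer) uses the character constant '0' instead of the magic number 48; the C standard guarantees that the digits '0' through '9' are contiguous, so this works even on non-ASCII systems:

#include <stdio.h>

int main(void)
{
    int c = getchar();

    if (c >= '0' && c <= '9') {
        int digit = c - '0'; /* same as c - 48 on ASCII, but self-documenting */
        printf("digit value: %d\n", digit);
    }
    return 0;
}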

Yogeesh Seralathan
  • 1,396
  • 4
  • 15
  • 22