How does the STDIN buffer and getchar() pointer change during successive calls?

Question

Given input in the stdin buffer, when successive calls to getchar() are performed, does the pointer move along the memory address of the stdin buffer, allowing getchar() to retrieve the value at each address? If so, once they have been retrieved are the values removed and the pointer then incremented?

Generally my understanding of getchar() in a loop follows this logic:

1. getchar() called
2. stdin buffer checked for input
3. If stdin buffer empty, getchar() sleeps
4. user enters input and awakens getchar()
5. stdin buffer checked again for input
6. stdin buffer not empty
7. getchar() retrieves value at address at the start of the stdin buffer
8. value at address removed from stdin buffer, pointer incremented
9. subsequent calls repeat steps 7-8 until EOF encountered

A similar question was asked before on Stack Overflow, but I had trouble understanding the responses.

@ThomasPadron-McCarthy It looks like he's asking further explanation of points he didn't understand: it's a follow-up, not a duplicate. — edmz, Jul 27 '14 at 15:27
You are using the right words but in the wrong contexts. The "buffer" you are talking about is not an in-memory buffer, it's an abstract stream of characters. Do not confuse it with the real file buffer that may or may not exist in program memory. The "address" in the stream is not a memory address, it's an abstract stream position. The "pointer" is not a `char*` but again a position. It's better not to assign new non-standard meanings to familiar words. — n. m. could be an AI, Jul 27 '14 at 16:31
You seem to be suffering from very much the same delusions that the person who asked the other question was suffering from. Your step 9 is categorically wrong. The buffer may be filled many times (your question implies that it is filled just once), so step 9 should be 'repeat steps 7-8 until the buffer is empty, then go back to 3, unless there is no more input in which case return EOF'. There are still niggles with steps 2-8, but they're addressed in the answers to the other question. — Jonathan Leffler, Jul 28 '14 at 04:15
@n.m. Your explanation and that of the user 'codenheim' seem to differ. Would you be able to explain in more detail why? K&R's book explains text input as a stream of characters, but what exactly does this mean, and where is it stored if not in the STDIN buffer? Specifically, when I enter input on the command line and press return, where does getchar() go to retrieve a character? Thank you. — jma1991, Jul 28 '14 at 08:32
A stream can well be unbuffered, in which case the program will ask the device directly (or via OS if there is one). *If* there is a buffer, *then* it typically contains a small portion of the stream, often a single line. Address and position within this buffer (not visible to the application programmer and thus meaningless and not interesting anyway) are unrelated to the file pointer and its position (accessible by the programmer). When the buffer ends, the program goes to the device for another portion of input. The device may have a buffer of its own. — n. m. could be an AI, Jul 28 '14 at 09:35
OP asked _does the pointer move along the memory address of the stdin buffer, allowing getchar() to retrieve the value at each address_ so I take that to mean he wanted to know about stdio (STDIN) physical implementation. If so, there is, in fact, a real memory buffer. If I misunderstood which buffer the OP meant, then we are indeed talking about 2 different concepts (implementation detail vs the virtual stream pointer). — codenheim, Jul 28 '14 at 09:50
If you modify your 9 steps program so that it loops back to step 1 instead of 7, you will get a more or less accurate description of buffered input. My point is that buffering is optional, and mostly transparent for the application. — n. m. could be an AI, Jul 28 '14 at 09:55
@jma1991 - Regardless of the implementation details (which you can consider part of the operating system library), you do not have to be concerned with its inner workings unless you are just interested. If you can frame your question in terms of what you are trying to accomplish, or in what context you would like to know, it would be easier to answer. If you want to know how to implement stdio, there is a POSIX spec available, and there are source implementations available from AT&T, BSD, Linux and so forth. I studied all of the above when I had to implement stdio. — codenheim, Jul 28 '14 at 09:56
@n.m. - I don't disagree. I am really confused about what the OP needs to know. — codenheim, Jul 28 '14 at 09:57
@codenheim if there is a buffer, then yes. The buffer contains a small portion of the stream, not the entire stream. The pointer will move forward to the end of the buffer, then probably jump back when another portion is read. AFAICT OP implies that the pointer always moves forward, because of the confusion with the "file pointer" (current position) concept. — n. m. could be an AI, Jul 28 '14 at 10:01

codenheim · Accepted Answer · 2014-07-27T16:30:22.253

Generally there is an internal stdio buffer. getchar() may trigger a line read into the buffer, and subsequent calls will usually just increment a pointer until it reaches the end of the current data in the buffer. The implementation usually uses a simple internal char * into an underlying chunk of dynamic memory, with a few pointers and state variable(s).
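As a rough illustration of the buffer-plus-pointer arrangement described above, here is a toy sketch. Everything in it (`toy_stream`, `toy_open`, `toy_fill`, `toy_getc`, `TOY_BUFSIZ`) is a hypothetical name made up for this example, and an in-memory string stands in for the terminal or file. It is not how any particular libc implements getchar(); it only shows the general idea of handing out characters from a buffer and refilling it when the pointer reaches the end.

```c
#include <stdio.h>   /* EOF */
#include <string.h>  /* memcpy, strlen */

#define TOY_BUFSIZ 4  /* deliberately tiny so refills are easy to observe */

/* Hypothetical stream type; real FILE structures differ between libraries. */
typedef struct {
    const char *src;       /* stand-in for the device: input not yet read */
    size_t src_left;       /* bytes still available from the "device" */
    char buf[TOY_BUFSIZ];  /* the in-memory buffer */
    char *pos;             /* next character to hand out */
    char *end;             /* one past the last valid character in buf */
    int refills;           /* how many times the buffer was filled */
} toy_stream;

static void toy_open(toy_stream *s, const char *input) {
    s->src = input;
    s->src_left = strlen(input);
    s->pos = s->end = s->buf;  /* buffer starts out empty */
    s->refills = 0;
}

/* Copy the next chunk from the "device" into the buffer.
   Returns the number of bytes obtained (0 means no more input). */
static size_t toy_fill(toy_stream *s) {
    size_t n = s->src_left < TOY_BUFSIZ ? s->src_left : TOY_BUFSIZ;
    memcpy(s->buf, s->src, n);
    s->src += n;
    s->src_left -= n;
    s->pos = s->buf;           /* the pointer jumps back to the start */
    s->end = s->buf + n;
    if (n > 0)
        s->refills++;
    return n;
}

/* Return the next character, refilling the buffer when it is exhausted. */
static int toy_getc(toy_stream *s) {
    if (s->pos == s->end && toy_fill(s) == 0)
        return EOF;
    return (unsigned char)*s->pos++;  /* nothing is erased; pos just advances */
}
```

With the input "hello\n" and the 4-byte buffer above, the pointer walks over "hell", the buffer is then refilled with "o\n", and the pointer starts again from the beginning of the buffer. Characters are not removed from memory when they are read; they are simply skipped over and later overwritten by the next refill.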

Implementations vary. I don't recall the POSIX standard implying much about the internal implementation of getchar() or stdio streams in general, except that certain operations must be supported.

If I recall, some implementations are unbuffered (I think the DOS compiler I used did not buffer), and there can be multiple standard library implementations for a given OS.
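Buffering is also something a program can ask to change. The standard C function `setvbuf` with the `_IONBF` mode requests an unbuffered stream. A minimal sketch follows; the wrapper name `make_unbuffered` is made up for this example, and whether the request is honored is up to the implementation.

```c
#include <stdio.h>

/* Hypothetical helper: ask the library to stop buffering a stream.
   It should be called before any other operation on the stream.
   Returns 0 on success, nonzero if the request could not be honored. */
int make_unbuffered(FILE *fp) {
    return setvbuf(fp, NULL, _IONBF, 0);
}
```

Calling `make_unbuffered(stdin)` at the start of `main` only affects the stdio layer. A terminal driver may still hold input until the user presses return, which is a separate buffer outside the program.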

It is not uncommon to have two stdio libraries on the same system. For example, sysadmins managing AIX, Solaris, HP-UX, and other non-Linux/BSD UNIX platforms frequently install the GNU stack to get tools like gcc, and that stack includes glibc (GNU libc).

You can download libc/stdio source code online. See glibc, for example.

If it helps, consider that stdio provides peek and unget functionality, and the only way to do that is by an internal buffer between the terminal and the user program.
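To make the unget point concrete: standard C has `ungetc` but no peek function, so peeking is usually written as a read followed by a pushback. The helper name `peek_char` below is an assumption for this sketch, not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: look at the next character without consuming it. */
int peek_char(FILE *fp) {
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);  /* the C standard guarantees one character of pushback */
    return c;
}
```

After `peek_char(stdin)` returns a character, the next getchar() call returns that same character again, which only works because the library keeps it somewhere between the device and the program.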

edited Jul 27 '14 at 16:30

answered Jul 27 '14 at 15:54

codenheim

*[I]t's generally pretty rare nowadays to see more than one* is true in that a given *instance* of a machine uses only one, but it is not at all rare for there to be multiple available instances in embedded systems, one of which is then selected at compile time. Also, it's worth noting that while it's useful to understand how one *might* work, it's unwise to assume that *all* work this way for just the reason that you mentioned: "Implementations vary." – Edward Jul 27 '14 at 16:06
@Edward - Actually I take my original answer back, I used to install the GNU stack on every Solaris/AIX box I administered, which included glibc, resulting in 2 implementations. Solaris also offers the GNU stuff in a standard package set. I will edit my answer. – codenheim Jul 27 '14 at 16:27
@codenheim Your explanation and that of the user 'n.m.' seem to differ. Would you be able to explain in more detail why? Thank you. – jma1991 Jul 28 '14 at 08:33
Based on the part of your question where you asked _does the pointer move along the memory address of the stdin buffer, allowing getchar() to retrieve the value at each address_ I assumed you were asking about the physical buffer which is an implementation detail (as in the source code for a STDIO library). I described it because I wrote an implementation myself from the POSIX spec. If this is what you are asking, then I disagree with the other user comment, in that he has assumed incorrectly which buffer you are referring to (a physical vs a virtual STREAM concept). Only you can clarify. – codenheim Jul 28 '14 at 09:45
