Q1: The text editor appends EOF to the content of the file, as abcd-1. Is that correct?
Incorrect. EOF is not something that is stored in your file. It is a C-language construct to indicate that all file content has been read, and that the stream is at the end-of-file.
Q2: What happens if the content of the file is ab-1cd?
Irrelevant. There is no EOF character that can be inserted into the file stream.
A file is almost always represented as a sequence of bytes, where a byte is an 8-bit unit which can represent values from 0 (0x00) to 255 (0xFF). This is what we call raw or binary data. Those values are assigned meaning according to the encoding of the file.
For example, the ASCII encoding indicates that the value 65 (0x41) represents the character A, 66 (0x42) represents B, and so on. The ASCII character set does have a number of control codes like 3 ETX (end-of-text), but these are obsolete and have no practical modern meaning.
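To make the Q2 case concrete, here is a minimal sketch that shows the text ab-1cd as the byte values an ASCII file would hold. The helper name bytes_as_hex is made up for this illustration and is not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of `text` into `out` as
   space-separated hex values. `out` must hold at least
   3 * strlen(text) + 1 bytes. */
void bytes_as_hex(const char *text, char *out)
{
    size_t n = strlen(text);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* Each character is stored as one byte; format its numeric value. */
        sprintf(out + 3 * i, "%02X", (unsigned char)text[i]);
        if (i + 1 < n) {
            /* Separate this byte from the next one with a space. */
            out[3 * i + 2] = ' ';
            out[3 * i + 3] = '\0';
        }
    }
}
```

Calling bytes_as_hex("ab-1cd", buf) fills buf with 61 62 2D 31 63 64. The "-1" in the middle is simply the two characters '-' (0x2D) and '1' (0x31), which have nothing to do with the integer value -1.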
A file stored in a filesystem has an intrinsic length which indicates the number of bytes in the file. Thus, the "end of file" occurs after the last byte, indicated by that length.
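As a rough illustration of that intrinsic length, the sketch below asks the filesystem for a file's size with the POSIX stat call, without reading any of the content. It assumes a POSIX system; write_file and file_length are hypothetical helper names chosen for this example.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: creates `path` containing exactly the bytes of
   `content`. Returns 0 on success, -1 on failure. */
int write_file(const char *path, const char *content)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    int failed = (fputs(content, f) == EOF);
    if (fclose(f) == EOF)
        failed = 1;
    return failed ? -1 : 0;
}

/* Hypothetical helper: returns the length in bytes that the filesystem
   records for `path`, or -1 if stat fails. */
long long file_length(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

After write_file("ourfile.txt", "ABC"), file_length("ourfile.txt") reports 3: the length is metadata kept alongside the file, and no terminator byte is stored in the data itself.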
Interestingly (from Wikipedia):
Some operating systems such as CP/M tracked file length only in units of disk blocks and used Control-Z to mark the end of the actual text in the file.
So where does EOF come into play?
EOF is a construct: a macro that <stdio.h> defines as a negative int constant (typically -1). Functions such as fgetc return it to indicate that end-of-file has been reached or that an error occurred; fread signals the same condition by returning fewer items than requested. Remember, though, that the <stdio.h> functions are an abstraction over a lower-level system call interface (e.g. read), whose manual page says:
On success, the number of bytes read is returned (zero indicates end of file)
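On the <stdio.h> side, the same idea can be sketched with fgetc, which returns each byte as a value from 0 to 255 and returns EOF only when nothing more can be read. The function name count_bytes is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in `path` by reading until fgetc
   reports EOF. Returns the count, or -1 if the file cannot be opened or
   a read error occurs. */
long count_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long count = 0;
    int c; /* int, not char: it must hold 0..255 and also EOF */
    while ((c = fgetc(f)) != EOF)
        count++;

    /* EOF is returned for end-of-file and for errors; ferror tells them apart. */
    int had_error = ferror(f);
    fclose(f);
    return had_error ? -1 : count;
}
```

Because the result is stored in an int, a data byte with the value 0xFF is returned as 255 and is counted like any other byte. Storing the result in a char instead is the usual way such a byte ends up being mistaken for EOF.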
Let's consider an ASCII file of size 3 that has the content ABC. In a hex editor, it would look like this:
0000 41 42 43 ABC
Now, we run the following code:
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    int fd = open("ourfile.txt", O_RDONLY);
    if (fd < 0)
        return 1;        // the file could not be opened

    char c;
    read(fd, &c, 1);     // returns 1, c gets 'A'
    read(fd, &c, 1);     // returns 1, c gets 'B'
    read(fd, &c, 1);     // returns 1, c gets 'C'
    read(fd, &c, 1);     // returns 0, c is unmodified

    close(fd);
    return 0;
}
So you see that end-of-file is a state that is indicated, and not an actual data value.