I was writing a program that must read a word from a file, where words are separated from each other by a ' ' or \n (I decided to use the 'read' system call for this), and I ran into an issue.

In particular, according to the manual, 'read' must return 0 when there is nothing left to read (EOF is reached), but in my case it returns \n (I checked the ASCII code of the returned character and it is 10, which is \n; I ran my program a number of times and the result is always the same). Here's the code:

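#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char *read_word_from_file(int fd, int *flag)
{
    int i = 0;
    char ch, *buf = NULL;
    if (!read(fd, &ch, 1)) {    // nothing to read: file is empty or at EOF
        *flag = 1;
        return NULL;
    }
    while (ch != ' ' && ch != '\n') {
        // i + 2: one byte for ch plus one for the terminating '\0'
        if (!(buf = (char *) realloc(buf, i + 2))) goto mem_err;
        buf[i++] = ch;
        if (!(read(fd, &ch, 1))) {
            *flag = 1;
            break;
        }
    }
    // empty word (the first byte was a delimiter): buf is still NULL here
    if (!buf && !(buf = (char *) realloc(NULL, 1))) goto mem_err;
    buf[i] = '\0';
    return buf;
mem_err:
    perror("realloc");
    exit(1);
}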

(The flag variable is used to signal EOF to the outer function, which calls this one.) So, my question is: is such behavior normal, or did I make a mistake somewhere?

P.S. Off-topic question: how do you make part of the text (a single word) shaded like the code samples?

S.I.J
  • Did you try `fdopen` + `fseek` + `fscanf` or `fgetc`? – Mohit Jain Dec 15 '15 at 10:37
  • to answer the offtopic: use backticks, like this `code` word. – Sourav Ghosh Dec 15 '15 at 10:37
  • read() reads a newline because there was a newline in the input. What is the problem? – Jens Dec 15 '15 at 10:39
  • The value in `ch` is *not* the return value. If you want the return value, then you need something like `retval = read(fd, &ch, 1);`. The value 10 in `ch` is there from the previous successful call to `read`. `read` doesn't change the value in `ch` when it fails. – user3386109 Dec 15 '15 at 10:45
  • @user3121023 It returns 0 when there's nothing left to read. – user3386109 Dec 15 '15 at 10:46
  • @user3386109 I printed `ch` after each of the `read` calls and the value does change: I have some letters before EOF in my file, and yet it is '\n' that is stored in `ch` after the last call – Vladislav Kolesnikov Dec 15 '15 at 10:49
  • @Jens I didn't understand what input you are talking about, sorry – Vladislav Kolesnikov Dec 15 '15 at 10:51
  • `read` should be used something like `bytes_read = read(fd, &ch, 1); if (bytes_read > 0) { check ch } else error` – Ritesh Dec 15 '15 at 12:08
  • @VladislavKolesnikov It seems you are confused about where read() returns what information. In `retval = read(fd, buffer, count)` the input characters from the file descriptor are stored in buffer; the return value from read is `retval`. If retval is < 0, an error occurred; if retval is 0, there's nothing more to read; if retval > 0, that many characters were copied to buffer. The retval is never 10 in your case, since you always ask for 1 byte. – Jens Dec 15 '15 at 12:24
  • @Jens I never said that retval is 10, I said that the ASCII code of the character stored in `ch` after the call is 10 – Vladislav Kolesnikov Dec 15 '15 at 15:41
  • @VladislavKolesnikov OK. And I said that this is because there was a newline read from the file descriptor. I don't understand your problem. If read() reads a 10, then there was a 10 in the input. There is nothing surprising happening here. Or do you look at ch even if read() didn't read anything (retval != 1)? You shouldn't. The buffer only contains meaningful content when read returns > 0. Maybe you could tell us how you open the fd and what input you provide? That would be very helpful. – Jens Dec 15 '15 at 15:55
  • @Jens well, I just can't understand why, after the last char in my file (let it be M), 'read' reads 1 character into ch (changing it from M to '\n') and returns 1 (the number of bytes read) when my file doesn't contain any characters after M (including newlines). I open my file simply with the 'open' function and the O_RDONLY flag – Vladislav Kolesnikov Dec 15 '15 at 16:59
  • @VladislavKolesnikov How do you know there's no newline in the file? What is the file size (`ls -l file`)? What is its contents (`od -c file`)? – Jens Dec 15 '15 at 17:30
  • @Jens oh, 'od -c' shows that there is a newline at the end, though when opening the file with gedit it seems that there isn't any. Now can I change my original question to why it is this way? – Vladislav Kolesnikov Dec 15 '15 at 19:04
  • @VladislavKolesnikov Why is what in which way? gedit simply doesn't show newlines in files. Why is there a newline in your file? Depends on how you created that file. If you used for example `puts("M");` then you should understand that puts() adds a newline. If you created it with an editor, you should understand that editors usually write complete lines ending in a newline. – Jens Dec 15 '15 at 19:28
  • @Jens well, if you say it's all because of editors, then it's clear now. Thank you very much – Vladislav Kolesnikov Dec 15 '15 at 19:51
  • @VladislavKolesnikov, you don't see the `\n` character in gedit because it is a **control character**. It's invisible, except for the effect it produces on the terminal (or in the file) to separate lines from each other. But it is there. Other control characters of interest are `\r` (13, carriage return, makes printing continue at the first column of the same line), `\a` (7, bell, makes the terminal emit a tone), `\t` (9, tab, makes printing continue at the next tab stop), etc. See `ascii(7)` for a description of control characters. – Luis Colorado Dec 16 '15 at 09:51
  • Read the [read(2)](http://man7.org/linux/man-pages/man2/read.2.html) man page. It can fail, in which case it returns -1. You should use the returned count, and for efficiency reasons, you'd better read into a buffer of several kilobytes (e.g. 4 KB to 1 MB). Calling `read` once per byte is *very inefficient*. BTW, why can't you use *stdio* functions? They are buffering! – Basile Starynkevitch Oct 26 '17 at 12:25
  • Also, `realloc`ing every time is very inefficient. Consider using some `newsize = 4*oldsize/3+10;` scheme and have some "geometric" progression. – Basile Starynkevitch Oct 26 '17 at 12:29
  • Consider using [getline(3)](http://man7.org/linux/man-pages/man3/getline.3.html) like [here](https://stackoverflow.com/a/9171511/841108). – Basile Starynkevitch Oct 26 '17 at 12:31
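
The advice in the comments above (keep the return value of `read`, treat -1, 0 and a positive count as three separate cases, and grow the buffer geometrically instead of calling `realloc` for every byte) could be sketched roughly as follows. This is only an illustration, not the asker's code: `read_word` and `read_byte` are hypothetical names, the initial capacity of 16 is an arbitrary choice, and the sketch still issues one `read` per byte to stay close to the original.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read one byte, retrying if a signal interrupts the call.
 * Returns 1 on success, 0 at end of file, -1 on error (errno is set). */
static ssize_t read_byte(int fd, char *out)
{
    ssize_t n;
    do {
        n = read(fd, out, 1);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Hypothetical replacement for read_word_from_file().
 * Returns a malloc'd, NUL-terminated word (possibly empty), or NULL if EOF
 * was reached before any byte could be read. *eof is set to 1 whenever
 * read() returned 0, and to 0 otherwise. */
char *read_word(int fd, int *eof)
{
    size_t len = 0, cap = 16;   /* 16 is an arbitrary starting capacity */
    char ch, *buf;
    ssize_t n;

    assert(eof != NULL);
    *eof = 0;

    n = read_byte(fd, &ch);
    if (n < 0) { perror("read"); exit(1); }
    if (n == 0) { *eof = 1; return NULL; }   /* ch was NOT written: ignore it */

    buf = malloc(cap);
    if (!buf) { perror("malloc"); exit(1); }

    while (ch != ' ' && ch != '\n') {
        if (len + 1 >= cap) {                /* keep room for ch and '\0' */
            size_t newcap = 4 * cap / 3 + 10;   /* geometric growth */
            char *tmp = realloc(buf, newcap);
            if (!tmp) { perror("realloc"); exit(1); }
            buf = tmp;
            cap = newcap;
        }
        buf[len++] = ch;

        n = read_byte(fd, &ch);
        if (n < 0) { perror("read"); exit(1); }
        if (n == 0) { *eof = 1; break; }     /* EOF ends the last word */
    }
    buf[len] = '\0';
    return buf;
}
```

The caller owns the returned string and should `free` it. Note that `ch` is only inspected after a call that returned 1, which is the point made repeatedly in the comments: after a return of 0 or -1 the buffer contents are meaningless.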

1 Answer

gedit simply doesn't show newlines in files. Why is there a newline in your file? Depends on how you created that file. If you used for example `puts("M");` then you should understand that `puts()` adds a newline. If you created it with an editor, you should understand that editors usually write complete lines ending in a newline. – Jens
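
To check from C rather than with `od -c` whether a file really ends in a newline, something like the following sketch could be used. The function name `file_ends_with_newline` is made up for this illustration, and the approach assumes a regular, seekable file (it will not work on a pipe or a terminal).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the last byte of the file is '\n',
 * 0 if it is something else or the file is empty, and -1 on any error
 * (file cannot be opened, is not seekable, or cannot be read). */
int file_ends_with_newline(const char *path)
{
    char last;
    int result = -1;
    int fd;
    off_t size;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size = lseek(fd, 0, SEEK_END);            /* file size in bytes */
    if (size == 0) {
        result = 0;                           /* empty file: no newline */
    } else if (size > 0 &&
               lseek(fd, -1, SEEK_END) != (off_t)-1 &&
               read(fd, &last, 1) == 1) {     /* read exactly the last byte */
        result = (last == '\n');
    }

    close(fd);
    return result;
}
```

For a file saved from a typical editor with the single line `M`, this would be expected to return 1, which matches what `od -c` showed in the comments.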

Armali