I wanted to read the content of a file using the read() function. I tried the following:

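#include <fcntl.h>   // open(), O_RDONLY
#include <stdio.h>   // printf()
#include <unistd.h>  // read(), close()

#define BUFFER_LENGTH (1024)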

char buffer[BUFFER_LENGTH];

// The first version of the question had a typo:
// void read_file(const char filename)
// This would produce a compiler warning.
void read_file(const char *filename)
{
    ssize_t read_bytes = 0;

    // The first version had the mode in hex instead of octal.
    //
    //     int fd_in = open(filename, O_RDONLY, 0x00644);
    //
    // This does not cause problems here but it is wrong.
    // The mode is now octal (even if it is not needed).
    int fd_in = open(filename, O_RDONLY, 0644);
    if (fd_in == -1)
    {
        return;
    }

    do
    {
        read_bytes = read(fd_in, buffer, (size_t) BUFFER_LENGTH);
        printf("Read %ld bytes\n", (long) read_bytes);

        // End of file or error.
        if (read_bytes <= 0)
        {
            break;
        }
    } while (1);

    close(fd_in);
}

I am using 'gcc (GCC) 3.4.2 (mingw-special)' on a Windows 7 system.

The strange behaviour I get is that not all the content is read. For example, I have a file

05.01.2012  12:28            15.838 hello.exe

and when I try to read it I get:

Read 216 bytes
Read 0 bytes

As far as I know, read() should keep reading until it reaches the end of the file. Why does it report an end of file (0) the second time it is called?

Maybe I am missing something obvious but I cannot see it. I have read this document and this document over and over again and I cannot find what I am doing wrong. Does anyone have any clue?

EDIT

Thanks for the hint! It is a typo in the question (I have corrected it). It is correct in the source code.

Giorgio
  • Give us a hexdump of hello.exe (first 220 bytes) - I suspect byte 217 to be EOF (26, 0x1A) in which case you would need to look at your open mode - O_NOTRANS maybe – Eugen Rieck Jan 05 '12 at 13:47
  • One minor point: you don't need the `0x00644` here, which is only used if the file is created. And if you did need it, it should be in octal, not hex: `0644`. – Joseph Quinsey Jan 05 '12 at 13:53
  • 0x0644 is hex, corresponding to 03104 - most likely not the mode you want to open the file in. Since you are opening the file read-only, you can simply omit the mode altogether, as it is only relevant if *creating* a file. That's not the reason for your problem, but likely a reason for *later* problems if you don't break the habit of confusing hex and octal. ;-) – DevSolar Jan 05 '12 at 13:56
  • Try to inspect the [`errno`](http://linux.die.net/man/3/errno) variable after the `read` call. – actual Jan 05 '12 at 14:06
  • The only bug in the code snippet above is the third parameter of the `open()` - it is redundant in this case and also it is invalid (obviously had in mind 0644, not 0x644). But it is not critical in this case and your code works fine. – praetorian droid Jan 05 '12 at 14:17
  • @Eugen Rieck: Yes! That's the problem: byte 217 is 0x1A. – Giorgio Jan 05 '12 at 14:26
  • I will give this as an answer – Eugen Rieck Jan 05 '12 at 14:28
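
Several comments above point out that 0x00644 is a hexadecimal literal, not the octal 0644 that was presumably intended. A minimal sketch of the difference follows; the helper name is_plain_permission_mode is hypothetical and exists only for this illustration.

```c
/* Hypothetical helper, not part of any library: returns 1 if `mode` uses only
   the nine rwxrwxrwx permission bits (octal 0777), and 0 otherwise. */
int is_plain_permission_mode(unsigned int mode)
{
    return (mode & ~0777u) == 0;
}
```

Under these assumptions, is_plain_permission_mode(0644) returns 1, whereas is_plain_permission_mode(0x644) returns 0, because 0x644 equals octal 03104 and therefore sets bits outside the permission range.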

4 Answers

9

I suspect byte 217 is the EOF character (26, 0x1A). On Windows, files can be opened in "text" or "binary" mode, and in text mode a 0x1A byte is interpreted as end of file.

You would need to look at your open mode and add O_BINARY. This is also why, in PHP, you must fopen with mode "rb" (read binary) and not "r" (which defaults to read text).

http://www.mingw.org/wiki/FAQ says the flag is O_BINARY (near bottom of page), so you'd need

int fd_in = open(filename, O_RDONLY | O_BINARY, 0644);

http://cygwin.com/faq.html paragraph 5.3 tells you how to handle this in cygwin
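
Putting the pieces together, a possible sketch of a read loop that opens the file in binary mode is shown below. The function name count_bytes_binary is hypothetical, and the fallback definition of O_BINARY as 0 is an assumption that lets the same source compile on systems that have no text mode and therefore do not define the flag. The mode argument is omitted because O_CREAT is not used.

```c
#include <fcntl.h>
#include <unistd.h>

/* Assumption: where the toolchain does not define O_BINARY (systems without
   a text mode), treat it as a no-op flag. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

#define BUFFER_LENGTH 1024

/* Hypothetical helper: reads the whole file in binary mode and returns the
   total number of bytes read, or -1 if open() or read() fails. */
long count_bytes_binary(const char *filename)
{
    char buffer[BUFFER_LENGTH];
    long total = 0;
    ssize_t n;

    int fd = open(filename, O_RDONLY | O_BINARY);
    if (fd == -1)
    {
        return -1;
    }

    /* read() returns 0 at end of file and -1 on error. */
    while ((n = read(fd, buffer, sizeof buffer)) > 0)
    {
        total += (long) n;
    }

    close(fd);
    return (n < 0) ? -1 : total;
}
```

With the file opened this way, a 0x1A byte in the data should be returned like any other byte, so the total is expected to match the size of the file.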

Eugen Rieck
  • Yes, that's definitely the problem. I had assumed that per default open() opens a file in binary mode but on mingw it doesn't. With int fd_in = open(filename, O_RDONLY | O_BINARY, 0644) it works correctly. – Giorgio Jan 05 '12 at 14:56
8

void read_file(const char filename)

and then later:

int fd_in = open(filename, O_RDONLY, 0x00644);

Don't ignore compiler warnings. I am surprised this didn't just crash.

domen
  • In other words, it should have been `const char*` not `const char`. – Jesper Jan 05 '12 at 13:46
  • Please could you edit your answer to highlight the main point (the type of `filename`), as otherwise it's pretty hard to spot. Thanks! – NPE Jan 05 '12 at 13:46
  • Thanks, it was a typo (in the question, not in the source file). With const char I would have gotten a warning: warning: passing arg 1 of `open' makes pointer from integer without a cast – Giorgio Jan 05 '12 at 13:47
  • BTW, there is a problem with `mode` too, it must be octal -- 00644, w/o 0x, but O_CREAT is not specified, so it may simply be 0. – actual Jan 05 '12 at 13:50
  • Then include a copy paste of the problematic source, so you won't waste everyone's time. – domen Jan 05 '12 at 13:50
  • @domen: The problematic source is the one shown above. I have removed the non-relevant parts and made a mistake while copy and pasting. Sorry. – Giorgio Jan 05 '12 at 13:56
  • @actual: ...or omitted altogether. – DevSolar Jan 05 '12 at 13:58
1

You may want to try using O_RDONLY | O_BINARY or O_RDONLY | O_NOTRANS in the open call. Without O_BINARY or O_NOTRANS, the file may be opened in text mode, in which case read() stops at the first EOF character (0x1A).

Francois
0

I tried your code on my machine:

  • Windows 7,
  • Cygwin latest version as of today,
  • gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)

and it worked fine for a sample file on my machine. The file I read was cmd.exe in C:\Windows\System32 and I compared the total byte count from your read_file function with the actual file size on disk and they matched.

This suggests one of two things:

  • There is something special with the file that you're opening. Maybe it's in a weird locked state and you get some error halfway through (never heard of that though) or maybe the file is corrupt on disk (can other programs access it? Try copying it to another folder)
  • There is something in your code that isn't in the question that is causing the problem
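
The comparison described above (total bytes returned by read() versus the file size on disk) could be automated roughly as follows. This is only a sketch: the function name read_matches_file_size and its extra_flags parameter are hypothetical, and the fallback definition of O_BINARY as 0 is an assumption for systems that do not define the flag.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumption: treat O_BINARY as a no-op where the toolchain lacks it. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: returns 1 if read() delivered exactly as many bytes
   as fstat() reports for the file, 0 if the counts differ, -1 on error.
   `extra_flags` is OR-ed into the open() flags, e.g. 0 or O_BINARY. */
int read_matches_file_size(const char *filename, int extra_flags)
{
    char buffer[1024];
    struct stat st;
    off_t total = 0;
    ssize_t n;

    int fd = open(filename, O_RDONLY | extra_flags);
    if (fd == -1)
    {
        return -1;
    }
    if (fstat(fd, &st) == -1)
    {
        close(fd);
        return -1;
    }

    /* Sum up everything read() hands back until end of file or error. */
    while ((n = read(fd, buffer, sizeof buffer)) > 0)
    {
        total += n;
    }
    close(fd);

    if (n < 0)
    {
        return -1;
    }
    return total == st.st_size;
}
```

Calling it once with extra_flags set to 0 and once with O_BINARY on the same file is one way to see whether the open mode, rather than the file itself, accounts for a short read.
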
Isak Savo