
I'm working on a benchmark program. Upon making the read() system call, the program appears to hang indefinitely. The target file is 1 GB of binary data and I'm attempting to read directly into buffers that can be 1, 10 or 100 MB in size.

I'm using std::vector<char> to implement dynamically-sized buffers and handing off &vec[0] to read(). I'm also calling open() with the O_DIRECT flag to bypass kernel caching.

The essential coding details are captured below:

std::string fpath{"/path/to/file"};
size_t tries{};
int fd{};
while (errno == EINTR && tries < MAX_ATTEMPTS) {
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}

// Throw exception if error opening file
if (fd == -1) {
    ostringstream ss {};
    switch (errno) {
    case EACCES:
        ss << "Error accessing file " << fpath << ": Permission denied";
        break;
    case EINVAL:
        ss << "Invalid file open flags; system may also not support O_DIRECT flag, required for this benchmark";
        break;
    case ENAMETOOLONG:
        ss << "Invalid path name: Too long";
        break;
    case ENOMEM:
        ss << "Kernel error: Out of memory";
    }
    throw invalid_argument {ss.str()};
}

size_t buf_sz{1024*1024};          // 1 MiB buffer
std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                   // Result is 0-filled buffer of size buf_sz

auto bytes_read = read(fd, &buffer[0], buf_sz);

Poking through the executable with gdb shows that buffers are allocated correctly, and the file I've tested with checks out in xxd. I'm using g++ 7.3.1 (with C++11 support) to compile my code on a Fedora Server 27 VM.

Why is read() hanging on large binary files?

Edit: Code example updated to more accurately reflect error checking.

J. Boley
  • You don't check whether `open` succeeds. – n. m. could be an AI Mar 22 '18 at 17:15
  • Not in the example, no. My bad. But the checks are being made in the actual code. I'll add that in. – J. Boley Mar 22 '18 at 17:25
  • open(2) man page: _The O_DIRECT flag may impose alignment restrictions on the length and address of user-space buffers and the file offset of I/Os_ - while I don't think that failure to comply with this would cause `read` to hang, I don't see any attempt to align your buffer and it might cause problems. Are you checking return from open/read? Perhaps read is just failing in a loop. – davmac Mar 22 '18 at 17:29
  • you cannot read into a vector struct. `read` is a standard `c` concept. Please read into the **array**. You're just corrupting memory. – Serge Mar 22 '18 at 17:31
  • _Disk I/O, not so much._ - so, does it hang on files ever? – Maxim Egorushkin Mar 22 '18 at 17:36
  • Is it safe to only test `errno` after `open()`? Isn't `errno` only guaranteed to be set on failure? Wouldn't it be safer to check the return value for `-1`? What if `errno` already has a value before the loop enters? – Galik Mar 22 '18 at 17:36
  • @davmac That could be a problem. I was under the impression that the default allocator performed byte alignment when vectors are created, but come to think of it I'm not sure what that alignment would actually be. – J. Boley Mar 22 '18 at 17:39
  • @Serge You are mistaken. – Maxim Egorushkin Mar 22 '18 at 17:42
  • @Galik Hmm...I'd assumed that whatever value it would have, would not be EINTR. Seems to work... – J. Boley Mar 22 '18 at 17:43
  • You need to identify on which file exactly it hangs and `strace` it. – Maxim Egorushkin Mar 22 '18 at 17:45
  • I have not used it myself but the way you do that makes me feel like it could infinitely loop. If `errno` already has the value `EINTR` the call to `open()` may not change it even if `open()` succeeds. – Galik Mar 22 '18 at 17:45
  • @Galik Stepping through in gdb, it's not hanging on the open(), but yeah, you're probably right. Bad idea. – J. Boley Mar 22 '18 at 17:58
  • @J.Boley You should check `/proc//task//stack` or just `/proc//stack` to see kernel-level stack trace of your thread/process. If you are indeed reading from `stdin`, you should see something like `tty_read+0x7d/0xe0`. –  Mar 22 '18 at 17:59

3 Answers


There are multiple problems with your code.

This code misbehaves whenever errno happens to equal EINTR, and does nothing at all when it doesn't:

while (errno == EINTR && tries < MAX_ATTEMPTS) {
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}

That loop won't stop once the file has been successfully opened: as long as errno stays EINTR it keeps reopening the file, leaking a file descriptor on every iteration. And if errno holds any other value when the loop is reached, open() is never called at all.

This would be better:

do
{
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}
while ( ( -1 == fd ) && ( EINTR == errno ) && ( tries < MAX_ATTEMPTS ) );

Second, as noted in the comments, O_DIRECT can impose alignment restrictions on memory. You might need page-aligned memory:

So

size_t buf_sz{1024*1024};          // 1 MiB buffer
std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                   // Result is 0-filled buffer of size buf_sz

auto bytes_read = read(fd, &buffer[0], buf_sz);

becomes

size_t buf_sz{1024*1024};          // 1 MiB buffer

// page-aligned buffer (check for MAP_FAILED in real code)
void *buffer = mmap(nullptr, buf_sz, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

auto bytes_read = read(fd, buffer, buf_sz);

Note also that the Linux implementation of O_DIRECT can be very dodgy. It's been getting better, but there are still potential pitfalls that aren't well documented. Along with the alignment restrictions on the buffer, if the last chunk of data in the file isn't a full page you may not be able to read it, because the filesystem's direct-IO implementation may only allow reads of full pages (or some other block size). Likewise for write() calls: you may not be able to write an arbitrary number of bytes; you might be constrained to multiples of something like a 4 KiB page.

This is also critical:

Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.

Some devices simply do not support direct IO. They should return an error, but again, the O_DIRECT implementation on Linux can be very hit-or-miss.

Andrew Henle
  • Hmm, would using something like fdatasync() to force the kernel to flush the cache be as performant? If it's as dodgy as you say, I really have to wonder if there side-effects of running this code in a VM. – J. Boley Mar 22 '18 at 18:04
  • @J.Boley No, going through the page cache wouldn't be as performant. You'd probably be better served doing a `stat()` call on the filename beforehand and using the `O_DIRECT` flag only if it's a normal file. – Andrew Henle Mar 22 '18 at 18:10
  • Memory alignment issues WERE the problem. You and davmac were right on the money. Using mmap to create a byte-aligned buffer really makes the difference. Thanks to everyone who chipped in! – J. Boley Mar 22 '18 at 19:46
  • Now that I've discovered (with guidance :) the impressive mmap() call, I have to wonder...It looks like I could use mmap() twice, once to create a buffer and again to "map" the file I want to read by passing the fd. Is there a performance advantage there over using read()? I suppose I should post this as a separate question. – J. Boley Mar 22 '18 at 19:49
  • @J.Boley Such a `mmap()` question has already been asked: https://stackoverflow.com/questions/258091/when-should-i-use-mmap-for-file-access Read the links in the first comment. In short: `mmap()` for access makes sense in some cases, especially if data is going to be accessed many times. A sequential read of an entire large file is not one of those cases. For such a sequential read operation, `mmap()` is likely slower. `mmap()` is also great when it's fast enough and you want really, really simple code, such as `mmap()` a text file ( + 1 extra byte...) and treating all of it as a C string. – Andrew Henle Mar 22 '18 at 19:59

Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.

The O_DIRECT flag is useful for filesystems and block devices. With this flag, people normally map pages into user space.

For sockets, pipes and serial devices it is plain useless because the kernel does not cache that data.


Your updated code hangs because fd is initialized with 0, which is STDIN_FILENO; the file is never actually opened, so read() blocks waiting on stdin.

Maxim Egorushkin
  • But OP _is_ reading from a file on a filesystem. Did you read the question? – davmac Mar 22 '18 at 17:34
  • @davmac More than that, I even quote the question. – Maxim Egorushkin Mar 22 '18 at 17:35
  • @MaximEgorushkin ok, but you've written an answer that doesn't address the question that was asked. – davmac Mar 22 '18 at 17:37
  • That's interesting, I'd always assumed that the network stack would cache data received on a socket. Unfortunately, I'm dealing with both an fs and a block device, so O_DIRECT should be fine. – J. Boley Mar 22 '18 at 17:37
  • @J.Boley O_DIRECT doesn't do anything for sockets, it is about bypassing the kernel's cache. –  Mar 22 '18 at 17:56

Pasting your program and running it on my Linux system gave a working, non-hanging program.

The most likely cause of the failure is that the file is not a regular filesystem object, or that the underlying hardware is not working.

Try with a smaller size to confirm, and try on a different machine to help diagnose.

My complete code (with no error checking)

#include <vector>
#include <string>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main( int argc, char ** argv )
{
    std::string fpath{"myfile.txt" };
    auto fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);

    size_t buf_sz{1024*1024};          // 1 MiB buffer
    std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                       // Result is 0-filled buffer of size buf_sz

    auto bytes_read = read(fd, &buffer[0], buf_sz);
}

myfile.txt was created with

dd if=/dev/zero of=myfile.txt bs=1024 count=1024
  • If the file is not 1Mb in size, it may fail.
  • If the file is a pipe, it can block until the data is available.
mksteve
  • So it can fail if the file isn't exactly 1 MB? That's a problem...the file is 1 GB, and part of the exercise is to measure performance as block size (bytes read at a time) scales up. – J. Boley Mar 22 '18 at 17:49
  • I wonder if running on a VM is factor. I use VMWare, so I wonder if it's not emulating the physical hardware quite right...Good motivation to test on another host. – J. Boley Mar 22 '18 at 17:52
  • *Pasting your program and running on my linux system, was a working and non-hanging program.* Distribution? Version? Filesystem? Kernel version? Without that information, saying "It worked for me." doesn't help much given the vagaries of Linux `O_DIRECT`. – Andrew Henle Mar 22 '18 at 17:59
  • Your code _should_ include error checking, since that's the critical thing here. – davmac Mar 23 '18 at 12:37