I have a simple C++ application which is supposed to read lines from a POSIX named pipe:

#include<iostream>
#include<string>
#include<fstream>

int main() {
    std::ifstream pipe;
    pipe.open("in");

    std::string line;
    while (true) {
        std::getline(pipe, line);
        if (pipe.eof()) {
            break;
        }
        std::cout << line << std::endl;
    }
}

Steps:

  • I create a named pipe: mkfifo in.

  • I compile & run the C++ code using g++ -std=c++11 test.cpp && ./a.out.

  • I feed data to the in pipe:

sleep infinity > in &  # keep pipe open, avoid EOF
echo hey > in
echo cats > in
echo foo > in
kill %1                # this closes the pipe, C++ app stops on EOF

When doing this under Linux, the application successfully displays output after each echo command as expected (g++ 8.2.1).

When I try the same process on macOS, output is only displayed after closing the pipe (i.e. after kill %1). I suspected some sort of buffering issue, so I tried disabling buffering like so:

std::ifstream pipe;
pipe.rdbuf()->pubsetbuf(0, 0);
pipe.open("in");

With this change, the application outputs nothing after the first echo, then prints the first message ("hey") after the second echo, and keeps going like this, always lagging one message behind: it displays the message of the previous echo instead of the one just executed. The last message is only displayed after closing the pipe.

I found out that on macOS g++ is basically clang++, as g++ --version yields: "Apple LLVM version 10.0.1 (clang-1001.0.46.3)". After installing the real g++ using Homebrew, the example program works, just like it did on Linux.

I am building a simple IPC library built on named pipes for various reasons, so this working correctly is pretty much a requirement for me at this point.

What is causing this weird behaviour when using LLVM? (update: this is caused by libc++)

Is this a bug?

Is the way this works on g++ guaranteed by the C++ standard in some way?

How could I make this code snippet work properly using clang++?

Update:

This seems to be caused by the libc++ implementation of getline(). Related links:

The questions still stand though.

krispet krispet
  • The language standard says nothing about the timing of any output. (It probably *should* say a little bit, but not about this multiprocess case.) POSIX does, though, and this does look wrong—what happens if you use DOS newlines (`echo $'foo\r' >in`)? – Davis Herring Apr 03 '19 at 13:20
  • @DavisHerring Yeah, you are right, I should have worded that more precisely. What I meant was `std::getline()` not blocking if the stream has a newline available. Trying it with DOS newlines yielded the same results. – krispet krispet Apr 03 '19 at 13:42
  • Oddly, I see the desired behavior with an older “Apple LLVM version 7.0.2 (clang-700.1.81)” with a `_LIBCPP_VERSION` of 1101. – Davis Herring Apr 04 '19 at 03:20
  • In the linked bug report I've found this response: "The problem is actually in basic_filebuf<_CharT, _Traits>::underflow(). Around line 595, it calls fread to fill the filebuf's buffer. (by default, 4096 bytes). this read hangs until it can read an entire buffer-full. When it is talking to a file, and the file doesn't have that much data, it gets a short read. When it is talking to a pipe, it just waits until that much data is available." This seems to suggest that this indeed is a bug. When I have time I should probably check the referred code in the two versions that we have tried. – krispet krispet Apr 04 '19 at 06:51
  • Meanwhile I've started using an implementation based on POSIX getline() to get my code working on libc++ and libstdc++ as well, which is suboptimal as I have to mix C and C++ APIs, but it works at least. – krispet krispet Apr 04 '19 at 06:57
  • If you want to make an IPC mechanism, I'd recommend something like UNIX domain sockets on boost ASIO (https://www.boost.org/doc/libs/1_70_0/doc/html/boost_asio/overview/posix/local.html). Reading a "line" at a time using iostreams isn't very flexible or scalable... – cyberbisson Apr 12 '19 at 21:14
  • I appreciate the suggestion, as it is a good one. In fact I have worked with Boost.Asio before, and yes, it would be an elegant solution for general IPC. However, the question is not "how to build a scalable and flexible IPC solution". As with most projects, I have to work with requirements that are out of my control. The question is explicitly about the weird behaviour of `std::getline()` in libc++, not my reasons for reading from named pipes until a delimiter. – krispet krispet Apr 13 '19 at 09:51
  • @krispetkrispet What questions still stand? You seem to have concluded yourself that it's a bug (I think you are correct). I'm not sure it is guaranteed (by whom, anyway? this relies on external behavior as well), but it is certainly implied if you read the introductions, etc. How to make it work? Just work around the bug. I think you can do that yourself; perhaps post the solution here if it's a neat one :) – darune Sep 03 '19 at 08:45
  • @darune It's just weird that the bug report hasn't been touched since 2015, and I was wondering whether the C++ standard says anything regarding the behaviour of `fstream` on pipes. But sure, I've worked around the bug myself a few days after posting this, I'll post my solution so it may help others. – krispet krispet Sep 04 '19 at 11:40

2 Answers

I have worked around this issue by wrapping POSIX getline() in a simple C API and simply calling that from C++. The code is something like this:

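#define _POSIX_C_SOURCE 200809L  /* exposes getline() when compiling as strict C */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

typedef struct pipe_reader {
    FILE* stream;     /* NULL if the pipe could not be opened */
    char* line_buf;   /* buffer owned and grown by getline() */
    size_t buf_size;
} pipe_reader;

/* Opens the pipe for reading; validate the result with check_reader(). */
pipe_reader new_reader(const char* pipe_path) {
    pipe_reader preader;
    preader.stream = fopen(pipe_path, "r");
    preader.line_buf = NULL;
    preader.buf_size = 0;
    return preader;
}

bool check_reader(const pipe_reader* preader) {
    if (!preader || preader->stream == NULL) {
        return false;
    }
    return true;
}

/* Returns the next line without its trailing newline, or NULL on EOF or error.
   The pointer stays valid until the next recv_msg() or close_reader() call. */
const char* recv_msg(pipe_reader* preader) {
    if (!check_reader(preader)) {
        return NULL;
    }
    ssize_t read = getline(&preader->line_buf, &preader->buf_size, preader->stream);
    if (read < 0) {
        return NULL;
    }
    /* Strip the newline only if present: the last line before EOF may lack one. */
    if (read > 0 && preader->line_buf[read - 1] == '\n') {
        preader->line_buf[read - 1] = '\0';
    }
    return preader->line_buf;
}

void close_reader(pipe_reader* preader) {
    if (!check_reader(preader)) {
        return;
    }
    fclose(preader->stream);
    preader->stream = NULL;
    free(preader->line_buf);  /* free(NULL) is a no-op */
    preader->line_buf = NULL;
    preader->buf_size = 0;
}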

This works well against libc++ or libstdc++.
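
For callers that would rather not manage the struct by hand, the same idea can be sketched as a small C++ class. This is only an illustration, not part of the code I actually shipped: the name PipeLineReader and its next() method are hypothetical, and the only library call it relies on is POSIX getline().

```cpp
#include <cassert>
#include <stdio.h>      // FILE, fopen, fclose, POSIX getline
#include <stdlib.h>     // free
#include <string>
#include <sys/types.h>  // ssize_t

// Hypothetical RAII wrapper around POSIX getline(); all names are illustrative.
class PipeLineReader {
public:
    // Opens the FIFO (or any file) at `path` for reading.
    explicit PipeLineReader(const char* path) : stream_(fopen(path, "r")) {}
    // Adopts an already open stream; the reader closes it on destruction.
    explicit PipeLineReader(FILE* stream) : stream_(stream) {}

    PipeLineReader(const PipeLineReader&) = delete;
    PipeLineReader& operator=(const PipeLineReader&) = delete;

    ~PipeLineReader() {
        if (stream_) fclose(stream_);
        free(buf_);  // free(NULL) is a no-op
    }

    bool is_open() const { return stream_ != nullptr; }

    // Stores the next line in `out` without its trailing '\n'.
    // Returns false on EOF or error.
    bool next(std::string& out) {
        if (!stream_) return false;
        ssize_t n = ::getline(&buf_, &cap_, stream_);
        if (n < 0) return false;
        // getline() always leaves room for the terminating NUL.
        assert(static_cast<size_t>(n) < cap_);
        if (n > 0 && buf_[n - 1] == '\n') --n;
        out.assign(buf_, static_cast<size_t>(n));
        return true;
    }

private:
    FILE* stream_;
    char* buf_ = nullptr;
    size_t cap_ = 0;
};
```

With this, the loop from the question becomes PipeLineReader reader("in"); followed by while (reader.next(line)) std::cout << line << std::endl;. Copying is disabled so that the stream and the buffer are released exactly once.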

krispet krispet

As discussed separately, a boost::asio solution would be best, but your question is specifically about why getline blocks, so I will speak to that.

The problem here is that std::ifstream is not really made for a FIFO file type. getline() does a buffered read: in the initial case the buffer does not hold enough data to reach the delimiter ('\n'), so it calls underflow() on the underlying streambuf, which issues a simple read for a buffer-length amount of data. This works well for regular files, because a file's length at any point in time is known: the read comes back short (or reports EOF) if there is not enough data to fill the buffer, and otherwise returns with the buffer filled. With a FIFO, however, running out of data does not necessarily mean EOF, so the read does not return until the buffer is full or every process holding the write end has closed it (here, the sleep infinity command that keeps the pipe open).
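
One way to keep the std::getline() interface while avoiding the blocking buffer fill is to supply your own streambuf whose underflow() calls POSIX read() directly, since read() on a pipe returns as soon as any data is available. The following is a minimal sketch under that assumption, not libc++'s or libstdc++'s implementation; FdStreambuf and echo_lines are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <istream>
#include <ostream>
#include <streambuf>
#include <string>
#include <unistd.h>  // read

// Hypothetical read-only streambuf over a file descriptor.
class FdStreambuf : public std::streambuf {
public:
    explicit FdStreambuf(int fd) : fd_(fd) {
        assert(fd_ >= 0);
        setg(buf_, buf_, buf_);  // empty get area: first read triggers underflow()
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        ssize_t n;
        do {
            // Returns as soon as some bytes are available (a short read),
            // instead of waiting for the whole buffer to fill.
            n = ::read(fd_, buf_, sizeof buf_);
        } while (n < 0 && errno == EINTR);
        if (n <= 0) return traits_type::eof();  // 0: all writers closed the pipe
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    int fd_;
    char buf_[4096];
};

// Hypothetical usage: copy each line from `fd` to `out` as soon as it arrives.
// Returns the number of lines seen before EOF.
int echo_lines(int fd, std::ostream& out) {
    FdStreambuf sb(fd);
    std::istream in(&sb);
    std::string line;
    int count = 0;
    while (std::getline(in, line)) {
        out << line << std::endl;
        ++count;
    }
    return count;
}
```

For the program in the question, fd would come from open("in", O_RDONLY) and out would be std::cout. The streambuf does not own the descriptor, so the caller still has to close() it.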

A more typical way to do this is for the writer to open the file, write, and close it again each time, so that the reader sees EOF after every message. This is obviously a waste of effort when something more functional like poll()/epoll() is available, but I'm answering the question you're asking.
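
For completeness, here is roughly what a poll()-based reader could look like. It is a sketch under my own assumptions rather than a drop-in replacement: poll_lines is a hypothetical helper, and the 4096-byte chunk size is arbitrary.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>
#include <poll.h>
#include <unistd.h>  // read

// Hypothetical helper. Waits up to `timeout_ms` for data on `fd`, then appends
// every complete line to `lines`; an unfinished line is kept in `pending`.
// Returns false once all writers have closed the pipe (EOF) or on a hard error.
bool poll_lines(int fd, int timeout_ms, std::string& pending,
                std::vector<std::string>& lines) {
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int ready = ::poll(&pfd, 1, timeout_ms);
    if (ready < 0) return errno == EINTR;  // interrupted: the caller may retry
    if (ready == 0) return true;           // timeout: nothing to read yet

    char chunk[4096];
    ssize_t n = ::read(fd, chunk, sizeof chunk);
    if (n < 0) return errno == EINTR || errno == EAGAIN;
    if (n == 0) return false;              // EOF: every writer closed its end

    pending.append(chunk, static_cast<std::size_t>(n));
    std::string::size_type pos;
    while ((pos = pending.find('\n')) != std::string::npos) {
        lines.push_back(pending.substr(0, pos));
        pending.erase(0, pos + 1);
    }
    return true;
}
```

A caller would loop on poll_lines() until it returns false, handling whatever has accumulated in lines after each call. Because the timeout bounds every wait, the same loop can also service other descriptors or timers.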

cyberbisson