
I'm reading streams in PHP, using proc_open and fgets($stdout), trying to get every line as it comes in.

Many Linux programs (package managers, wget, rsync) emit just a CR (carriage return) character for lines that periodically update "in place", like download progress. I'd like to catch these updates (as separate lines) as soon as they happen.

At the moment, fgets($stdout) just keeps reading until an LF, so when progress moves very slowly (a big file, for example) it keeps reading until the command is completely done, and only then returns all the updated lines as one long string, CRs included.

I've tried setting the "mac" option to detect CRs as line endings:

ini_set('auto_detect_line_endings', true);

But that doesn't seem to work.

Now, stream_get_line would allow me to set CR as the line break, but that's not a "catch-all" solution which treats CRLF, CR and LF alike as delimiters.
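
For illustration, this is roughly what stream_get_line offers; the single fixed delimiter is the problem (a sketch, with 4096 as an arbitrary read length):

// stream_get_line() accepts exactly one delimiter string, so "\r"
// here catches the in-place updates but misses plain LF-terminated lines:
while (($line = stream_get_line($stdout, 4096, "\r")) !== false) {
    echo $line, "\n";
}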

I could of course read the whole output, split it using various PHP methods and replace all types of line break with LF, but it's a stream, and I want PHP to get an indication of progress while the command is still running (the after-the-fact approach is sketched below).
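
A minimal sketch of that after-the-fact normalization — it blocks until the command exits, so no live progress:

$blob  = stream_get_contents($stdout);        // blocks until the command is done
$lines = preg_split('/\r\n|\r|\n/', $blob);   // normalize CRLF, CR and LF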

So my question:

How can I read from the STDOUT pipe (from proc_open) until an LF or CR occurs, without having to wait until the whole line is in?

Thanks in advance!

Solution:

I used Fleshgrinder's filter class to replace \r with \n in the stream (see the accepted answer), and replaced fgets() with fgetc() to get more "realtime" access to the contents of STDOUT:

$stdout = $proc->pipe(1);                  // child's STDOUT pipe
stream_filter_register("EOL", "EOLStreamFilter");
stream_filter_append($stdout, "EOL");

$out = '';                                 // initialize the line buffer
while (($o = fgetc($stdout)) !== false) {
    $out .= $o;                            // buffer the characters into a line, until \n
    if ($o == "\n") { echo $out; $out = ''; }  // can now easily wrap the $out lines in JSON
}
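
($proc above is presumably a small wrapper object around proc_open(). With the plain API, the equivalent setup would look roughly like this untested sketch; the rsync command line is just a placeholder:)

$descriptors = [
    0 => ['pipe', 'r'],   // child's STDIN
    1 => ['pipe', 'w'],   // child's STDOUT
    2 => ['pipe', 'w'],   // child's STDERR
];
$process = proc_open('rsync -av --progress src/ dst/', $descriptors, $pipes);
$stdout  = $pipes[1];     // the equivalent of $proc->pipe(1)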
okdewit
  • Are you sure that this isn't simply I/O caching? Many filesystems don't write to disk on a byte-by-byte basis, but cache the writes until a suitable number of blocks is available to be physically written in one go; likewise in many cases with buffered output – Mark Baker Jan 12 '15 at 18:11
  • But if I run something like `wget server.com/file` on the CLI, it does output the updates directly, like how it goes from the screen-wide `1%[==> ]` line to `2%[==> ]` as soon as it happens. Or is reading from the STDOUT pipe in PHP fundamentally different from what the terminal shows? – okdewit Jan 12 '15 at 18:20

1 Answer


Use a stream filter to normalize your newline characters before consuming the stream. The following code should do the trick; it's based on the example from PHP's manual page on stream_filter_register.

Code is untested!

<?php

// https://php.net/php-user-filter
final class EOLStreamFilter extends php_user_filter {

    public function filter($in, $out, &$consumed, $closing)
    {
        // Each bucket is a chunk of raw stream data; rewrite its
        // line endings in place and pass it through.
        while ($bucket = stream_bucket_make_writeable($in)) {
            // Replace CRLF before the lone CR, otherwise "\r\n"
            // would be turned into "\n\n".
            $bucket->data = str_replace([ "\r\n", "\r" ], "\n", $bucket->data);
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        // Caveat: a CRLF pair split across two buckets would come out
        // as "\n\n" — unlikely with short progress lines, but possible.
        return PSFS_PASS_ON;
    }

}

stream_filter_register("EOL", "EOLStreamFilter");

// Open stream …

stream_filter_append($yourStreamHandle, "EOL");

// Perform your work with normalized EOLs …

EDIT: The comment Mark Baker posted on your question is on point. Most Linux distributions use line buffering for STDOUT, and it's possible that Apple does the same. On the other hand, most STDERR streams are unbuffered. You could try to redirect the output of the program to another pipe (e.g. STDERR or any other) and see if you have more luck with that.
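
For example, wget writes its progress output to STDERR, so reading pipe 2 instead may already give you the unbuffered updates. A sketch building on the filter class above (the URL is a placeholder):

stream_filter_register("EOL", "EOLStreamFilter");

$descriptors = [
    1 => ['pipe', 'w'],   // child's STDOUT
    2 => ['pipe', 'w'],   // child's STDERR — where wget's progress goes
];
$process = proc_open('wget http://example.com/big.iso', $descriptors, $pipes);

$stderr = $pipes[2];
stream_filter_append($stderr, "EOL");   // normalize \r and \r\n to \n

while (($c = fgetc($stderr)) !== false) {
    echo $c;   // or buffer characters until "\n", as in the question's solution
}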

Fleshgrinder
  • Just tested your class extension, and it *works* for replacing CRs with LFs! Great trick to know :) Line buffering of STDOUT still seems to be a problem, so I'm gonna try piping it elsewhere to see what happens. But at least the output shows up with line breaks in HTML tags now (I'm using ajax progress events with PHP buffering turned off), so the filtering works. N.B. Also, weirdly enough, the function "stream_bucket_make_writable" should be "stream_bucket_make_writ**e**able", in contrast to language style guides and the other PHP functions with the word writable in them. – okdewit Jan 12 '15 at 20:03
  • Good to know, let me know if redirecting helps with buffering. XD Typical for PHP! – Fleshgrinder Jan 12 '15 at 20:17
  • Found the solution! After some testing, it seems that STDOUT buffering is not an issue, it spits out directly into PHP. The filter on its own is just a partial solution: it replaces \r, but still only after the whole line has been read. So I encountered the fgetc() function, which reads all **bytes** as a stream instead of lines. Your filter class still wonderfully sanitizes the stream, converting \r into \n. This way I can implement the buffering myself -- just append to a string until it sees a newline, wrap it as JSON (for metadata) and echo! I'll add the solution to my question after some sleep. – okdewit Jan 12 '15 at 21:20
  • And thanks a lot, I learned so much from your comment, really pushed me in the right direction! – okdewit Jan 12 '15 at 21:22