
I'm trying to find a way to measure bytes transferred in or out of a web application built on php+apache. One problem is that all I/O is done by a native PHP extension, which is passed a handle to one of the built-in streams: php://input or php://output.

I have examined the following alternatives:

1.) ftell on stream wrapper

After encountering this question, my first intuition was to try using ftell on the stream wrapper handle after the I/O operation; roughly:

$hOutput = fopen('php://output', 'wb');
extensionDoOutput($hOutput);
$iBytesTransferred = ftell($hOutput);

This seems to work for the input wrapper but not for the output wrapper, where ftell always returns zero.

2.) Attach stream filter

A non-modifying stream filter would seem like a reasonable way to count bytes passing through. However, the documentation seems a bit lacking, and I haven't found a way to get at lengths without doing the iterate+copy pattern as in the example:

class test_filter extends php_user_filter {
  public static $iTotalBytes = 0;

  function filter(&$in, &$out, &$consumed, $closing) {
    while ($bucket = stream_bucket_make_writeable($in)) {
      $consumed += $bucket->datalen;
      stream_bucket_append($out, $bucket);
    }
    test_filter::$iTotalBytes += $consumed;
    return PSFS_PASS_ON;
  }
}

stream_filter_register("test", "test_filter")
    or die("Failed to register filter");

$f = fopen("php://output", "wb");

stream_filter_append($f, "test");

// do i/o

Unfortunately this seems to impose a significant reduction in throughput (>50%) as the data is copied in and out of the extension.

3.) Implement stream wrapper

A custom stream wrapper could be used to wrap the other stream and accumulate bytes read/written:

class wrapper {
    var $position;

    var $handle;

    function stream_open($path, $mode, $options, &$opened_path)
    {
        $this->position = 0;
...
        $this->handle = fopen($opened_path, $mode);
        return $this->handle != false;
    }

    function stream_read($count)
    {
        $ret = fread($this->handle, $count);
        $this->position += strlen($ret);
        return $ret;
    }

    function stream_write($data)
    {
        $written = fwrite($this->handle, $data);
        $this->position += $written;
        return $written;
    }

    function stream_tell()
    {
        return $this->position;
    }

    function stream_eof()
    {
        return feof($this->handle);
    }
...
}

stream_wrapper_register("test", "wrapper")
    or die("Failed to register protocol");

$hOutput = fopen('test://output', 'wb');
extensionDoOutput($hOutput);
$iBytesTransferred = ftell($hOutput);

Again, this imposes a reduction in throughput (~20% on output, greater on input).

4.) Output buffering with callback

A callback can be passed to ob_start; it is called as chunks of output are flushed.

$totalBytes = 0;

function cb($strBuffer) {
   global $totalBytes;
   $totalBytes += strlen($strBuffer);
   return $strBuffer;
}

$f = fopen("php://output", "wb");

ob_start('cb', 16384);
// do output...
fclose($f);
ob_end_flush();

Again, this works but imposes a certain throughput performance penalty (~25%) due to buffering.


Option #1 was forgone because it does not appear to work for output. Of the remaining three, all work functionally but affect throughput negatively due to buffer/copy mechanisms.

Is there something intrinsic to PHP (or the Apache server extensions) that I can use to do this gracefully, or will I need to bite the bullet on performance? I welcome any ideas on how this might be accomplished. (Note: if possible, I am interested in a PHP application-level solution, not an Apache module.)

Adam Holmberg
  • What is the name of the *"native PHP extension"* you write about in your question? – hakre Dec 22 '11 at 17:18
  • I suggest you stick with `ftell` for the input if it works - you're not likely to find anything better, certainly anything that works for both input and output. I would be interested to see a) whether output buffering works when writing to the `php://output` stream (probably should), and b) how much of a performance decrease there is from buffering the whole thing, then doing `$bytesOut = ob_get_length(); ob_end_flush();` - I sort of suspect this will damage performance heavily, but it might be worth trying. – DaveRandom Dec 22 '11 at 17:24
  • @hakre: It's a custom extension. Not publicly available, but also not a candidate for modification (which is why I'm asking at this level). I have examined the interface and there is not a way to get this information from the extension. – Adam Holmberg Dec 22 '11 at 17:26
  • @DaveRandom: thanks for the thoughts. I agree on the input. a.) it does b.) it is not always feasible for this application to buffer the entire output – Adam Holmberg Dec 22 '11 at 17:28
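
The whole-response buffering approach suggested in the comments above could look roughly like the following. This is only a sketch: the `fwrite` call stands in for the extension's output, and `$bytesOut` is an illustrative name. As noted, it holds the entire output in memory, so it only suits responses small enough to buffer.

```php
<?php
// Sketch only: buffers the ENTIRE response in memory before sending it.
ob_start();

$out = fopen('php://output', 'wb');
fwrite($out, "This is some data\n"); // stands in for the extension's output
fclose($out);

$bytesOut = ob_get_length(); // number of bytes currently held in the buffer
ob_end_flush();              // send the buffered output on and stop buffering
```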

2 Answers


As bizarre as this is, using the STDOUT constant instead of the result of fopen('php://output') makes ftell() work correctly.

$stream = fopen('php://output','w');
fwrite($stream, "This is some data\n");
fwrite($stream, ftell($stream));
// Output:
// This is some data
// 0

However:

fwrite(STDOUT, "This is some data\n");
fwrite(STDOUT, ftell(STDOUT));
// Output:
// This is some data
// 17

Tested PHP/5.2.17 (win32)

EDIT: Actually, is that working correctly, or should it be 18? I never use ftell(), so I'm not 100% sure either way...
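
As a cross-check on the value in question, `strlen` reports the byte length of the string that was written. A minimal sketch (the variable names are illustrative):

```php
<?php
// "\n" inside a double-quoted PHP string is a single byte, so it is counted once.
$data = "This is some data\n";
$length = strlen($data);

echo $length, PHP_EOL; // prints 18
```

The string itself is therefore 18 bytes long, newline included.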

ANOTHER EDIT

See whether this suits you:

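$bytesOutput = 0;

// Output callback: add the length of each flushed chunk to the running total
function output_counter ($str) {
  $GLOBALS['bytesOutput'] += strlen($str);
  return $str;
}
ob_start('output_counter');

$stream = fopen('php://output','w');
fwrite($stream, "This is some data\n");

// The callback only runs when the buffer is flushed, so flush before reading the total
ob_end_flush();

var_dump($bytesOutput);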
DaveRandom

I would stick with the output buffer callback; you can just return FALSE to pass the data through:

class OutputMetricBuffer
{
    private $length = 0;
    public function __construct()
    {
        ob_start(array($this, 'callback'));
    }
    public function callback($str)
    {
        $this->length += strlen($str);
        return FALSE;
    }
    public function getLength()
    {
        ob_flush();
        return $this->length;
    }
}

Usage:

$metric = new OutputMetricBuffer;

# ... output ...

$length = $metric->getLength();

The reason to use the output buffer callback is that it is more lightweight than a filter, which needs to consume all buckets and copy them over, so the filter does more work.

I implemented the callback inside a class so it has its own private length variable to count up with.

You could also create a global function and use a global variable. Another possible tweak is to access that variable via $GLOBALS instead of the global keyword, so PHP does not need to import the global variable into the local symbol table and back. I'm not really sure whether it makes a difference; it is just another factor that could play a role.

I also don't know whether returning FALSE instead of $str will make it faster; just give it a try.
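
The global-function variant described above might be sketched as follows. The names `$bytesOut` and `count_output_bytes` are hypothetical, the `fwrite` call stands in for the extension's output, and the chunk size of 16384 is simply the value used in the question. Whether it is any faster than the class version is untested here.

```php
<?php
// Hypothetical names: $bytesOut and count_output_bytes() are illustrative only.
$bytesOut = 0;

function count_output_bytes($chunk)
{
    // Update the counter through $GLOBALS rather than the `global` keyword.
    $GLOBALS['bytesOut'] += strlen($chunk);
    // Returning false makes PHP send the original chunk on unchanged.
    return false;
}

// With a chunk size set, the callback runs once the buffer reaches that length,
// so the whole response never has to sit in memory at once.
ob_start('count_output_bytes', 16384);

$out = fopen('php://output', 'wb');
fwrite($out, str_repeat('x', 40000)); // stands in for the extension's output
fclose($out);

// Flush whatever is left in the buffer; this triggers the final callback call.
ob_end_flush();
```

After `ob_end_flush()` returns, `$bytesOut` holds the total number of bytes that passed through the buffer.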

hakre