21

Here's a scenario. You have a large amount of legacy scripts, all using a common library. Said scripts use the 'print' statement for diagnostic output. No changes are allowed to the scripts - they range far and wide, have their approvals, and have long since left the fruitful valleys of oversight and control.

Now a new need has arrived: logging must now be added to the library. This must be done automatically and transparently, without users of the standard library needing to change their scripts. Common library methods can simply have logging calls added to them; that's the easy part. The hard part lies in the fact that diagnostic output from these scripts were always displayed using the 'print' statement. This diagnostic output must be stored, but just as importantly, processed.

As an example of this processing, the library should only record the printed lines that contain the words 'warning', 'error', 'notice', or 'attention'. The below Extremely Trivial and Contrived Example Code (tm) would record some of said output:

sub CheckPrintOutput
{
    my @output = @_; # args passed to print eventually find their way here.
    foreach my $value (@output) {
         Log->log($value) if $value =~ /warning|error|notice|attention/i;
    }
}

(I'd like to avoid such issues as 'what should actually be logged', 'print shouldn't be used for diagnostics', 'perl sucks', or 'this example has the flaws x y and z'...this is greatly simplified for brevity and clarity. )

The basic problem comes down to capturing and processing data passed to print (or any perl builtin, along those lines of reasoning). Is it possible? Is there any way to do it cleanly? Are there any logging modules that have hooks to let you do it? Or is it something that should be avoided like the plague, and I should give up on ever capturing and processing the printed output?

Additional: This must run cross-platform - windows and *nix alike. The process of running the scripts must remain the same, as must the output from the script.

Additional additional: An interesting suggestion made in the comments of codelogic's answer:

You can subclass http://perldoc.perl.org/IO/Handle.html and create your own file handle which will do the logging work. – Kamil Kisiel

This might do it, with two caveats:

1) I'd need a way to export this functionality to anyone who uses the common library. It would have to apply automatically to STDOUT and probably STDERR too.

2) the IO::Handle documentation says that you can't subclass it, and my attempts so far have been fruitless. Is there anything special needed to make sublclassing IO::Handle work? The standard 'use base 'IO::Handle' and then overriding the new/print methods seem to do nothing.

Final edit: Looks like IO::Handle is a dead end, but Tie::Handle may do it. Thanks for all the suggestions; they're all really good. I'm going to give the Tie::Handle route a try. If it causes problems I'll be back!

Addendum: Note that after working with this a bit, I found that Tie::Handle will work, if you don't do anything tricky. If you use any of the features of IO::Handle with your tied STDOUT or STDERR, it's basically a crapshoot to get them working reliably - I could not find a way to get the autoflush method of IO::Handle to work on my tied handle. If I enabled autoflush before I tied the handle it would work. If that works for you, the Tie::Handle route may be acceptable.

Robert P
  • 15,707
  • 10
  • 68
  • 112
  • So what *are* you allowed to change? Command lines? Parameter files? For example, let's say I said "Yeah you can hook print", what's the scope of what you'd be able to do? – Axeman Dec 22 '08 at 23:51
  • I can change anything in the common library. The user must not need to run his scripts any differently, or have to pass anything new in on the command line. The final stream of data on STDOUT and STDERR must be the same as before. – Robert P Dec 23 '08 at 00:01
  • what happens to the original output? can you tail -f it and process from there? – user44511 Dec 23 '08 at 01:36
  • The original output is processed by other programs. They expect it to remain the same as before. Also, as mentioned below in another comment, we don't want to change the environment, either - so masking 'perl' with another program that does the log processing would be problematic. – Robert P Dec 23 '08 at 16:17

5 Answers5

25

There are a number of built-ins that you can override (see perlsub). However, print is one of the built-ins that doesn't work this way. The difficulties of overriding print are detailed at this perlmonk's thread.

However, you can

  1. Create a package
  2. Tie a handle
  3. Select this handle.

Now, a couple of people have given the basic framework, but it works out kind of like this:

package IO::Override;
use base qw<Tie::Handle>;
use Symbol qw<geniosym>;

sub TIEHANDLE { return bless geniosym, __PACKAGE__ }

sub PRINT { 
    shift;
    # You can do pretty much anything you want here. 
    # And it's printing to what was STDOUT at the start.
    # 
    print $OLD_STDOUT join( '', 'NOTICE: ', @_ );
}

tie *PRINTOUT, 'IO::Override';
our $OLD_STDOUT = select( *PRINTOUT );

You can override printf in the same manner:

sub PRINTF { 
    shift;
    # You can do pretty much anything you want here. 
    # And it's printing to what was STDOUT at the start.
    # 
    my $format = shift;
    print $OLD_STDOUT join( '', 'NOTICE: ', sprintf( $format, @_ ));
}

See Tie::Handle for what all you can override of STDOUT's behavior.

Axeman
  • 29,660
  • 2
  • 47
  • 102
  • Oh, this is looking good. Does this change print's behavior if they use print to write to file handles, or just to STDOUT? – Robert P Dec 23 '08 at 16:08
  • Looking at it more. I see - it's tied to a single file handle at a time, right? I will be giving this a whirl shortly and report back. – Robert P Dec 23 '08 at 16:24
  • The "file handle" is just a means. Once we've got the flow directed to where we can specify behavior, we can write out to as many real handles as we please, filter it as we want--even make database calls, if that's what we wanted to do. – Axeman Dec 23 '08 at 17:47
  • Fantastic, this is exactly what I was hoping for : only affecting STDOUT (and probably STDERR too), transparent to the outside. Thanks! – Robert P Dec 23 '08 at 18:41
  • What about re-assigning the STDOUT glob instead of select()'ing it? We've stored it in the module...for all intents and purposes our new tied handle should be STDOUT. It seems kind-of out there, but not beyond workability. Do you know of any major problems with doing something like that? – Robert P Dec 23 '08 at 18:55
  • "We've stored it in the module" it == STDOUT. – Robert P Dec 23 '08 at 18:57
9

You can use Perl's select to redirect STDOUT.

open my $fh, ">log.txt";
print "test1\n";
my $current_fh = select $fh;
print "test2\n";
select $current_fh;
print "test3\n";

The file handle could be anything, even a pipe to another process that post processes your log messages.

PerlIO::tee in the PerlIO::Util module seems to allows you to 'tee' the output of a file handle to multiple destinations (e.g. log processor and STDOUT).

codelogic
  • 71,764
  • 9
  • 59
  • 54
  • select is all well and good, but we need to process that data, too. Is there a file handle type that gives hooks to evaluate and do something with the data passed to it? – Robert P Dec 22 '08 at 23:32
  • Also, we cannot modify the current behavior of these scripts, so the output must remain on STDOUT as well. – Robert P Dec 22 '08 at 23:33
  • You can open a filehandle as a pipe to another process. Do the logging stuff in there, then you can print to stdout from inside it. –  Dec 22 '08 at 23:38
  • You can have the piped process redirect its STDIN to STDOUT in addition to processing it, that way current behavior will not be affected. I'm not aware of any modules that abstract this functionality. – codelogic Dec 22 '08 at 23:39
  • Interesting idea... would that be with something like fork()? There's another requirement too...this must run across platforms: windows and *nix alike. As I understand it, fork doesn't really work in windows. Or are you talking about a different piping mechanism? – Robert P Dec 22 '08 at 23:44
  • A pipe is not a fork. It creates a new process and provides you with a handle to its STDIN. Read more about it @ http://www.troubleshooters.com/codecorn/littperl/perlfile.htm#Piping – codelogic Dec 22 '08 at 23:56
  • You can subclass http://perldoc.perl.org/IO/Handle.html and create your own file handle which will do the logging work. – Kamil Kisiel Dec 22 '08 at 23:58
  • re: codelogic-I meant would you create the pipe and then start the other process with fork. If not fork, what would you do to start the process? qx? re: Kamil: Interesting. This would then require that they include that subclass of IO::Handle, right? Could it be re-exported automatically? – Robert P Dec 23 '08 at 00:09
  • @Robert: No, you don't have to manually fork. Perl takes care of creating the other process and establishing the connection between the file handle and the other process' STDIN. It's non-blocking. – codelogic Dec 23 '08 at 00:30
7

Lots of choices. Use select() to change the filehandle that print defaults to. Or tie STDOUT. Or reopen it. Or apply an IO layer to it.

ysth
  • 96,171
  • 6
  • 121
  • 214
3

This isn't the answer to your issue but you should be able to adopt the logic for your own use. If not, maybe someone else will find it useful.

Catching malformed headers before they happen...

package PsychicSTDOUT;
use strict;

my $c = 0;
my $malformed_header = 0;
open(TRUE_STDOUT, '>', '/dev/stdout');
tie *STDOUT, __PACKAGE__, (*STDOUT);

sub TIEHANDLE {
    my $class = shift;
    my $handles = [@_];
    bless $handles, $class;
    return $handles;
}

sub PRINT {
    my $class = shift;
    if (!$c++ && @_[0] !~ /^content-type/i) {
        my (undef, $file, $line) = caller;
        print STDERR "Missing content-type in $file at line $line!!\n";
        $malformed_header = 1;
    }
    return 0 if ($malformed_header);
    return print TRUE_STDOUT @_;
}
1;

usage:

use PsychicSTDOUT;
print "content-type: text/html\n\n"; #try commenting out this line
print "<html>\n";
print "</html>\n";
Andrew Sohn
  • 675
  • 1
  • 6
  • 15
-1

You could run the script from a wrapper script that captures the original script's stdout and writes the output somewhere sensible.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • Unfortunately, this violates the 'script must run the same as before' requirement. The user must not have to do anything different to get this logging information. – Robert P Dec 22 '08 at 23:41
  • True enough from a general implementation standpoint; a wrapper script is one possible solution. However, asking the users to run another script just to run their script not a solution that will work in this scenario - the command line, perl distro, and console output must remain the same. – Robert P Dec 23 '08 at 16:14
  • And it must ONLY affect scripts that use this particular library. If I changed what the shell thought 'perl' was, it would affect other scripts that have no need for the common library. That too, is also not an acceptable solution for this application. – Robert P Dec 23 '08 at 16:15
  • But depending on yours, it might be. :) – Robert P Dec 23 '08 at 17:31