
In tchrist's boilerplate I found this explicit closing of STDOUT in the END block:

END { close STDOUT }

I know END and close, but I'm missing why it is needed.

When I started searching about it, I found the following in perlfaq8:

For example, you can use this to make sure your filter program managed to finish its output without filling up the disk:

END {
    close(STDOUT) || die "stdout close failed: $!";
}

and I still don't understand it. :(

Can someone explain (maybe with some code examples):

  • why and when it is needed
  • how, and in what cases, my Perl filter can fill up the disk, and so on
  • what goes wrong without it
  • etc.?
clt60

2 Answers


A lot of systems implement "optimistic" file operations. By this I mean that a call to, for instance, print, which should add some data to a file, can return successfully before the data is actually written to the file, or even before enough space is reserved on disk for the write to succeed.

In these cases, if your disk is nearly full, all your prints can appear successful, but when it is time to close the file and flush it out to disk, the system realizes that there is no room left. You then get an error when closing the file.

This error means that all the output you thought you saved might actually not have been saved at all (or only partially saved). If that output was important, your program needs to report an error (or try to correct the situation, or ...).

All this can happen on the STDOUT filehandle if it is connected to a file, e.g. if your script is run as:

perl script.pl > output.txt

If the data you're outputting is important, and you need to know if all of it was indeed written correctly, then you can use the statement you quoted to detect a problem. For example, in your second snippet, the script explicitly calls die if close reports an error; tchrist's boilerplate runs under use autodie, which automatically invokes die if close fails.

(This will not guarantee that the data is stored persistently on disk though, other factors come into play there as well, but it's a good error indication. i.e. if that close fails, you know you have a problem.)
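For illustration, here is a minimal sketch of such a filter (the name filter.pl and the file names below are invented for the example). It copies STDIN to STDOUT and exits non-zero if the final flush fails:

#!/usr/bin/perl
use strict;
use warnings;

# At exit, flush and close STDOUT. If the system could not take the
# buffered data (e.g. because the disk filled up), close returns false
# and die gives the script a non-zero exit status instead of a silent
# "success".
END {
    close(STDOUT) or die "can't close stdout: $!";
}

print while <STDIN>;    # trivial filter: copy input to output

Run it as perl filter.pl < big.txt > out.txt against a nearly full filesystem: without the END block the script exits 0 even though out.txt is incomplete; with it, the shell sees a failing exit status.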

Mat
  • Frankly, this makes little sense to me. When I run `script.pl > out.txt`, the END block will be called when the script ends anyway. So the forking shell gets a "child ended" signal and will close all STD* filehandles associated with the running perl. – clt60 Jun 12 '11 at 10:44
  • the point is that _your_ process (your script) gets the error _before_ it finishes so it can (if it needs to) do something about it. – Mat Jun 12 '11 at 10:46
  • The above has something to do with pipes - but I'm not sure what exactly. – clt60 Jun 12 '11 at 10:50
  • @Mat: in the examples there is only `END { close STDOUT }` - and nothing more. So no special handling nor anything else. – clt60 Jun 12 '11 at 10:52
  • @jm666: part of tchrist's boilerplate (where this came from) is `use autodie;` which handles the error – ysth Jun 12 '11 at 10:55
  • I know autodie. The perlfaq shows the example without autodie, and that's OK. The "optimistic" file-operations answer simply doesn't make sense to me. But nvm, I will continue researching it. – clt60 Jun 12 '11 at 10:58
  • @jm666, I've just tried to write with perl to a full disk. Bash does **not** give an error when it closes the file, and it returns a zero error code (success), while the writes to the file have not completed! So the shell is not what I would rely on here. Adding that `END` clause makes perl itself barf the proper warning, and announce the failure by `die`ing. – P Shved Jun 12 '11 at 11:22
  • @jm666: The point of the `close STDOUT` isn't to close the stream, which would happen anyway when the perl process exits. The point is that **if `close` fails, then the process will exit with a non-zero status code, indicating failure**. That's brought by `use autodie`, or done by `close STDOUT || die` in your other example. – Gilles 'SO- stop being evil' Jun 12 '11 at 12:06
  • @Gilles - YES. That finally makes sense. ;) The whole point is the exit status when the write did not finish successfully. @Pavel - yes, just tried it - bash got exit status 0 from the perl. @ysth Gilles's bold text helped me understand. Thanx all, guys. ;). – clt60 Jun 12 '11 at 12:26
  • But, again, this is one example of something that IMHO should be the default in Perl. Why is it not the default? (I mean: when a script ends, close STDOUT by default, and only in the special cases where it is not advisable tell Perl "don't close".) Is there some reason why this can't be the default? (Or is it again only "backward compatibility"?) – clt60 Jun 12 '11 at 12:40
  • It is the default. All filehandles are closed when a process exits (this has nothing to do with perl actually). What is special is that this construct **gives you the opportunity to do something about an error on close**. It's not special with respect to closing streams. If you care, you should use it. If you don't, you don't need to. – Mat Jun 12 '11 at 12:44
  • @jm666 Printing an error whenever a close on STDOUT fails at termination should certainly *not* be the default (or, at least, it cannot be made the default now--perhaps it could have been decided that way when the language was young). It is very common to close stdout early on in a program; spurious error reports on termination would be extremely misleading. – William Pursell Jun 12 '11 at 12:53
  • Arghh.. again I said it wrong. :( I don't mean closing by default; I mean getting the right exit status when the data is not written. See, in the normal case the above END {close STDOUT} SHOULD BE in EACH script, because it is good to know when the write was not successful. So, if every script needs it, why is it not the default? ;) – clt60 Jun 12 '11 at 12:56
  • it is **not** needed by every script. Many scripts just write random informational messages to stdout and don't care if it's even connected to anything or not. It is only needed **if** the data being written to that file descriptor is **important**, which is certainly not always the case. – Mat Jun 12 '11 at 12:58
  • @William - yes - you're right. An early close (that failed) does not mean the whole script went wrong. Right. THANX for the teaching. @Mat - yes - right. ;) thanx. – clt60 Jun 12 '11 at 12:59
  • The only time I protect the `END{close STDOUT}` is when I have it going to a pipe. Even so, sometimes I have a `$SIG{PIPE} = sub {exit 0}` instead. In general, it is a bug that filter programs do not detect failures to STDOUT. I have on occasion delayed installation of the atexit handler, as in `eval qq{END {close STDOUT }}`, or protected it from `autodie` with `END{eval{ close STDOUT } }`. But when you've got STDOUT going to a pipe, per `open(STDOUT, "|less")`, you had certainly best wait on it, or your parent will exit before the child, screwing up output (see the sketch after these comments). **By default, assume you need it.** – tchrist Jun 12 '11 at 13:06
  • To all who gave their time to teach me: guys, bowing. Really thanx for your time - I'm again a bit wiser. ;) – clt60 Jun 12 '11 at 13:08
  • Furthermore, I have always held it a bug that handles’ implicit close, especially those that have autovivved themselves into existence, do not report errors in the secret, implicit `close`. There’s been talk of adding reportage of the same under `autodie`. 18 of the tools in [my current toolchest](http://training.perl.com/scripts/) use an explicit `close STDOUT`, with some installing it as an `atexit` routine. – tchrist Jun 12 '11 at 13:11
  • @Gilles, `close STDOUT || die` should be `close(STDOUT) or die` if you plan on ever dying. – ikegami Jun 12 '11 at 18:34
  • @ikegami: Nothing wrong with `close STDOUT || die`. It gets `Deparse`'d as `die unless close STDOUT` anyway. – mob Jun 13 '11 at 20:09
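To make tchrist's pipe case concrete, here is a sketch of that pattern (assuming a less pager is available; the loop and the messages are invented for illustration):

use strict;
use warnings;

# Route our output through a pager; the pager runs as a child process.
open(STDOUT, '|-', 'less') or die "can't fork pager: $!";

# On a piped filehandle, close waits for the child and reports its
# status: $! is set if the close itself failed, otherwise $? holds the
# pager's wait status. Waiting here keeps the parent from exiting
# before the pager has consumed all of its input.
END {
    close(STDOUT)
        or die $! ? "error closing pager pipe: $!"
                  : "pager exited with status $?";
}

print "line $_\n" for 1 .. 1000;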

I believe Mat is mistaken.

Both Perl and the system have buffers. close causes Perl's buffers to be flushed to the system. It does not necessarily cause the system's buffers to be written to disk, as Mat claimed. That's what fsync does.

Now, this would happen anyway on exit, but calling close gives you a chance to handle any error it encountered flushing the buffers.

The other thing close does is report earlier errors in attempts by the system to flush its buffers to disk.
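To make the two layers concrete, here is a sketch using IO::Handle, whose flush and sync methods map onto them (the filename is invented, and sync, which calls fsync(2), is not available on every platform):

use strict;
use warnings;
use IO::Handle;

open(my $fh, '>', 'out.txt') or die "open: $!";
print {$fh} "important data\n";    # may still sit in Perl's buffer

$fh->flush or die "flush: $!";     # Perl's buffer -> system buffer (what close does)
$fh->sync  or die "fsync: $!";     # system buffer -> disk (what close does NOT do)
close($fh) or die "close: $!";     # final flush; also reports earlier write errors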

ikegami
  • Yeah - understood. IMHO Mat meant saving data into the kernel's I/O buffers (they will be written to the hdd at the next "sync", by the update daemon, or launchd on OS X). – clt60 Jun 12 '11 at 18:36