So I was doing some research on the File class in Ruby. As I was digging I learned that File was a subclass of IO. To my understanding when you create an IO object (or File object), a buffer is opened to that file that allows you to read and write to that file. I don't completely understand what a buffer is, but apparently it stays open until you call the #close method on the object. To my understanding this buffer is opened whether you call File.new or File.open (please correct me if I'm wrong on any of this).

So say you like to use the File class for paths and stuff like this:

f = File.new('spec/tmp/testfile.md')
File.basename(f)

But you never call f.close. Does leaving this buffer open leak memory? If I called this several hundred times for a tree in a filesystem would I be in deep trouble?

Thanks for your replies!

PS I know you can just use File.basename('spec/tmp/testfile.md') instead, I'm just using this as an example

webdesserts
  • You close files to release file pointers. And to avoid accidentally overwriting data. Not for leaky memory. – vgoff Nov 13 '12 at 02:58
  • http://stackoverflow.com/questions/4795447/rubys-file-open-and-the-need-for-f-close. But close them anyway. – numbers1311407 Nov 13 '12 at 02:59
  • thanks @numbers1311407 It seems there's a lot more I need to learn about I/O. Do y'all know any good reads on the subject or "file pointers" specifically? – webdesserts Nov 13 '12 at 03:09
  • No it doesn't. Ruby will GC it, and it will release the underlying structures. – Linuxios Nov 13 '12 at 03:27
  • It doesn't account for the lag before GC runs, though, which is not guaranteed to happen in a timely manner. And as far as accidentally overwriting data goes, GC has nothing to do with that. – vgoff Nov 13 '12 at 19:06

1 Answer

Yes

Except for the sys* family of operations, Ruby's IO ops ultimately allocate both file descriptors and buffers.

If you don't close the IO object then you are correct ... you most likely leak both the fd and the buffer.
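To make that concrete, here is a minimal sketch. It assumes a throwaway file created with Tempfile (the 'leak_demo' prefix and the variable names are illustrative only, not from the question). File.new hands back an open IO object, and it stays open until something calls #close on it:

```ruby
require 'tempfile'

# Create a scratch file so the example does not depend on an existing path.
scratch = Tempfile.new('leak_demo')
scratch.write("hello\n")
scratch.close

f = File.new(scratch.path)   # opens a file descriptor (read-only by default)
name = File.basename(f)      # works, but the descriptor is still held
still_open = !f.closed?      # nothing has closed the file yet

f.close                      # explicitly release the descriptor
now_closed = f.closed?

scratch.unlink               # remove the scratch file
```

Using the path string directly, as in File.basename('spec/tmp/testfile.md'), avoids opening anything at all.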

Now, if you allocate it in such a way as to overwrite or otherwise end the lifetime of the old reference, then Ruby can g/c the entire object. This will definitely free the buffer, and it will eventually free the FD as well.

In all languages, however, it's considered quite bad practice to rely upon a g/c-triggered finalizer as it's unpredictable how long it will take and how many outstanding OS-level resources will exist at one time. You may exceed some local limit before the g/c machinery even starts up.
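One such local limit is the per-process cap on open file descriptors. As a rough sketch, assuming a Unix-like system where Process.getrlimit is available (it is not implemented on every platform), you can inspect it like this:

```ruby
# Query the soft and hard limits on open file descriptors for this process.
# Assumption: a Unix-like OS; other platforms may raise NotImplementedError.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "soft limit on open descriptors: #{soft_limit}"
puts "hard limit on open descriptors: #{hard_limit}"
```

If unclosed File objects pile up faster than the g/c collects them, opening one more file once the soft limit is reached fails with Errno::EMFILE.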

The general rule is to allocate and free OS resources synchronously.
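In Ruby, the usual way to follow that rule is the block form of File.open, which closes the file when the block exits, even if the block raises. The sketch below assumes a scratch file made with Tempfile; the 'block_demo' prefix and the variable names are made up for illustration:

```ruby
require 'tempfile'

# Scratch file so the example is self-contained.
scratch = Tempfile.new('block_demo')
scratch.write("first line\nsecond line\n")
scratch.close

handle = nil
first_line = File.open(scratch.path) do |f|
  handle = f        # kept only so we can inspect the object afterwards
  f.gets            # the block's value becomes File.open's return value
end

closed_after_block = handle.closed?   # File.open closed it on block exit

# Equivalent explicit form for cases where a block does not fit:
f = File.new(scratch.path)
begin
  line_count = f.each_line.count
ensure
  f.close           # runs even if reading raises
end

scratch.unlink      # remove the scratch file
```

Either way, the descriptor is released at a known point in the program rather than whenever the g/c gets around to it.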


And as long as I'm beating the subject to death, there is an exception. If you are allocating a fixed number of descriptors or something, and they all must exist at once anyway, and the program is going to exit after finishing its work, then it's OK to just leave them. The OS cleans up everything. For example, it's best not to free memory right before exit. The processing needed to manage the heap is completely wasted if the program is about to exit. The OS is just going to put every single page of the program on its free list. And there is an exception to the exception. If it's homework, I would free everything.

DigitalRoss