Say you have:

def my_func():
    fh = open(...)
    try:
        print(fh.read())
    finally:
        fh.close()

My first question is: Is it worth having the try/finally (or with) statement? Isn't the file closed anyway when the function terminates (via garbage collection)?

I came across this after reading a recipe from Martelli's "Python Cookbook" where

all_the_text = open('thefile.txt').read()

comes with the comment: "When you do so, you no longer have a reference to the file object as soon as the reading operation finishes. In practice, Python notices the lack of a reference at once, and immediately closes the file."

My function example is almost the same. You do have a reference, it's just that the reference has a very short life.

My second question is: What does "immediately" in Martelli's statement mean? Even though you don't have a reference at all, doesn't the file closing happen at garbage collection time anyway?

Simeon Visser
Ioan Alexandru Cucu
  • I would not say that your example is similar at all. You are assigning the file to the variable `fh` while `open('thefile.txt').read()` just uses the `read` straight away off the reference when it is created without saving it. I think that's what he means by *immediately* – jamylak Jul 10 '12 at 08:39
  • possible duplicate of [python close file descriptor question](http://stackoverflow.com/questions/4599980/python-close-file-descriptor-question) – jamylak Jul 10 '12 at 09:00

1 Answer

It is good practice to close the file yourself. Using the with statement leads to clean code and it automatically closes the file (which is a Good Thing).
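As a minimal sketch of the with form, the snippet below reads a file and then checks that it was closed on leaving the block. The helper name read_text and the temporary demo file are assumptions made up for this illustration; they are not part of the question.

```python
import os
import tempfile

def read_text(path):
    # The with block closes fh on exit, even if read() raises.
    with open(path) as fh:
        text = fh.read()
    # Outside the block the file object still exists, but it is closed.
    return text, fh.closed

# Throwaway demo file, created only for this illustration.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("hello")

text, was_closed = read_text(demo_path)
os.remove(demo_path)
print(text, was_closed)
```

This does the same job as the try/finally version in the question, with less code to get wrong.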

Even though Python is a high-level programming language, you still need to be in control of what you're doing. As a rule of thumb: if you open a file, it also needs to be closed. There's never a good reason to be sloppy in your code :-)

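Regarding your second question: closing is not guaranteed to happen immediately. Python closes the file when the file object is deallocated. In CPython, reference counting usually deallocates an object as soon as its last reference disappears, but objects caught in reference cycles, and objects in other Python implementations, have to wait until the garbage collector decides it is time to run. Here are some articles on garbage collection in Python (also see the gc module); they are an interesting read.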
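One way to observe when the file really gets closed is to remember the OS-level descriptor, drop the only reference, and then ask the OS whether the descriptor is still valid. This is only a sketch: the helper fd_is_open and the temporary file are made up for the illustration, and the result after del depends on the implementation (reference counting in CPython closes the file right away; other implementations may not).

```python
import os
import platform
import tempfile

def fd_is_open(fd):
    # os.fstat raises OSError when the descriptor is no longer valid.
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False

# Throwaway demo file, created only for this illustration.
fd0, demo_path = tempfile.mkstemp()
os.close(fd0)

fh = open(demo_path)
fd = fh.fileno()
open_before = fd_is_open(fd)  # fh still references the open file
del fh                        # drop the only reference, no explicit close()
open_after = fd_is_open(fd)   # implementation-dependent, see the note above
os.remove(demo_path)

print(platform.python_implementation(), open_before, open_after)
```

Relying on this behaviour ties the code to one implementation, which is one more reason to close the file explicitly.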

Those articles also show that Python's garbage collector uses a threshold based on the number of allocated and deallocated objects before it decides to run. If your program allocates few objects after it is done with the file, Python might hold the file open longer than necessary because the garbage collection code might not have run yet.
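The gc module lets you inspect those thresholds and see what the collector does with an object that reference counting cannot free by itself. In this sketch the class Holder is a hypothetical stand-in for an object that owns a resource; the numbers printed vary between Python versions, so none are shown here.

```python
import gc

class Holder(object):
    """Hypothetical stand-in for an object that owns a resource."""
    def __init__(self):
        # Reference cycle: the refcount never drops to zero by itself.
        self.me = self

print(gc.get_threshold())  # per-generation thresholds that trigger a collection
print(gc.get_count())      # current counters compared against those thresholds

was_enabled = gc.isenabled()
gc.disable()               # keep automatic collections out of the demo
gc.collect()               # start from a clean state
Holder()                   # unreachable at once, yet kept alive by its cycle
found = gc.collect()       # number of unreachable objects the collector found
if was_enabled:
    gc.enable()

print(found)
```

An object like this is released only when the collector runs, so any file it held would stay open until then.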

Simeon Visser
  • Fair enough. What's more intriguing to me is what happens under the hood and what happens once the refcount drops to 0. See my edit (containing the second question regarding Martelli's "immediately" statement). – Ioan Alexandru Cucu Jul 10 '12 at 08:32
  • You should probably note that CPython uses a mixture of reference counting and garbage collection, so in simple cases, the garbage is collected immediately the last reference is destroyed. – Simon Callan Jul 10 '12 at 08:58