52

Is it bad practice to do the following and not explicitly handle a file object and call its close() method?

for line in open('hello.txt'):
    print line

NB - this is for versions of Python that do not yet have the with statement.

I ask as the Python documentation seems to recommend this :-

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

Which seems more verbose than necessary.

Dave M.
  • 521
  • 1
  • 4
  • 3

8 Answers8

85

Close is always necessary when dealing with files, it is not a good idea to leave open file handles all over the place. They will eventually be closed when the file object is garbage collected but you do not know when that will be and in the mean time you will be wasting system resources by holding to file handles you no longer need.

If you are using Python 2.5 and higher the close() can be called for you automatically using the with statement:

from __future__ import with_statement # Only needed in Python 2.5
with open("hello.txt") as f:
    for line in f:
        print line

This is has the same effect as the code you have:

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

The with statement is direct language support for the Resource Acquisition Is Initialization idiom commonly used in C++. It allows the safe use and clean up of all sorts of resources, for example it can be used to always ensure that database connections are closed or locks are always released like below.

mylock = threading.Lock()
with mylock:
    pass # do some thread safe stuff
Tendayi Mawushe
  • 25,562
  • 6
  • 51
  • 57
  • 2
    in small, short-running scripts it may be prudent to trade off clarity of the code against having one open file hanging arodund for a short time. –  Dec 02 '09 at 21:49
  • 14
    The `with` statement is simple enough that it's hard to justify not using it in the interest of clarity. – Robert Rossney Dec 02 '09 at 22:30
  • Sometimes your file open pattern isn't implementable using 'with'. For example, I'm writing a script right now where I open about a dozen files, read from stdin in, send each row to the appropriate file, tracking each outfile's row count, and at 5000, closing it and opening a new one with an incremented suffix. – odigity Jan 24 '13 at 04:33
21

Actually, the file will be closed when it is garbage collected. See this question for more on how that works.

It is still recommended that you use a try/finally block or a with statement though. If there is an exception when using one of the file object's methods, a reference will be stored in the traceback (which is stored as a global variable) until you clear it or another exception occurs.

Thus, it's bad to rely on garbage collection to close your file for you.

Also, if you've written to the file, you can't guarantee that the changes will be saved to the file until it is closed or flushed.

Jason Baker
  • 192,085
  • 135
  • 376
  • 510
  • 2
    +1 rest of the answers are just quoting the docs, you gave a good explanation for using try/finally. – Sushant Dec 03 '09 at 05:14
13

Strange that for all the discussion in this topic of the importance of freeing system resources, nobody has mentioned what seems to me an obviously more significant reason to close a file deterministically: so that it can be opened again.

There are certainly cases where it doesn't matter. If a file object goes out of scope or gets deleted, the underlying file will get closed. (When it gets closed depends on the specific implementation of Python you're using.) That's generally going to be good enough - if you know exactly when the file variable is going to go out of scope, and if you know that you don't care if the file gets closed deterministically.

But why should you even be troubling yourself with that kind of analysis when the with statement exists?

Robert Rossney
  • 94,622
  • 24
  • 146
  • 218
7

It's kind of hinted at all over the place, but to make it the most clear, yes, you need to close that file. In Python 2.5 (using future) and in Python 2.6, you no longer need the wordy version:

from __future__ import with_statement
with open("hello.txt") as f:
    for line in f:
        print line
Douglas Mayle
  • 21,063
  • 9
  • 42
  • 57
6

Upon exit the Python interpreter (or the kernel in the event of a crash) will close the file, but it's still a good practice to close them when you don't need them. For 1 or 2, or 10, files it might not be a problem, but for more it might bring the whole system down.

Most importantly, it is a sign that the person who wrote the code actually cares about his work.

Emil Ivanov
  • 37,300
  • 12
  • 75
  • 90
  • 2
    Actually, you don't even have to wait for the process to stop. It will be closed when garbage collected. See my answer for more details. – Jason Baker Dec 02 '09 at 12:32
6

No, I don't believe the longer idiom is necessary, and here is why:

I grepped /usr/lib/python2.6/ for the pattern 'for\s+.*\s+in\s+open\(' and found many examples of

for line in open('hello.txt'):
    print line

and so far zero instances of

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

See below for the list of files in the standard library that use the for ... in open idiom.

This naturally leads to the question: If the Python developers accept the shorter idiom in the standard libraries, how can we be improving anything by using something different in our own code if our code depends on standard libraries?

I think the answer is, the longer idiom does not improve anything.

I also ran

#!/usr/bin/env python
try:
    for i,line in enumerate(open('a')):
        print line
        raw_input()
        if i==5:
            break
except Exception:
    pass

raw_input()

and checked /proc/PID/fd for when the file descriptor was closed. It appears that when you break out of the for loop, the file is closed for you.

On the basis of these experiments, I do not believe the long try...finally...close idiom is necessary.

Here is the result of the grep:

/usr/lib/python2.6/dist-packages/NvidiaDetector/nvidiadetector.py:89:tempList = [ x.strip() for x in open(obsolete).readlines() ]
/usr/lib/python2.6/dist-packages/rpy_io.py:49:for line in open(file).readlines():
/usr/lib/python2.6/dist-packages/setuptools/command/easy_install.py:1376:for line in open(self.filename,'rt'):
/usr/lib/python2.6/dist-packages/GDebi/DscSrcPackage.py:47:for line in open(file):
/usr/lib/python2.6/dist-packages/aptsources/distinfo.py:220:[x.strip() for x in open(value)])
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeCache.py:989:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeAufs.py:100:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeAufs.py:205:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/distinfo.py:220:[x.strip() for x in open(value)])
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeViewKDE.py:826:for c in open(sys.argv[2]).read():
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeConfigParser.py:45:items = [x.strip() for x in open(p)]
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:684:for line in open(cpuinfo):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:692:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:726:for line in open("/etc/fstab"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:762:for line in open(fstab):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:801:for line in open("/etc/fstab"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:874:for line in open(XORG):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:939:for line in open(os.path.join(modaliasesdir,filename)):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeController.py:1307:for line in open(template):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:23:for raw in open(xorg_source):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:58:for line in open(xorg):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:82:for line in open(xorg):
/usr/lib/python2.6/dist-packages/jockey/oslib.py:377:for line in open(self.apt_jockey_source):
/usr/lib/python2.6/dist-packages/jockey/oslib.py:393:for line in open(f):
/usr/lib/python2.6/dist-packages/jockey/backend.py:651:for line in open(path):
/usr/lib/python2.6/dist-packages/jockey/detection.py:277:for line in open(alias_file):
/usr/lib/python2.6/dist-packages/jockey/detection.py:597:for l in open(os.path.join(path, 'uevent')):
/usr/lib/python2.6/dist-packages/apt/cdrom.py:83:for line in open(fname):
/usr/lib/python2.6/dist-packages/problem_report.py:1119:for line in open('/proc/mounts'):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:128:for line in open(f):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:190:for line in open(sumfile):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:641:for l in open('/etc/apt/sources.list'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:190:for line in open('/proc/asound/cards'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:290:for line in open('/var/log/syslog'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:493:mods = [l.split()[0] for l in open(module_list)]
/usr/lib/python2.6/dist-packages/softwareproperties/SoftwareProperties.py:597:for line in open(f):
/usr/lib/python2.6/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py:883:for x in open(tmp.name):
/usr/lib/python2.6/dist-packages/lsb_release.py:253:for line in open('/etc/lsb-release'):
/usr/lib/python2.6/dist-packages/numpy/distutils/system_info.py:815:for d in open(ld_so_conf,'r').readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:72:for line in open(languagelist_file):
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:187:for line in open(environment).readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:193:for line in open(environment).readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:125:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:140:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:171:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:210:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/macros.py:16:for l in open(file):
/usr/lib/python2.6/dist-packages/LanguageSelector/macros.py:37:for l in open(self.LANGCODE_TO_LOCALE):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:94:for l in open(self.BLACKLIST):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:99:for l in open(self.LANGCODE_TO_LOCALE):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:111:for l in open(self.PACKAGE_DEPENDS):
/usr/lib/python2.6/dist-packages/LanguageSelector/ImSwitch.py:78:for l in open(self.blacklist_file):
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • 2
    This analysis is an interesting, but it's obviously incomplete. First, it doesn't pass the smell test: if there are only 52 places in the Python standard library where a file gets opened, I will cook and eat my shoe. For a trivial example, the `logging` module implements its own `open` and `close` methods, which your RE won't find. Also: it implements a `close` method. There's a reason. – Robert Rossney Dec 02 '09 at 22:28
  • CPython uses refcounting and will close these files deterministically. It makes sense that the library shipped with CPython is able to rely on CPython-specific behavior. –  Dec 03 '09 at 00:16
  • 1
    In the same vein, you didn't see leaked resources in your test because you didn't keep trackback objects hanging around, or other ways to keep references to the now-unneeded-but-open files---and CPython's current refcounting closes them deterministically for you. –  Dec 03 '09 at 00:20
  • 1
    @Robert Rossney: Thanks for your comment. The above is not a list of all places files gets opened. It is only a list of places where my grep found the `for ... in open(...)` idiom. I don't follow what you are saying regarding the logging module. It does not use short idiom, but neither does it use the long idiom. Could you explain some more how logging is relevant? – unutbu Dec 03 '09 at 01:17
  • @Roger Pate: Thanks to you too. Indeed, what I grepped only applies to CPython. And I agree that if you do something unusual with traceback, it might prevent the file descriptor from ever closing. Could you show some code of how this might happen? I think it would help to see code that demonstrates the problem with the short idiom. Is it possible for the user of traceback to `del` the file object manually? Might that be a better solution and having to use the long idiom everywhere? – unutbu Dec 03 '09 at 01:18
  • In the short idiom the file object goes unnamed, so how one might muck with it in a traceback frame is beyond me. This is also why I believe the short idiom is safe. The file object exists only in the context of the for-loop. Once the for-loop ends naturally, or terminates due to a break or exception, there are no more references to the file object so it is free to be garbage collected. – unutbu Dec 03 '09 at 04:00
3

Yes, because otherwise you might leak resources.

From the Python docs:

When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file.

This will happen for you when the program exits, but otherwise Python is keeping around resources it no longer needs up to that point.

Dominic Rodger
  • 97,747
  • 36
  • 197
  • 212
2

You need to close handles so that memory is freed up. Not really needed until dealing with a lot of files at a time.

Sarfraz
  • 377,238
  • 77
  • 533
  • 578
  • It's still generally considered good practice to close the file. I suppose it won't really harm anything to leave one or two files open, but it's not *that* much work to close it. – Jason Baker Dec 02 '09 at 12:33