4

when I use this code in IPython3 shell

 >>>data = open('file').read()

and then check open file descriptors:

 lsof | grep file

I find empty list

and when I use this:

>>>open('file')

lsof shows two items. The question is why first operation closes fd while second doesn't? I supposed that garbage collector must delete file object with no refs.

I know about '_' var in interpreter, when I reassign value

>>>111
>>>_
111

but descriptors remain open. when I repeat

>>>open('file')

n times there are 2 * n opened descriptors

adray
  • 1,408
  • 16
  • 20
  • Which Python shell are you using? In the default Python shell, in Python 2.7.3, the file descriptor is released as soon as the second expression, the one that reassigns `_`, is input. – user4815162342 Nov 09 '12 at 20:56
  • No problem; however, I can't repeat it with Python 3.3.0, either. I execute `open('somefile')`, in the other shell `lsof` shows the file to be open. Then I execute `1+1`, and in the shell `lsof` shows the file not to be open. – user4815162342 Nov 09 '12 at 21:04

3 Answers3

4

In the second example the file handle is retained by the interactive interpreter variable _, which allows you to access the last evaluated expression. If you evaluate another expression, such as 1+1, you will note that the file is no longer reported by lsof as open.

As pointed out by Mike Byers, this behavior is specific to CPython, and even then to precise circumstances of how the file object is used. To make sure the file is closed regardless of how the code is executed, use a with statement:

with open('file') as fp:
    data = fp.read()
user4815162342
  • 141,790
  • 18
  • 296
  • 355
2

The default Python implementation uses both garbage collection and reference counting. In your first example the file object's reference count drops to zero and so it is immediately closed even before the garbage collector runs.

The second version is equivalent to this:

_ = open('file')

Since the file is still referenced by _ it is still live until you run another command.

Note that this behaviour is specific to CPython. Other implementations such as IronPython may not be so fast to close the file so you should really close your files when you are finished using them. A nice way of doing this is by using a with statement.

with open('file') as f:
    data = f.read()

Related

Community
  • 1
  • 1
Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
  • Something to add, to point out here, is that in the first instance, the file descriptor object is not returned, the contents of the file is. Since python's interpreter allows you to get the results of the last command via `>>> _` the second instance still has a reference to the file descriptor available. – mklauber Nov 09 '12 at 20:41
2

This is because the interactive interpreter that you are using keeps an implicit reference to the last object returned. That reference is named _.

Python2> open("/etc/hosts") 
<open file '/etc/hosts', mode 'r' at 0xf2c390> 
Python2> _ 
<open file '/etc/hosts', mode 'r' at 0xf2c390>

So it is still "alive" when you look at it. Do something else:

Python2> max(0,1)
1

And the file is now closed as it is no longer referenced.

But this is a good example of why you should explicitly close files that you really want closed.

Keith
  • 42,110
  • 11
  • 57
  • 76