1

If I read a file with file_data = open(...).read(), the name "file_data" refers to the data returned by "read()", and I am left with no reference to the file descriptor. Is that right? Does this mean that once the file descriptor has 0 references, it will be deleted by the garbage collector? Or does the file descriptor still hold 1 reference to the opened file, so that I need to close the file manually?

UPD:

data = open("foo.txt")
# <- breakpoint here

$  lsof foo.txt
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
python  17249    q    5r   REG    8,3        0 1443322 foo.txt

data = open("foo.txt").read()
# <- breakpoint here

$  lsof foo.txt
-
elektruver
  • Note that the timing of garbage collection is implementation defined. The language specification makes no guarantees about when or if memory is reclaimed. – MisterMiyagi Feb 08 '20 at 21:51
  • Does this answer your question: https://stackoverflow.com/questions/36046167/is-there-a-need-to-close-files-that-have-no-reference-to-them – MisterMiyagi Feb 08 '20 at 22:30

2 Answers

3

If you write that code, you need to hope that the file-like object's __del__ method will close the underlying file, because you won't have a reference to do so yourself. Use a with statement instead:

with open(...) as f:
    file_data = f.read()
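
As a minimal sketch of what the with statement guarantees, the snippet below reads a scratch file and then checks the file object's closed attribute. The temporary file and its "hello" contents are assumptions made only so the example is self-contained; they are not part of the original question.

```python
import os
import tempfile

# Hypothetical scratch file, created only so the sketch can run on its own.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("hello")

with open(path) as f:
    file_data = f.read()

# Leaving the with block calls f.close(), independent of when (or whether)
# the file object itself is later reclaimed.
print(f.closed)
print(file_data)

os.remove(path)
```

The name f is still bound after the block, but the underlying file is already closed, so nothing depends on __del__ or on the garbage collector.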
chepner
  • A context manager is a good choice, but I want to understand how the garbage collector will behave in my example. – elektruver Feb 08 '20 at 20:57
  • 2
    @elektruver: The file's `__del__()` method _probably_ closes the file, but there's no guarantee in Python that it will ever be called. Best to close it yourself or use a `with` context manager. When the script ends, the OS will close the file if it's still open. – martineau Feb 08 '20 at 21:10
  • 1
    The garbage collector's behaviour is implementation-specific, and generally unpredictable. You should not rely on it doing anything in particular, except (eventually) collecting any garbage if it needs to. – kaya3 Feb 08 '20 at 21:42
-1

I've tried your example and this is what I get:

import gc

gc.disable()
file_data = open('somefile').read()
print(gc.collect())  # The number of unreachable objects found is returned.
# Output: 0

I'm not an advanced programmer, but to me it seems that no file descriptor is considered garbage here.

Let's create an example where we know for sure there will be garbage to collect:

import gc

class A:
    def __init__(self):
        # A holds a reference to B ...
        self.b = B(self)

class B:
    def __init__(self, a):
        # ... and B holds a reference back to A, forming a cycle.
        self.a = a

gc.disable()
my_var = A()
my_var = None
print(gc.collect())  # The number of unreachable objects found is returned.
# Output: 4

It seems that gc.collect() does return the number of objects to "clean", which would confirm that my reasoning was OK. But as I said, I'm not an experienced programmer, and I may be missing or misunderstanding something.

Artur
  • 1
    Explicitly calling `gc.collect()` like that may be affecting the results reported, so it proves nothing IMO – martineau Feb 08 '20 at 21:12
  • 2
    *"it seems that there is no file descriptor considered as garbage."* That's not necessarily true; since there were no cyclic references it may have been collected automatically when the refcount was decreased, without the GC doing it. – kaya3 Feb 08 '20 at 21:45
  • The question may be specifically about file objects (it's unclear in that respect), so I'm not sure that discussing garbage collection in general applies. – martineau Feb 08 '20 at 21:47
  • 1
    The CPython `gc` module is for *cyclic* garbage. Non-cyclic references are purely handled via reference counting and are practically by definition never garbage. A situation as in the question is not handled by `gc`. – MisterMiyagi Feb 09 '20 at 07:54
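
The point made in these comments can be sketched as follows. The snippet assumes CPython, where reference counting reclaims a non-cyclic object as soon as its last reference goes away; other implementations (PyPy, for example) may behave differently. The helper read_and_watch and the scratch file are hypothetical names introduced only for illustration.

```python
import gc
import os
import tempfile
import weakref

# Hypothetical scratch file, created only so the sketch can run on its own.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("hello")


def read_and_watch(path):
    """Read the file without closing it; return the data and a weak reference."""
    f = open(path)
    ref = weakref.ref(f)  # a weak reference does not keep the file object alive
    return f.read(), ref  # the only strong reference (f) dies when we return


gc.disable()  # rule out the cyclic collector, as in the answer above
data, ref = read_and_watch(path)

# On CPython the file object is already gone here: reference counting
# reclaimed it, so the cyclic collector never sees it as garbage.
file_object_gone = ref() is None
print(file_object_gone)

gc.enable()
os.remove(path)
```

If this prints True, the file object was deallocated without gc.collect() ever running, which would explain why gc.collect() reports no unreachable objects for it. This remains an implementation detail of CPython rather than a language guarantee, so closing the file explicitly or using a with statement stays the safer choice.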