In the docstring of numpy.load() I found the following warning:

For .npz files, the returned instance of NpzFile class must be closed to avoid leaking file descriptors.

I noticed that the returned NpzFile object has both __enter__() and __exit__() methods.

Would it take care of closing the file automatically if I use it like this:

>>> with numpy.load('my_file.npz') as data:
...     A = data['A']

?

abukaj

2 Answers

Yes. Using a with statement will close the file-like object. Here's an example, directly from the documentation:

with load('foo.npz') as data:
    a = data['a']
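
For comparison, here is a minimal sketch of what the with form saves you from writing by hand. The scratch path and the array contents are made up for the illustration; only numpy.savez(), numpy.load() and NpzFile.close() are real API.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "foo.npz")
np.savez(path, a=np.arange(3))

# Context-manager form: the archive is closed when the block exits.
with np.load(path) as data:
    a = data["a"]

# Roughly equivalent explicit form, with close() called by hand.
data2 = np.load(path)
try:
    a2 = data2["a"]
finally:
    data2.close()

print(a, a2)
```

Both forms leave a and a2 usable afterwards, because the arrays were read into memory before the archive was closed.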
Zach Gates
  • Shame on me, I missed it. Thanks! :-) – abukaj Aug 24 '17 at 17:57
  • If I use this, I still see the `data` object after the context. I can use its methods (like `.keys()`) and get the correct output, but if I want to access the actual array with `data[a]` it prints some internal numpy error. Is it intended behavior to be able to access the object like that? – clemisch Jan 24 '18 at 13:39
  • @clemisch See accepted answer to: https://stackoverflow.com/questions/6432355/variable-defined-with-with-statement-available-outside-of-with-block – abukaj Feb 28 '18 at 11:33
Short Answer

Yes, it would close the file object automatically after the context ends since an NpzFile object has both __enter__() and __exit__() methods (see here).


Long Answer

After a with expression as var block ends, the name var still refers to the same object outside the block. However, in the case of numpy.load(), the underlying archive has been closed by then, so the data stored in it can no longer be read through var. Consider the following example:

import numpy

# Creating a dictionary of data to be saved using numpy.savez
data_dict = {
    'some_string': 'StackOverflow',
    'some_integer': 10000,
    'some_array': numpy.array([0, 1, 2, 3, 4]),
}

# Saving the data
numpy.savez(file='./data_dict.npz', **data_dict)

# Loading the 'data_dict' using context manager
with numpy.load('data_dict.npz') as dt:
    string_ = dt['some_string']
    integer_ = dt['some_integer']
    array_ = dt['some_array']
# OR
with numpy.load('data_dict.npz') as dt:
    dt_ = dict(dt) # if you want the entire dictionary to be loaded as is

If you now evaluate dt outside the context manager, you simply get the NpzFile object back, shown with its memory address:

>>> dt
Out[]: <numpy.lib.npyio.NpzFile at 0x7ffba63bb7c0>

However, as should be expected, you can no longer read any of the values stored in it. For instance, you get an AttributeError when you do:

>>> dt['some_string']
Out[]: Traceback (most recent call last):
.
.
File ".../site-packages/numpy/lib/npyio.py", line 249, in __getitem__
    bytes = self.zip.open(key)
AttributeError: 'NoneType' object has no attribute 'open'

This is because, once the with block ends, the NpzFile object's self.zip attribute is set to None (see def close(self): in the first URL above, which is called from the dunder method __exit__()).
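
The explanation above can be checked with a short sketch. The scratch path is hypothetical, and the script records whatever exception is raised rather than assuming its type, since that could differ between NumPy versions.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch archive, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data_dict.npz")
np.savez(path, some_array=np.array([0, 1, 2, 3, 4]))

with np.load(path) as dt:
    open_inside = dt.zip is not None  # archive handle is live here

closed_after = dt.zip is None  # close() has reset the handle

# Reading an entry now fails; the exception type is recorded, not assumed.
try:
    dt["some_array"]
    error_name = None
except Exception as exc:
    error_name = type(exc).__name__

print(open_inside, closed_after, error_name)
```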

NOTE 1: dt.keys() returns (as expected) a KeysView object, and list(dt.keys()) gives you a list of the key names of dt: ['some_string', 'some_integer', 'some_array']. However, outside the scope of the context manager you still cannot access the values stored under these keys; they have to be read inside it.
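
A minimal sketch of the practical consequence, assuming a made-up scratch file with two arrays: copy whatever you need into ordinary objects inside the with block, and only rely on the key names afterwards.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch archive, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data_dict.npz")
np.savez(path, some_array=np.array([0, 1, 2, 3, 4]), other=np.ones(2))

with np.load(path) as dt:
    # dict(dt) reads every entry while the archive is still open.
    loaded = dict(dt)

# The key names are still available after the block...
names = list(dt.keys())

# ...and the copied arrays stay usable, unlike dt['some_array'].
total = int(loaded["some_array"].sum())
print(names, total)
```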

NOTE 2: I have deliberately used a dictionary containing non-numpy-array values just to show that it is possible to store such dictionaries using numpy.savez(). However, this is not a recommended method of storing such data.
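
To illustrate NOTE 2, here is a small sketch (the scratch path is hypothetical) showing that such non-array values come back as zero-dimensional arrays, so you need .item() to recover the plain Python objects.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch archive, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "scalars.npz")
np.savez(path, some_string="StackOverflow", some_integer=10000)

with np.load(path) as dt:
    raw_string = dt["some_string"]            # 0-d array, not a str
    string_ = dt["some_string"].item()        # plain Python str
    integer_ = dt["some_integer"].item()      # plain Python int

print(raw_string.shape, type(string_).__name__, integer_)
```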

TAH