
I'm trying to pluck values out of many HDF5 files and store them in a list.

import h5py
h = [h5py.File('filenum_%s.h5' % (n),'r')['key'][10][10] for n in range(100)]

The resulting list contains the value at grid point (10, 10) of the 'key' array from each of the HDF5 files filenum_0.h5 through filenum_99.h5.

It works, except that it stops around the 50th element with the error:
IOError: unable to open file (File accessibilty: Unable to open file)
even though I know the file exists and it opens fine when I haven't already opened many other files. I think I get the error because too many files are open at once.

Is there a way to close the files within this list comprehension? Or, is there a more effective way to build the list I want?

blaylockbk

2 Answers


Written this way, you have no control over when each file is closed.

You can control that, but not with a one-liner. You need an auxiliary function that returns the data and closes the file. Using a context manager is even better, and h5py files support that (I just checked):

def get_data(n):
    with h5py.File('filenum_%s.h5' % (n),'r') as f:
        return f['key'][10][10]

then

h = [get_data(n) for n in range(100)]

Of course, you could make the get_data function more generic by passing the indices and 'key' as arguments instead of hardcoding them.
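As a rough sketch of what that more generic version might look like (the names get_value and collect_values, and their parameters, are made up here for illustration and are not part of h5py):

```python
import h5py


def get_value(path, key="key", index=(10, 10)):
    """Return one element of dataset `key` from the HDF5 file at `path`.

    `get_value` and its parameters are hypothetical names for this sketch.
    The file is closed when the `with` block exits, even if reading fails.
    """
    with h5py.File(path, "r") as f:
        # A tuple index reads just the requested element, whereas
        # f[key][10][10] reads the whole of row 10 first.
        return f[key][index]


def collect_values(paths, key="key", index=(10, 10)):
    """Read the same element from every file, one open file at a time."""
    return [get_value(path, key, index) for path in paths]
```

With the file names from the question, that would be called as h = collect_values('filenum_%s.h5' % n for n in range(100)).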

Jean-François Fabre

For the sake of argument, you could do everything in one single terrible list comprehension like this:

import h5py
h = [(f['key'][10][10], f.close())[0]
     for f in (h5py.File('filenum_%s.h5' % (n),'r') for n in range(100))]

But I would strongly advise against something like that, and prefer instead an auxiliary function or some other approach.
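One such other approach, if you'd rather not write a per-file helper, is a plain loop with a with block inside it. This is only a sketch: read_points, n_files and pattern are hypothetical names, and the file naming is assumed to match the question.

```python
import h5py


def read_points(n_files, pattern="filenum_%s.h5"):
    """Collect f['key'][10][10] from files 0 .. n_files - 1.

    `read_points`, `n_files` and `pattern` are illustrative names only.
    """
    values = []
    for n in range(n_files):
        # Each file is closed at the end of its own `with` block,
        # so only one file is open at any time.
        with h5py.File(pattern % n, "r") as f:
            values.append(f["key"][10][10])
    return values
```

It is longer than the comprehension, but the point where each file gets closed is explicit.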

jdehesa
    Yes, since building an object just for its side effect isn't recommended. But that's cool enough. The answer pylang linked is also a good lead: https://stackoverflow.com/a/45929510/4531270 – Jean-François Fabre Aug 30 '17 at 19:41