The problem I am trying to solve is the following: I have a long-running Python process (it can take many hours to finish) that produces up to 80,000 HDF5 files. Since one of the bottlenecks is the constant opening and closing of these files, I wrote proof-of-concept code that instead uses a single HDF5 file containing many tables as output. It certainly helps, but I wonder if there is a quick(er) way to export specified tables (with renaming, if possible) into a separate file?
1 Answer
Yes, there are at least 3 ways to copy the contents of a dataset from one HDF5 file to another. Minimal sketches of each approach follow below. The options are:

- **h5copy**: a command-line utility from The HDF Group. You specify source and destination HDF5 files, along with source and destination objects. Likely this does exactly what you want without a lot of coding. Ref: HDF Group h5copy docs.
- **h5py**: the module has a `copy()` method for groups and/or datasets. You input source and destination objects.
- **pytables** (aka `tables`): the module has a `copy_node()` method. A node is a group and/or a dataset. You input source and destination objects.
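For example, a single h5copy invocation that exports one table under a new name could look like this (the filenames and object paths here are hypothetical placeholders): `h5copy -i all_tables.h5 -o exported.h5 -s /table_0001 -d /renamed_table`. The rename comes for free by giving `-d` a different object name than `-s`.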
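If you prefer pytables, a minimal sketch of the same idea (again, the file and node names are hypothetical placeholders) might look like:

```python
import tables

# Open the consolidated source file read-only and a new destination file.
with tables.open_file("all_tables.h5", mode="r") as src, \
     tables.open_file("exported.h5", mode="w") as dst:
    # File.copy_node() can copy across files: newparent= is the destination
    # group and newname= stores the node under a new name.
    src.copy_node("/table_0001", newparent=dst.root, newname="renamed_table")
```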
If you choose to use h5py, there are also a couple of relevant posts on SO.
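A minimal h5py sketch of the export-with-rename idea (the file and dataset names below are hypothetical placeholders):

```python
import h5py

# Open the consolidated source file read-only and a new destination file.
with h5py.File("all_tables.h5", "r") as src, h5py.File("exported.h5", "w") as dst:
    # Group.copy() copies a dataset (or an entire group) between files;
    # the name= argument stores it under a new name in the destination.
    src.copy("table_0001", dst, name="renamed_table")
    src.copy("table_0002", dst)  # copied under its original name
```

Note that `copy()` accepts either a path string or an h5py object as the source.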

kcw78