
The problem I am trying to solve is the following: I have a long-running Python process (it can take many hours to finish) that produces up to 80000 HDF5 files. Since one of the bottlenecks is the constant opening and closing of these files, I wrote proof-of-concept code that uses a single HDF5 file containing many tables as output. That certainly helps, but I wonder if there is a quick(er) way to export specified tables (with renaming, if possible) into a separate file?

DejanLekic

1 Answer


Yes, there are at least three ways to copy the contents of a dataset from one HDF5 file to another:

  1. The h5copy command line utility from The HDF Group. You specify the source and destination HDF5 files, along with the source and destination objects. This likely does exactly what you want without a lot of coding (see the example invocation after this list).
    Ref: HDF Group: H5Copy docs
  2. The h5py module has a copy() method that copies groups and/or datasets. You pass it the source object and the destination, which may be in another file (see the sketch below).
  3. The pytables module (imported as tables) has a copy_node() method; a node is a group and/or a dataset. You pass it the source node and the destination parent (see the sketch below).
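
For illustration, here is how each approach might look. All file, dataset, and table names below are placeholders I made up for the example; substitute your own.

With h5copy, a single shell command copies one object and renames it on the way, e.g. `h5copy -i all_tables.h5 -o table_export.h5 -s /table_0001 -d /renamed_table`.

With h5py, a minimal sketch (assuming the tables live at the root of the combined file):

```python
import h5py

# Copy one table out of the combined file into its own file,
# giving it a new name in the destination.
with h5py.File("all_tables.h5", "r") as src, \
     h5py.File("table_export.h5", "w") as dst:
    # Group.copy() accepts a destination group/file object;
    # name= renames the copy in the destination file.
    src.copy("table_0001", dst, name="renamed_table")
```

With pytables, the equivalent sketch (assuming copy_node() is given a newparent group that lives in the second open file, which the PyTables docs allow):

```python
import tables

# Copy the node /table_0001 from the combined file into the root
# of a new file, renaming it via newname.
with tables.open_file("all_tables.h5", mode="r") as src, \
     tables.open_file("table_export.h5", mode="w") as dst:
    src.copy_node("/table_0001", newparent=dst.root, newname="renamed_table")
```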

If you choose to use h5py, there are a couple of relevant posts on SO.

kcw78