I started working with the HDF5 file format in Python a few weeks ago, and the first thing you notice is that there are two main libraries, both great but slightly different: PyTables (which works well with the ViTables visualization tool) and h5py (which works well with HDFView and HDFCompass).

Sadly, these two libraries don't behave well together. I read in several places (like here or here) that the idea is to make them compatible by "placing" one on top of the other.

This was discussed at SciPy 2015, so my question is: does anyone know how this is going? What is the current situation?

iipr
  • This may not be the right place to ask that kind of question. It sounds more like something that you'd find in the `issues` or developer discussions of the respective packages. The closest I've come to looking at the interaction of `h5py` and `tables` is this question: http://stackoverflow.com/questions/41173254/how-should-i-use-h5py-lib-for-storing-time-series-data/41174767#41174767. I was able to read a file written with pandas `to_hdf` using `h5py`. – hpaulj Apr 11 '17 at 16:19
  • Yeah... I hesitated about asking in a GitHub issue, but I didn't find any _appropriate_ one. I just don't know where to look... So thanks for the comment. What you say about pandas `to_hdf` is true, but when you open a dataframe saved like this with `h5py`, you see that it was stored as a group with several datasets inside (this is because of `tables`). This can be checked with code or with HDFView. What's more, I think HDFCompass is not able to open files saved in this way. – iipr Apr 11 '17 at 16:57
  • What do you mean by "_don't behave well together_"? Do you have a workflow that requires multiple packages? I have used all of them to read and/or create HDF5 data, but can't think of a time when I needed 2 packages in the same program. In other words, I either use h5py, OR pytables, OR pandas. I have created files with pytables (or pandas), and read them with h5py. That said, you should be able to use them together (so long as the HDF5 library versions are compatible). – kcw78 Jan 11 '23 at 22:37
  • Regarding the h5py/pytables integration, there was a workshop (in 2018?) where the developers investigated the challenges. The conclusion -- they decided not to proceed. Can't remember where I read that. Maybe on the PyTables forum? – kcw78 Jan 11 '23 at 22:37
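
The point raised in the comments, that a dataframe written through pandas (and therefore PyTables) shows up in `h5py` as a group holding several datasets, can be checked with a short script. This is only a minimal sketch: it assumes `pandas`, `tables` (PyTables) and `h5py` are all installed, and the file name `example.h5`, the key `df` and the helper `describe_pandas_hdf` are hypothetical names picked for illustration.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd


def describe_pandas_hdf(path, key="df"):
    """Write a small DataFrame with pandas, then report what h5py sees under `key`."""
    frame = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"])
    # pandas delegates the actual writing to PyTables.
    frame.to_hdf(path, key=key, mode="w")

    # Reopen the same file with h5py, read-only.
    with h5py.File(path, "r") as handle:
        node = handle[key]
        is_group = isinstance(node, h5py.Group)
        members = sorted(node.keys())
    return is_group, members


# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    is_group, members = describe_pandas_hdf(os.path.join(tmp, "example.h5"))
    print("stored as a group:", is_group)
    print("members:", members)
```

If the behaviour described above holds, `is_group` is true and `members` lists more than one entry, i.e. the dataframe is not stored as a single dataset.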

1 Answer

I don't have any concrete knowledge of the current status, but it seems unlikely that this change will be made. Apparently PyTables has much faster write speeds than h5py, and it would lose that advantage if it were implemented on top of h5py.

in-tension