Questions tagged [h5py]

h5py is a NumPy-compatible Python module for handling Hierarchical Data Format (HDF5) files.

Main features

  • Free (BSD licensed)
  • Limited dependencies (Python, NumPy, HDF5 libraries)
  • Includes both a low-level, C-like HDF5 interface and a high-level Python/NumPy-style interface
  • Direct interaction with datasets using NumPy metaphors, such as slicing
  • Datatypes specified using standard NumPy dtype objects
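
The features listed above can be illustrated with a minimal sketch. The file name `demo.h5` and dataset name `grid` are placeholders chosen for this example, and the in-memory `core` driver is used only so that nothing is written to disk.

```python
import numpy as np
import h5py

# In-memory file: driver="core" with backing_store=False writes nothing to disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    # The datatype is given as a standard NumPy dtype.
    dset = f.create_dataset("grid", shape=(4, 5), dtype=np.float32)
    dset[...] = np.arange(20, dtype=np.float32).reshape(4, 5)

    # NumPy-style slicing on the dataset returns ordinary NumPy arrays.
    row = dset[1, :]
    block = dset[:2, 1:3]

    dset_dtype = dset.dtype
    row_list = row.tolist()
    block_shape = block.shape

    # The low-level, C-like interface is reachable through the .id attribute.
    low_level_id = dset.id
```

Slicing a dataset reads the selected region into memory, so the results remain usable after the file is closed.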

1301 questions
120 votes, 2 answers

Input and output numpy arrays to h5py

I have a Python code whose output is a sized matrix, whose entries are all of the type float. If I save it with the extension .dat the file size is of the order of 500 MB. I read that using h5py reduces the file size considerably. So, let's say I…
lovespeed
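
A minimal sketch of writing a float matrix to an HDF5 file and reading it back. The matrix here is random stand-in data, and the names `matrix.h5` and `matrix` are assumptions made for the example. The `compression="gzip"` argument is optional, and how much space it saves depends on the data.

```python
import os
import tempfile

import numpy as np
import h5py

# Stand-in data; the real matrix would come from the user's own code.
matrix = np.random.default_rng(0).random((200, 300))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.h5")

    # Write: the array is stored in binary form, here with optional gzip compression.
    with h5py.File(path, "w") as f:
        f.create_dataset("matrix", data=matrix, compression="gzip")

    # Read: slicing with [:] loads the whole dataset into a NumPy array.
    with h5py.File(path, "r") as f:
        loaded = f["matrix"][:]
```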
114 votes, 1 answer

Is there an analysis speed or memory usage advantage to using HDF5 for large array storage (instead of flat binary files)?

I am processing large 3D arrays, which I often need to slice in various ways to do a variety of data analysis. A typical "cube" can be ~100GB (and will likely get larger in the future). It seems that the typical recommended file format for large…
Caleb
67 votes, 2 answers

How to append data to one specific dataset in a hdf5 file with h5py

I am looking for a possibility to append data to an existing dataset inside a .h5 file using Python (h5py). A short intro to my project: I try to train a CNN using medical image data. Because of the huge amount of data and heavy memory usage during…
Midas.Inc
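
One common way to append to a dataset is to create it as resizable (`maxshape` with `None` along the growing axis) and call `resize` before each write. The sketch below assumes batches of 8×8 arrays; the dataset name `images`, the batch size, and the chunk shape are placeholders, not details from the question.

```python
import numpy as np
import h5py

with h5py.File("append_demo.h5", "w", driver="core", backing_store=False) as f:
    # Start empty; None in maxshape makes the first axis unlimited.
    dset = f.create_dataset(
        "images",
        shape=(0, 8, 8),
        maxshape=(None, 8, 8),
        dtype="f4",
        chunks=(16, 8, 8),
    )

    for batch_index in range(3):
        # Stand-in batch of 10 images filled with the batch index.
        batch = np.full((10, 8, 8), batch_index, dtype="f4")
        old_len = dset.shape[0]
        dset.resize(old_len + batch.shape[0], axis=0)  # grow along axis 0
        dset[old_len:] = batch                         # write into the new rows

    final_shape = dset.shape
    last_value = float(dset[-1, 0, 0])
```

Only the current batch needs to be in memory at any time, since each write touches just the newly added rows.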
50 votes, 5 answers

Installing h5py on an Ubuntu server

I was installing h5py on an Ubuntu server. However it seems to return an error that h5py.h is not found. It gives the same error message when I install it using pip or the setup.py file. What am I missing here? I have Numpy version 1.8.1, which…
Devil
50 votes, 3 answers

How to overwrite array inside h5 file using h5py

I'm trying to overwrite a numpy array that's a small part of a pretty complicated h5 file. I'm extracting an array, changing some values, then want to re-insert the array into the h5 file. I have no problem extracting the array that's nested. f1…
user3508433
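
A sketch of two ways to overwrite a nested array, assuming a hypothetical path `group/sub/values`. If the new array has the same shape and dtype, it can be assigned in place. Otherwise the dataset can be deleted and recreated under the same name.

```python
import numpy as np
import h5py

with h5py.File("overwrite_demo.h5", "w", driver="core", backing_store=False) as f:
    # Intermediate groups in the path are created automatically.
    f.create_dataset("group/sub/values", data=np.zeros(5))

    # Case 1: same shape and dtype, so write back in place.
    dset = f["group/sub/values"]
    arr = dset[:]        # extract a copy as a NumPy array
    arr[2] = 42.0        # change some values
    dset[...] = arr      # re-insert into the existing dataset
    after_inplace = dset[:].tolist()

    # Case 2: different shape or dtype, so unlink the old dataset and recreate it.
    del f["group/sub/values"]
    f.create_dataset("group/sub/values", data=np.arange(3))
    after_replace = f["group/sub/values"][:].tolist()
```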
46 votes, 2 answers

Experience with using h5py to do analytical work on big data in Python?

I do a lot of statistical work and use Python as my main language. Some of the data sets I work with though can take 20GB of memory, which makes operating on them using in-memory functions in numpy, scipy, and PyIMSL nearly impossible. The…
Josh Hemann
45 votes, 6 answers

Error opening file in H5PY (File signature not found)

I've been using the following bit of code to open some HDF5 files, produced in MATLAB, in python using H5PY: import h5py as h5 data='dataset.mat' f=h5.File(data, 'r') However I'm getting the following error: OSError: Unable to open file (File…
Anisha Singh
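
The excerpt is truncated, so the actual cause is not visible here. One possibility worth ruling out is that the file is not HDF5 at all: MATLAB writes HDF5-based `.mat` files only when saving with the `-v7.3` option, and older `.mat` formats are usually read with `scipy.io.loadmat` instead. The sketch below uses `h5py.is_hdf5` to check a file before opening it; both files are placeholders created for the example.

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmp:
    # A genuine HDF5 file for comparison.
    hdf5_path = os.path.join(tmp, "real.h5")
    with h5py.File(hdf5_path, "w") as f:
        f.create_dataset("x", data=[1, 2, 3])

    # A placeholder file that merely has a .mat extension; it is not HDF5.
    other_path = os.path.join(tmp, "dataset.mat")
    with open(other_path, "wb") as fh:
        fh.write(b"placeholder bytes, not an HDF5 file")

    # is_hdf5 looks for the HDF5 file signature without raising an error.
    real_is_hdf5 = h5py.is_hdf5(hdf5_path)
    other_is_hdf5 = h5py.is_hdf5(other_path)
```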
38 votes, 6 answers

How to list all datasets in h5py file?

I have a h5py file storing numpy arrays, but I got Object doesn't exist error when trying to open it with the dataset name I remember, so is there a way I can list what datasets the file has? with h5py.File('result.h5','r') as hf: #How…
matchifang
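
A sketch of listing what a file contains: `keys()` gives the top-level names, and `visititems` walks the whole hierarchy. The dataset names and the `collect` helper are made up for this example.

```python
import h5py

dataset_names = []

def collect(name, obj):
    # Hypothetical helper: record the full path of every dataset visited.
    if isinstance(obj, h5py.Dataset):
        dataset_names.append(name)

with h5py.File("list_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("a", data=[1, 2])
    f.create_dataset("grp/b", data=[3])

    top_level = sorted(f.keys())  # top-level groups and datasets only
    f.visititems(collect)         # recursive walk over all groups and datasets
```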
35 votes, 6 answers

Read HDF5 file into numpy array

I have the following code to read a hdf5 file as a numpy array: hf = h5py.File('path/to/file', 'r') n1 = hf.get('dataset_name') n2 = np.array(n1) and when I print n2 I get this: Out[15]: array([[, , …
e9e9s
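
A minimal sketch of reading a dataset into a NumPy array by indexing it, rather than wrapping the dataset object. The file is created in memory here only so the example is self-contained; `dataset_name` mirrors the placeholder used in the question.

```python
import numpy as np
import h5py

with h5py.File("read_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("dataset_name", data=np.arange(6).reshape(2, 3))

    n1 = f["dataset_name"]  # an h5py.Dataset handle, valid only while the file is open
    n2 = n1[()]             # read everything into a NumPy array (n1[:] also works here)

# n2 is an independent in-memory copy and stays usable after the file is closed.
is_ndarray = isinstance(n2, np.ndarray)
n2_list = n2.tolist()
```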
35 votes, 6 answers

How to store dictionary in HDF5 dataset

I have a dictionary, where key is datetime object and value is tuple of integers: >>> d.items()[0] (datetime.datetime(2012, 4, 5, 23, 30), (14, 1014, 6, 3, 0)) I want to store it in HDF5 dataset, but if I try to just dump the dictionary h5py raises…
theta
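
HDF5 has no native dictionary type, so one possible approach is to split the dictionary into two aligned datasets: the keys as ISO 8601 strings and the values as a 2D integer array. This is a sketch of that idea, not the only layout. The second dictionary entry and the dataset names `timestamps` and `values` are invented for the example.

```python
from datetime import datetime

import numpy as np
import h5py

d = {
    datetime(2012, 4, 5, 23, 30): (14, 1014, 6, 3, 0),
    datetime(2012, 4, 6, 0, 0): (15, 1013, 7, 2, 1),  # hypothetical second entry
}

keys = sorted(d)

with h5py.File("dict_demo.h5", "w", driver="core", backing_store=False) as f:
    # Keys: fixed-length 19-byte strings such as b"2012-04-05T23:30:00".
    f.create_dataset(
        "timestamps",
        data=np.array([k.isoformat() for k in keys], dtype="S19"),
    )
    # Values: one row per key, in the same order.
    f.create_dataset("values", data=np.array([d[k] for k in keys], dtype="i8"))

    # Rebuild the dictionary from the two aligned datasets.
    restored = {
        datetime.fromisoformat(t.decode("ascii")): tuple(int(x) for x in row)
        for t, row in zip(f["timestamps"][:], f["values"][:])
    }
```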
34 votes, 3 answers

How to install h5py (needed for Keras) on MacOS with M1?

I have an M1 MacBook. I have installed python 3.9.1 using pyenv, and have pip3 version 21.0.1. I have installed homebrew and hdf5 1.12.0_1 via brew install hdf5. When I type pip3 install h5py I get the error: Requirement already satisfied:…
Racing Tadpole
34 votes, 2 answers

Incremental writes to hdf5 with h5py

I have got a question about how best to write to hdf5 files with python / h5py. I have data like: ----------------------------------------- | timepoint | voltage1 | voltage2 | ... ----------------------------------------- | 178 | 10 | 12…
user116293
31 votes, 6 answers

Combining hdf5 files

I have a number of hdf5 files, each of which has a single dataset. The datasets are too large to hold in RAM. I would like to combine these files into a single file containing all datasets separately (i.e. not to concatenate the datasets into a…
Bitwise
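
A sketch of collecting datasets from several files into one file with `Group.copy`, keeping each dataset separate. The copy is carried out by the HDF5 library, so the arrays are not loaded into NumPy first. The file names, the source dataset name `data`, and the `data_0`, `data_1`, … naming scheme are assumptions for this example.

```python
import os
import tempfile

import numpy as np
import h5py

with tempfile.TemporaryDirectory() as tmp:
    # Create three small stand-in source files, each holding one dataset.
    sources = []
    for i in range(3):
        path = os.path.join(tmp, f"part_{i}.h5")
        with h5py.File(path, "w") as f:
            f.create_dataset("data", data=np.full(4, i))
        sources.append(path)

    combined = os.path.join(tmp, "combined.h5")
    with h5py.File(combined, "w") as dst:
        for i, path in enumerate(sources):
            with h5py.File(path, "r") as src:
                # Copy the dataset into the destination file under a new name.
                src.copy("data", dst, name=f"data_{i}")

    with h5py.File(combined, "r") as f:
        combined_names = sorted(f.keys())
        second = f["data_1"][:].tolist()
```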
31 votes, 3 answers

Check if node exists in h5py

I am wondering if there is a simple way to check if a node exists within an HDF5 file using h5py. I couldn't find anything in the docs, so right now I'm using exceptions, which is ugly. # check if node exists # first assume it exists e = True try: …
troy.unrau
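
A sketch of checking for a node without exceptions: h5py files and groups support the `in` operator with a name or path. The names used below are placeholders.

```python
import h5py

with h5py.File("exists_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("grp/data", data=[1])

    # Membership tests work for groups and datasets, including nested paths.
    has_group = "grp" in f
    has_data = "grp/data" in f
    has_missing = "grp/missing" in f
```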
30 votes, 3 answers

Storing a list of strings to a HDF5 Dataset from Python

I am trying to store a variable-length list of strings to a HDF5 Dataset. The code for this is import h5py h5File=h5py.File('xxx.h5','w') strList=['asas','asas','asas'] h5File.create_dataset('xxx',(len(strList),1),'S10',strList) h5File.flush()…
gman
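
A sketch of one way to store a list of strings, using a variable-length string dtype instead of the fixed-width `'S10'`. It assumes a reasonably recent h5py, since `h5py.string_dtype` was added in version 2.10. Depending on the h5py version, reading returns `bytes` or `str`, so the example decodes defensively.

```python
import h5py

str_list = ["asas", "asas", "asas"]

with h5py.File("xxx.h5", "w", driver="core", backing_store=False) as f:
    # Variable-length UTF-8 strings; no fixed width needs to be chosen.
    vlen_str = h5py.string_dtype(encoding="utf-8")
    dset = f.create_dataset("xxx", data=str_list, dtype=vlen_str)
    raw = dset[:]

# Elements may come back as bytes or str depending on the h5py version.
decoded = [s.decode("utf-8") if isinstance(s, bytes) else s for s in raw]
```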