I'm trying to read the full version of the SVHN dataset (http://ufldl.stanford.edu/housenumbers/). It's stored as HDF5, so I tried to use h5py (since pandas takes a while to read it).
I tried the method described in https://stackoverflow.com/a/41579641/1745291, but on my system (Arch Linux, up to date; h5py 2.8.0; hdf5 1.10.2-3; Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz) it is extremely slow: more than 30 s to read a single filename.
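For concreteness, here is a minimal sketch of the kind of lookup being timed. It assumes the usual `digitStruct.mat` layout, where `digitStruct/name` is an (N, 1) dataset of object references and each reference points to an array of uint16 character codes. `read_filename` and `time_first_filename` are illustrative names, not taken from the linked answer.

```python
import time

import h5py


def read_filename(f, index):
    """Return the image file name stored at position `index`.

    Assumes `digitStruct/name` holds object references to arrays of
    uint16 character codes (an assumption about the file layout).
    """
    ref = f["digitStruct/name"][index, 0]  # object reference to a char array
    chars = f[ref][()]                     # dereference and load the codes
    return "".join(chr(int(c)) for c in chars.flatten())


def time_first_filename(path):
    """Open the file read-only and time a single file name lookup."""
    with h5py.File(path, "r") as f:
        start = time.perf_counter()
        name = read_filename(f, 0)
        elapsed = time.perf_counter() - start
    return name, elapsed
```

Calling `time_first_filename` on the path to `digitStruct.mat` returns the first file name together with the elapsed time in seconds, which separates the cost of the lookup itself from the cost of opening the file.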
Is this a bug in this version? Is this the expected access time? (That would be hard to believe, since the format has a reputation for fast access.) Am I doing something wrong?
Note: I also found this thread, which has no responses: https://groups.google.com/forum/#!topic/h5py/4eHydpsQ1HU