In spite of finding several resources around this task, I am unable to extract the data from an hdf5 file (for a Datacamp course I am working through).

I would like to work the examples on my local system.

This is what I have:

import h5py
import numpy as np
import pandas as pd

filename = "e:\\python\\datacamp\\time_series_with_python\\machine_learning_for_time_series_data_in_python\\audio_munged.hdf5"
'''
data = pd.read_hdf(filename, key)
data = pd.read_hdf(filename, 'h5io')
print(data)
# Reading the file
'''

f = h5py.File(filename, 'r')

# Studying the structure of the file by printing what HDF5 groups are present

for key in f.keys():
    print(key)  # Names of the groups in HDF5 file.

    # Extracting the data

    # Get the HDF5 group
    group = f[key]

    # Check out what keys are inside that group.
    for gkey in group.keys():
        print(gkey)
    # key_data = group['key_data'].value   <-- This generates an error

I get this:

h5io
key_data
key_meta
key_sfreq

Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 27, in <module>
    key_data = group['key_data'].value
AttributeError: 'Group' object has no attribute 'value'

Question #1: How do I extract the values from 'key_data', 'key_meta' and 'key_sfreq' into separate numpy arrays?

Question #2: I tried simply pulling the hdf5 file into pandas (code above), but I do not know what the correct value of the 'key' parameter should be. I tried h5io but that is not correct. Then I tried the other three (key_ ) and all of those fail as well.

MarkS

1 Answer

I found what I was looking for here:

https://stackoverflow.com/questions/34330283/how-to-differentiate-between-hdf5-datasets-and-groups-with-h5py

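I wasn't going deep enough. As the traceback shows, 'key_data' is itself a Group, not a Dataset, so the actual values sit at least one level further down. The code posted by user "Yas" showed me how to tell groups and datasets apart and get to the values.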
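A minimal sketch of that idea, not the exact code from the linked answer: walk the whole file with h5py's `visititems`, skip the groups, and read each dataset into a numpy array. The helper name `collect_datasets` is my own, and I am assuming nothing about what sits below 'key_data', 'key_meta' and 'key_sfreq' beyond the fact that the values are stored in datasets somewhere underneath.

```python
import h5py
import numpy as np


def collect_datasets(path):
    """Return {full_name: numpy array} for every dataset in an HDF5 file.

    collect_datasets is a hypothetical helper name, not part of h5py.
    """
    arrays = {}

    def visit(name, obj):
        # Groups are only containers; Dataset objects hold the values.
        if isinstance(obj, h5py.Dataset):
            # Indexing with [()] reads the entire dataset into memory.
            arrays[name] = np.asarray(obj[()])

    with h5py.File(path, "r") as f:
        # visititems calls visit(name, obj) for every group and dataset,
        # recursing through all nesting levels.
        f.visititems(visit)
    return arrays
```

Calling `collect_datasets(filename)` and printing each name with its `shape` and `dtype` shows how deep the data actually sits; the arrays can then be picked out of the returned dict by their full path.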

MarkS