
I want to read the Street View House Numbers (SVHN) dataset using h5py:

In [117]: def printname(name):
     ...:     print(name)
     ...:

In [118]: data['/digitStruct'].visit(printname)
bbox
name

There are two groups in the data, bbox and name. name holds the file-name data, and bbox holds the width, height, top, left and label data.

How can I access all the data in the name and bbox groups?

I have tried the following code from the docs, but it just returns HDF5 object references.

In [119]: for i in data['/digitStruct/name']:
     ...:     print(i[0])
     ...:
     ...:
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>

Python version: 3.5 and OS: Windows 10.

GoingMyWay

2 Answers


I'll answer my own question. After reading the h5py docs, here is my code:

def get_box_data(index, hdf5_data):
    """
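    Get the `height, label, left, top, width` of each digit in one picture.
    :param index: index of the picture in /digitStruct/bbox
    :param hdf5_data: the opened h5py.File for digitStruct.mat
    :return: dict mapping each field name to a list with one value per digit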
    """
    meta_data = dict()
    meta_data['height'] = []
    meta_data['label'] = []
    meta_data['left'] = []
    meta_data['top'] = []
    meta_data['width'] = []

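    def print_attrs(name, obj):
        # Called by visititems() once per dataset in the bbox group
        # (height, label, left, top, width).
        vals = []
        if obj.shape[0] == 1:
            # One digit: the value is stored directly in the dataset.
            vals.append(int(obj[0][0]))
        else:
            # Several digits: each row holds a reference to a 1x1 dataset.
            for k in range(obj.shape[0]):
                vals.append(int(hdf5_data[obj[k][0]][0][0]))
        meta_data[name] = vals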

    box = hdf5_data['/digitStruct/bbox'][index]
    hdf5_data[box[0]].visititems(print_attrs)
    return meta_data

def get_name(index, hdf5_data):
    name = hdf5_data['/digitStruct/name']
    # Dataset.value was removed in h5py 3.0; [()] reads the whole dataset.
    return ''.join([chr(v[0]) for v in hdf5_data[name[index][0]][()]])

Here hdf5_data is the opened file, e.g. train_data = h5py.File('./train/digitStruct.mat', 'r'). It works fine!
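
The `<HDF5 object reference>` values from the question are resolved by indexing the file with the reference, which is what `hdf5_data[name[index][0]]` does above. Here is a minimal sketch on a tiny stand-in file, assuming a layout that mimics `/digitStruct/name`. The `refs/name0` path and the `mock_digitStruct.h5` file name are made up for illustration and are not part of the real dataset.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a tiny stand-in file; the real digitStruct.mat is much larger.
path = os.path.join(tempfile.mkdtemp(), 'mock_digitStruct.h5')
with h5py.File(path, 'w') as f:
    # Hypothetical target: a file name stored as a column of character codes.
    codes = np.array([[ord(c)] for c in '1.png'], dtype=np.uint16)
    target = f.create_dataset('refs/name0', data=codes)
    # The 'name' dataset holds object references, one per row.
    names = f.create_dataset('digitStruct/name', shape=(1, 1), dtype=h5py.ref_dtype)
    names[0, 0] = target.ref

with h5py.File(path, 'r') as f:
    ref = f['digitStruct/name'][0][0]   # an h5py.Reference
    chars = f[ref][()]                  # dereference, then read the whole array
    decoded = ''.join(chr(v[0]) for v in chars)

print(decoded)
```

Printing `ref` on its own would show `<HDF5 object reference>` again; the data only appears once the reference is used as a key on the file.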

Update

Here is some sample code to use the above two functions

import os

import h5py
import tqdm

mat_data = h5py.File(os.path.join(folder, 'digitStruct.mat'), 'r')
size = mat_data['/digitStruct/name'].size

for _i in tqdm.tqdm(range(size)):
    pic = get_name(_i, mat_data)
    box = get_box_data(_i, mat_data)

The loop above shows how to get the name and the bbox data for each entry in the dataset.

GoingMyWay
  • Should we pass 0 as the index when calling the function? Also, could you explain how this function works? – Dhruv Marwha Jul 30 '17 at 13:31
  • Thank you. After using your code, I figured this out myself. Could you shed some light on the print_attrs function? – Dhruv Marwha Jul 31 '17 at 04:04
  • @DhruvMarwha, `print_attrs` is the callback that collects the data in each bbox. You can read the docs for [visititems of h5py](http://docs.h5py.org/en/latest/quick.html?highlight=visititems), and inspect `box = hdf5_data['/digitStruct/bbox'][index]` and `hdf5_data[box[0]]` to see how the code works. Note that reading the data this way with `h5py` is not fast, so you can pickle the extracted data once and load the pickle afterwards. – GoingMyWay Aug 01 '17 at 01:31
  • I'm assuming you have used this dataset, @Alexander Yau. As you know, it gives the side lengths of the bounding boxes. I was thinking of drawing those boxes around the characters. I tried, but since I do not know the starting points, I cannot draw the bounding boxes around the digits. Any thoughts? – Dhruv Marwha Aug 01 '17 at 17:38
  • @DhruvMarwha, `SVHN` data contains the bbox data, so you can try the above function to get the box data. – GoingMyWay Aug 02 '17 at 02:21
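
The pickling suggestion in the comments above could look roughly like the sketch below, using only the standard library. `load_or_build`, `build_metadata` and the `svhn_meta.pkl` file name are hypothetical names chosen for illustration, and the metadata values are made up; in practice `build_metadata` would run the slow loop over `get_name()` and `get_box_data()`.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build_fn):
    """Return the pickled result if cache_path exists, else build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as fh:
            return pickle.load(fh)
    data = build_fn()
    with open(cache_path, 'wb') as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return data


def build_metadata():
    # Stand-in for the slow pass over the .mat file; values are made up.
    return [{'name': 'example.png', 'label': [1, 9], 'left': [10, 40],
             'top': [5, 6], 'width': [20, 22], 'height': [30, 31]}]


cache_path = os.path.join(tempfile.mkdtemp(), 'svhn_meta.pkl')
first = load_or_build(cache_path, build_metadata)   # builds and writes the cache
second = load_or_build(cache_path, build_metadata)  # served from the pickle
print(first == second)
```

Only load pickles you created yourself, since unpickling runs code embedded in the file.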

I wrote a similar function to get the metadata of the bounding boxes based on your answer, but without using visititems(), since I'm still not familiar with it. Instead, it uses the dictionary-like interface of h5py.File and its groups.

import h5py

f = h5py.File(digi_file, 'r')
bboxs = f['digitStruct/bbox']

def get_img_boxes(f, idx=0):
    """
    get the 'height', 'left', 'top', 'width', 'label' of bounding boxes of an image
    :param f: h5py.File
    :param idx: index of the image
    :return: dictionary
    """
    meta = { key : [] for key in ['height', 'left', 'top', 'width', 'label']}

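    # Look up the bbox dataset through f so the function does not depend on the global bboxs.
    box = f[f['digitStruct/bbox'][idx][0]]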
    for key in box.keys():
        if box[key].shape[0] == 1:
            meta[key].append(int(box[key][0][0]))
        else:
            for i in range(box[key].shape[0]):
                meta[key].append(int(f[box[key][i][0]][()].item()))

    return meta

To get the bounding-box information for a specific image, just call the function with an index:

boxes = get_img_boxes(f, 2)
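
For drawing, as asked in the comments on the other answer, `left` and `top` give the top-left corner of each digit's box and `width` and `height` give its size. A small sketch of turning the returned dictionary into corner coordinates follows; `to_corner_boxes` is a hypothetical helper and the numbers are made up, not taken from the dataset.

```python
def to_corner_boxes(meta):
    """Convert per-digit left/top/width/height lists into corner rectangles."""
    corner_boxes = []
    for left, top, width, height, label in zip(
            meta['left'], meta['top'], meta['width'], meta['height'], meta['label']):
        corner_boxes.append({
            'label': label,
            'x0': left, 'y0': top,                   # top-left corner
            'x1': left + width, 'y1': top + height,  # bottom-right corner
        })
    return corner_boxes


# Made-up values in the shape returned by get_img_boxes().
example_meta = {'left': [10, 40], 'top': [5, 6], 'width': [20, 22],
                'height': [30, 31], 'label': [1, 9]}
corners = to_corner_boxes(example_meta)
print(corners[0])
```

The resulting `(x0, y0, x1, y1)` tuples can be passed to whichever drawing library you use for rectangles.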
Leo