
I have a large hdf5 file that looks something like this:

A/B/dataset1, dataset2
A/C/dataset1, dataset2
A/D/dataset1, dataset2
A/E/dataset1, dataset2

...

I want to create a new file that contains only:

A/B/dataset1, dataset2
A/C/dataset1, dataset2

What is the easiest way to do this in Python?

I did:

import h5py

fs = h5py.File('source.h5', 'r')
fd = h5py.File('dest.h5', 'w')
fs.copy('A/B', fd)

The problem is that dest.h5 then contains only:

B/dataset1, dataset2

so the parent group A is missing from the hierarchy.

graham

1 Answer

fs.copy('A/B', fd) doesn't copy the path /A/B/ into fd; it only copies the group B (as you've found out!). So you first need to create the rest of the path in the destination file:

fd.create_group('A')
fs.copy('A/B', fd['/A'])

or, if you will be using the group a lot:

fd_A = fd.create_group('A')
fs.copy('A/B', fd_A)

This copies the group B from fs['/A/B'] into fd['/A']:

In [1]: fd['A/B'].keys()
Out[1]: [u'dataset1', u'dataset2']

Here's an automatic way of doing this:

# Get the name of the parent for the group we want to copy
group_path = fs['/A/B'].parent.name

# Check that this group exists in the destination file; if it doesn't, create it
# This will create the parents too, if they don't exist
group_id = fd.require_group(group_path)

# Copy fs:/A/B/ to fd:/A/G
fs.copy('/A/B', group_id, name="G")

print(list(fd['/A/G'].keys()))
# ['dataset1', 'dataset2']
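
Wrapped into a function, the steps above might look like the following sketch. The helper name copy_with_parents and the demo function are hypothetical (they are not part of h5py), and the demo builds its own small source file in a temporary directory so that it can run on its own:

```python
import os
import tempfile

import h5py


def copy_with_parents(src, dst, path, name=None):
    """Copy src[path] into dst, recreating its parent groups first.

    copy_with_parents is a hypothetical helper name, not part of h5py.
    """
    parent_path = src[path].parent.name       # e.g. '/A' for '/A/B'
    parent = dst.require_group(parent_path)   # creates missing ancestors too
    src.copy(path, parent, name=name)         # name=None keeps the original name
    return parent


def demo():
    """Build a small source file, copy only A/B and A/C, list the result."""
    tmpdir = tempfile.mkdtemp()
    src_path = os.path.join(tmpdir, 'source.h5')
    dst_path = os.path.join(tmpdir, 'dest.h5')

    # Recreate the layout from the question: A/{B,C,D,E}/{dataset1,dataset2}
    with h5py.File(src_path, 'w') as fs:
        for group in ('B', 'C', 'D', 'E'):
            for ds in ('dataset1', 'dataset2'):
                fs.create_dataset('A/{}/{}'.format(group, ds), data=[1, 2, 3])

    with h5py.File(src_path, 'r') as fs, h5py.File(dst_path, 'w') as fd:
        for path in ('/A/B', '/A/C'):
            copy_with_parents(fs, fd, path)
        copied = []
        fd.visit(copied.append)  # collects every group and dataset name
    return sorted(copied)
```

Calling demo() returns the names found in the destination file, which should be A, A/B, A/C and their two datasets each, with D and E left out. Passing name="G" to copy_with_parents copies the group under a different name, as in the snippet above.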
Yossarian
  • Thanks. I was hoping to do that without manually creating the groups closer to the root one by one (there are more of them in my file; this was just an illustration). – graham Jul 03 '14 at 02:16
  • Also, how can you copy B under a different name? And more generally, is it possible to rename individual groups or datasets? – graham Jul 03 '14 at 02:18
  • I've added a way to do this automatically. You could wrap it up into a function, passing it the group in fs you want to copy and the file handle of the destination. You can rename groups and datasets with [`move(source,dest)`](http://docs.h5py.org/en/latest/high/group.html?highlight=parent#Group.move) – Yossarian Jul 03 '14 at 07:10
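
As a rough illustration of the move(source, dest) call mentioned in the last comment, the sketch below renames a group and then a dataset inside a file opened for writing. The file name rename_demo.h5, the function name rename_demo and the new names G and renamed are made up for the example:

```python
import os
import tempfile

import h5py


def rename_demo():
    """Rename a group and a dataset in place with Group.move."""
    path = os.path.join(tempfile.mkdtemp(), 'rename_demo.h5')
    with h5py.File(path, 'w') as f:
        f.create_dataset('A/B/dataset1', data=[1, 2, 3])
        f.move('A/B', 'A/G')                    # rename group B to G
        f.move('A/G/dataset1', 'A/G/renamed')   # rename a dataset
        names = []
        f.visit(names.append)  # collects every group and dataset name
    return sorted(names)
```

The file has to be opened in a writable mode for this to work, so in the example from the question it would apply to fd (opened with 'w') and not to fs (opened with 'r').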