I want to store numpy datetime64 data in a PyTables Table. I want to do this without using Pandas.
What I've tried so far
Setup
In [1]: import tables as tb
In [2]: import numpy as np
In [3]: from datetime import datetime
Create data
In [4]: data = [(1, datetime(2000, 1, 1, 1, 1, 1)), (2, datetime(2001, 2, 2, 2, 2, 2))]
In [5]: rec = np.array(data, dtype=[('a', 'i4'), ('b', 'M8[us]')])
In [6]: rec # a numpy array with my data
Out[6]:
array([(1, datetime.datetime(2000, 1, 1, 1, 1, 1)),
(2, datetime.datetime(2001, 2, 2, 2, 2, 2))],
dtype=[('a', '<i4'), ('b', '<M8[us]')])
Open PyTables dataset with Time64Col descriptor
In [7]: f = tb.open_file('foo.h5', 'w') # New PyTables file
In [8]: d = f.create_table('/', 'bar', description={'a': tb.Int32Col(pos=0),
'b': tb.Time64Col(pos=1)})
In [9]: d
Out[9]:
/bar (Table(0,)) ''
description := {
"a": Int32Col(shape=(), dflt=0, pos=0),
"b": Time64Col(shape=(), dflt=0.0, pos=1)}
byteorder := 'little'
chunkshape := (5461,)
Append NumPy data to PyTables dataset
In [10]: d.append(rec)
In [11]: d
Out[11]:
/bar (Table(2,)) ''
description := {
"a": Int32Col(shape=(), dflt=0, pos=0),
"b": Time64Col(shape=(), dflt=0.0, pos=1)}
byteorder := 'little'
chunkshape := (5461,)
What happened to my datetimes?
In [12]: d[:]
Out[12]:
array([(1, 0.0), (2, 0.0)],
dtype=[('a', '<i4'), ('b', '<f8')])
I understand that HDF5 doesn't provide native support for datetimes. I would expect the extra metadata that PyTables overlays to handle this, though.
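For reference, the only workaround I can think of is to reinterpret the datetime64 field as plain 64-bit integers before writing and to view it back after reading. This is a minimal NumPy-only sketch of that round trip. I am assuming the integer field could then go into an ordinary tb.Int64Col column instead of Time64Col; the names int_dtype, stored and restored are just for illustration.

```python
import numpy as np
from datetime import datetime

data = [(1, datetime(2000, 1, 1, 1, 1, 1)), (2, datetime(2001, 2, 2, 2, 2, 2))]
rec = np.array(data, dtype=[('a', 'i4'), ('b', 'M8[us]')])

# Same memory layout, but the datetime field is exposed as int64
# (microseconds since the Unix epoch). No data is copied.
int_dtype = [('a', 'i4'), ('b', 'i8')]
stored = rec.view(int_dtype)

# Assumption: `stored` is what would be appended to a table whose 'b'
# column is declared as an Int64Col rather than a Time64Col.

# After reading the integers back, view them as datetime64 again.
restored = stored.view(rec.dtype)
```

This keeps the values intact, but the unit (`us`) is not recorded anywhere in the file, so I would have to track it myself, which is what I was hoping PyTables would do for me.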
My Question
How can I store a numpy record array that contains datetimes in PyTables? How can I efficiently extract that data from a PyTables table back to a NumPy array and retain my datetimes?
Common answer
I commonly get this answer:
Use Pandas
I don't want to use Pandas because I don't have an index, I don't want one stored in my dataset, and Pandas doesn't allow you to omit the index when storing (see this question).