I have the following numpy compound datatype:
mytype = numpy.dtype([('x', 'f8'),
                      ('y', 'f8'),
                      ('z', 'f8')])
However, when I try to fill a vector of this type element by element, it is 60x slower than filling three separate arrays:
#!/usr/bin/env python3
import time
import random
import numpy
mytype = numpy.dtype([('x', 'f8'),
                      ('y', 'f8'),
                      ('z', 'f8')])
size = 1000000
v = numpy.empty(shape=(size,), dtype=mytype)
print("Start inserting into compound type:")
start = time.time()
for i in range(size):
    v[i]['x'] = random.random()
    v[i]['y'] = random.random()
    v[i]['z'] = random.random()
end = time.time()
print("Done inserting into compound type: Time elapsed: {}.\n".format(end - start))
x = numpy.empty(shape=(size,), dtype='f8')
y = numpy.empty(shape=(size,), dtype='f8')
z = numpy.empty(shape=(size,), dtype='f8')
print("Inserting into three arrays:")
start = time.time()
for i in range(size):
    x[i] = random.random()
    y[i] = random.random()
    z[i] = random.random()
end = time.time()
print("Done inserting into three arrays. Time elapsed: {}".format(end - start))
print("Reading from compound type:")
start = time.time()
for i in range(size):
    x1 = v[i]['x']
    y1 = v[i]['y']
    z1 = v[i]['z']
end = time.time()
print("Done reading compound type: Time elapsed: {}.\n".format(end - start))
print("Reading from three arrays:")
start = time.time()
for i in range(size):
    x1 = x[i]
    y1 = y[i]
    z1 = z[i]
end = time.time()
print("Done reading three arrays. Time elapsed: {}.\n".format(end - start))
In addition, I find that reading from a numpy compound datatype element by element is 70x slower than reading from the corresponding separate arrays. How can I increase the performance of numpy compound datatypes?
Edit: After cloning numpy from master, this performance bug went away.
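For comparison, here is a minimal sketch of filling the same compound type one field at a time instead of one element at a time. It assumes the values can be generated in bulk, so it swaps `random.random()` for `numpy.random.random(size)`. I have not timed it against the loops above, so treat it as a possible workaround rather than a measured fix.

```python
import numpy

mytype = numpy.dtype([('x', 'f8'),
                      ('y', 'f8'),
                      ('z', 'f8')])
size = 1000000
v = numpy.empty(shape=(size,), dtype=mytype)

# Assign each field in a single vectorized call; this avoids creating
# a record scalar (v[i]) for every element inside a Python loop.
for name in mytype.names:
    v[name] = numpy.random.random(size)

# Indexing by field name returns a float64 view onto the structured
# array, so whole-field reads do not need a Python loop either.
xs = v['x']
ys = v['y']
zs = v['z']
```

Because `v['x']` is a view, writing to `xs` also changes `v`.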