Reposted from https://groups.google.com/forum/#!topic/pydata/5mhuatNAl5g
It seems that creating a DataFrame from a structured array copies the data: in the timings below, construction time grows roughly linearly with the number of rows once N exceeds about 10,000. I get similar results if the data is instead a dictionary of NumPy arrays.
Is there any way to create a DataFrame from a structured array (or similar) without any copying or checking?
# assumes: import pandas as pd; from numpy.random import randn
In [44]: sarray = randn(10**7, 10).view([(name, float) for name in 'abcdefghij']).squeeze()

In [45]: for N in [10, 100, 1000, 10000, 100000, 1000000, 10000000]:
    ...:     s = sarray[:N]
    ...:     %timeit z = pd.DataFrame(s)
    ...:
1000 loops, best of 3: 830 µs per loop
1000 loops, best of 3: 834 µs per loop
1000 loops, best of 3: 872 µs per loop
1000 loops, best of 3: 1.33 ms per loop
100 loops, best of 3: 15.4 ms per loop
10 loops, best of 3: 161 ms per loop
1 loops, best of 3: 1.45 s per loop
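The timings only suggest a copy indirectly. A more direct check is to ask NumPy whether the DataFrame's columns overlap the source buffers. This is a minimal sketch, not a statement of what pandas does: the helper names `make_structured` and `shares_buffers` are made up for illustration, the array is much smaller than in the session above, and the answer may differ between pandas versions, so no expected output is shown.

```python
import numpy as np
import pandas as pd

NAMES = "abcdefghij"

def make_structured(n_rows):
    """Build a 1-D structured array with ten float fields (hypothetical helper)."""
    dtype = [(name, float) for name in NAMES]
    data = np.random.default_rng(0).standard_normal((n_rows, len(NAMES)))
    # Viewing (n_rows, 10) floats as 10-field records gives shape (n_rows, 1);
    # squeezing axis 1 leaves a 1-D array of records.
    return data.view(dtype).squeeze(axis=1)

def shares_buffers(df, columns):
    """Map each column name to whether the DataFrame column overlaps its source array."""
    return {
        name: bool(np.shares_memory(df[name].to_numpy(), src))
        for name, src in columns.items()
    }

sarray = make_structured(1000)

# Case 1: DataFrame built directly from the structured array.
from_struct = shares_buffers(pd.DataFrame(sarray), {n: sarray[n] for n in NAMES})

# Case 2: DataFrame built from a dict of independent, contiguous 1-D arrays.
arrays = {n: sarray[n].copy() for n in NAMES}
from_dict = shares_buffers(pd.DataFrame(arrays), arrays)

print("structured array:", from_struct)
print("dict of arrays:  ", from_dict)
```

If every value comes back False, the constructor copied each column, which would match the linear scaling in the timings.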
Thanks, Dave