I just read about PyTables and watched a presentation video by its creator. It looks very promising, but also a bit complicated, and I would have to install several packages. So before doing that, I hope it is OK to ask a question here. What I need to do is:

import numpy

np_array = numpy.zeros((len(example1), len(example2)), dtype=int)
# ... fill np_array ...
np_array_sum = np_array.sum(0)  # sum of each column
np_array_sum_ordered = np_array_sum.argsort()[::-1]  # column indices, largest sum first
np_array = numpy.take(np_array, np_array_sum_ordered, axis=1)  # reorder the columns

Is it possible to give PyTables the np_array and do the NumPy operations on it, or do I need to use PyTables operations? If I need to use PyTables methods, how can I do the same thing?

  • This question is very similar to this: http://stackoverflow.com/questions/8642626/building-a-huge-numpy-array-using-pytables – ctrl-alt-delete Jan 09 '16 at 20:34
  • @toasteez thank you for the answer, but I know how to create an array; I saw that in the presentation. I want to know if I can use the numpy methods/functions to rearrange the array. – Oli Jan 09 '16 at 21:02
  • You need to figure out what your goal is. pytables is a storage format, either in memory or on disk. – ctrl-alt-delete Jan 09 '16 at 21:31
  • You need to figure out what your optimal use case is. pytables is a storage format, either in memory or on disk. You can read data out of it and write data to it like you can with an RDBMS. numpy has many built-in functions for manipulating arrays. – ctrl-alt-delete Jan 09 '16 at 21:38
  • For the above operation I don't have enough memory; it needs over 64 GB. And it is not the biggest data I need to run. But I need to order the array. I get the new order with 'np_array_sum.argsort()[::-1]' and then I simply do 'numpy.take(np_array, np_array_sum_ordered, axis=1)'. For small data it works well and fast. But with bigger data Python simply crashes without an error. The array is about 15 GB when it is filled with all the data. Later, with even bigger data, I think it will get up to 60 GB. If you have a great idea, I am all ears :) – Oli Jan 09 '16 at 22:28
  • How about this question? http://stackoverflow.com/questions/32312446/argsort-on-a-pytables-array : "`Since you are working with Pytables, I suggest you use the Table class which has sorting built in.`". Numpy and pytables do seem to be compatible in a sense (looking at [the "getting started"](http://www.pytables.org/usersguide/tutorials.html)), but I would guess that you want to work with either a smaller data subset (-> use numpy arrays, numpy operations) *or* the full 60 GB thing (-> keep the table and its native methods). I could be totally wrong, of course. – Andras Deak -- Слава Україні Jan 10 '16 at 00:16
  • Thank you Andras Deak for the link. I will look into it. Looks very interesting. – Oli Jan 10 '16 at 16:35
  • Also take a look at this link: http://www.pytables.org/usersguide/optimization.html. It's a little advanced, but from what you describe you need to sort the array and then lazy-load it in chunks to perform your operations, so that you do not run out of memory. Good luck – ctrl-alt-delete Jan 11 '16 at 12:28
  • Thank you toasteez. Pytables is a very interesting tool. – Oli Jan 11 '16 at 23:28
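
The chunked approach suggested in the comments can be sketched roughly as follows. This is only a minimal illustration, not a tested PyTables recipe: it uses `numpy.memmap` as a stand-in for a disk-backed array, and the helper names `column_order_by_sum` and `reorder_columns`, the file names, and the chunk size are all made up for the example. The helpers only rely on `.shape` and on slice-based reads and writes, so they should in principle also work with a PyTables array node such as a `CArray`, but that has not been verified here.

```python
import os
import tempfile

import numpy as np


def column_order_by_sum(src, chunk_rows):
    """Return column indices ordered by descending column sum.

    Only `chunk_rows` rows of `src` are held in memory at a time.
    Assumes integer data, as in the question.
    """
    n_rows, n_cols = src.shape
    totals = np.zeros(n_cols, dtype=np.int64)
    for start in range(0, n_rows, chunk_rows):
        block = np.asarray(src[start:start + chunk_rows])
        totals += block.sum(axis=0, dtype=np.int64)
    return totals.argsort()[::-1]


def reorder_columns(src, dst, order, chunk_rows):
    """Write `src` with its columns permuted by `order` into `dst`, chunk by chunk."""
    n_rows = src.shape[0]
    for start in range(0, n_rows, chunk_rows):
        block = np.asarray(src[start:start + chunk_rows])
        dst[start:start + chunk_rows] = np.take(block, order, axis=1)


# Tiny demo with disk-backed arrays (hypothetical file names and sizes).
tmp_dir = tempfile.mkdtemp()
shape = (6, 4)
src = np.memmap(os.path.join(tmp_dir, "src.dat"), dtype=np.int64,
                mode="w+", shape=shape)
dst = np.memmap(os.path.join(tmp_dir, "dst.dat"), dtype=np.int64,
                mode="w+", shape=shape)
src[:] = np.arange(24).reshape(shape) % 5  # stand-in for the real fill step

order = column_order_by_sum(src, chunk_rows=2)
reorder_columns(src, dst, order, chunk_rows=2)
```

The column sums need one pass over the data and the reordering a second pass, so peak memory is governed by `chunk_rows` rather than by the full array size. Writing into a separate `dst` array avoids overwriting rows that have not been read yet, at the cost of needing the disk space twice.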
