-1

I am trying to create a variable number of variables (arrays) in python.

I have a database from experiments and I am extracting data from it. I do not have control over the database or how data is written. I am extracting data in the form of a table: the first column (the zeroth from Python's perspective) has location ids, and subsequent columns have readings over several iterations. The location ids (in the 0th column) span millions of rows, and the readings of each iteration are captured in the subsequent columns. So I read the database and create this giant table.

In the next step, I loop over column indices 1 to n (the 0th column has locations), and this is what I am trying to get: if the difference between 2 consecutive readings is more than 0.001, write the location id to an array.

 if (A[i][j+1] - A[i][j]) > 0.001:  # 1 <= j < n, 0 <= i < number of rows in the table
     # then write A[i][0], i.e. the location id, to an array
     arr1[m][n] = A[i][0]

Problem: this requires a dynamic number of variables like arr1. I am storing the result of each loop iteration in an array, and the number of columns j is known only at runtime. So how can I create a variable number of variables like arr1? Secondly, each of these variables like arr1 can have a different size.

I took a look at similar questions, but multi-dimensional arrays won't work as each arr1 can have a different size. Also, performance is important, so I am guessing numpy arrays would be better. I am guessing that a dictionary would be slow for such a large amount of data.

ewr34

2 Answers

1

I didn't understand much from your explanation of the problem, but from what you wrote it sounds like a normal list would do the job:

arr1 = []

if (your condition here):
    arr1.append(A[i][0])

Memory management is dynamic, i.e. the list allocates new memory as needed. Afterwards, if you need a numpy array, just do numpy_array = np.asarray(arr1).
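To make that concrete, here is a minimal sketch for a single pair of columns. The toy table `A` and the choice `j = 1` are made up for illustration; they are not the asker's real data, and only the 0.001 threshold comes from the question.

```python
import numpy as np

# Hypothetical toy table: column 0 holds location ids,
# columns 1..3 hold readings from successive iterations.
A = [
    [101, 0.500, 0.502, 0.5025],
    [102, 0.300, 0.3005, 0.310],
    [103, 0.700, 0.700, 0.700],
]

j = 1       # compare column j with column j + 1
arr1 = []   # starts empty and grows as matches are found

for row in A:
    if (row[j + 1] - row[j]) > 0.001:
        arr1.append(row[0])  # store the location id

# Convert once at the end, if a numpy array is needed afterwards.
numpy_array = np.asarray(arr1)
```

Appending to the list and converting once at the end avoids resizing a numpy array inside the loop.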

A (very) small primer on lists in python:

A list in python is a mutable container that stores references to objects of any kind. Unlike C++, in a python list your items can be anything and you don't have to specify the list size when you define it. In the example above, arr1 is initially defined as empty and every time you call arr1.append() a new reference to A[i][0] is pushed at the end of the list.

For example:

a = []
a.append(1)
a.append('a string')
b = {'dict_key':'my value'}
a.append(b)
print(a)

displays:

[1, 'a string', {'dict_key': 'my value'}]

As you can see, the list doesn't really care what you append; it will store a reference to the item and increase its size by 1.

I strongly suggest you take a look at the data structures documentation for further insight into how lists work and some of their caveats.

GPhilo
  • I want to create multiple `arr1`, that's the problem. How many are not known until program run. And each `arr1` can have different size. – ewr34 Jul 10 '17 at 15:02
  • You don't need to "make" any variable. You're just storing it in a list (which can have anything at its elements, unlike for example in C++) – GPhilo Jul 10 '17 at 15:04
  • I would prefer to create variables `arr1, arr2...` beforehand for performance benefit. And assign rows dynamically. Is there any way to do that? – ewr34 Jul 10 '17 at 15:09
  • No, unless you know the number beforehand (which you said is not the case). Under the hood allocation *may* be slightly optimized but you don't have control on this, so all you're left with is appending to a slowly growing list – GPhilo Jul 10 '17 at 15:11
  • I mean runtime. I can find during runtime that number of `arr1` variables to create are say 10. Then can I create `arr1, arr2,..arr10`? – ewr34 Jul 10 '17 at 15:17
  • Since each single `arrX` variable is (from what I understood of your explanation) itself a list of variable size, I can't think of an efficient way to preallocate memory even knowing the total number of `arrX` variables you need to have at runtime – GPhilo Jul 10 '17 at 15:20
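As a follow-up to the comments above: if the number of result lists only becomes known at runtime, one option is to keep them in a list of lists instead of separate names like `arr1, arr2, ...`. The sketch below is only an illustration. The function name `collect_ids` and the sample table are hypothetical, and it assumes that column k + 1 is compared with column k + 2 for every adjacent pair of reading columns.

```python
def collect_ids(table, threshold=0.001):
    """Return one list of location ids per adjacent pair of reading columns."""
    n_cols = len(table[0])
    n_pairs = n_cols - 2  # column 0 is the id; pairs are (1,2), (2,3), ...
    # The number of result lists is decided here, at runtime.
    results = [[] for _ in range(n_pairs)]
    for row in table:
        for k in range(n_pairs):
            j = k + 1
            if row[j + 1] - row[j] > threshold:
                results[k].append(row[0])
    return results


# Hypothetical sample data, not taken from the question.
table = [
    [101, 0.500, 0.502, 0.5025],
    [102, 0.300, 0.3005, 0.310],
    [103, 0.700, 0.700, 0.700],
    [104, 0.100, 0.200, 0.2001],
]

results = collect_ids(table)
# results[0] plays the role of arr1, results[1] of arr2, and so on.
```

Each inner list grows independently, so the results can have different lengths without any preallocation.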
0

I took a look at similar questions, but multi dimension arrays won't work as each arr1 can have different size.

-- but a list of arrays will work, because items in a list can be anything, including arrays of different sizes.
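A minimal sketch of that idea with numpy, assuming the whole table fits in memory as one 2-D float array. The sample values are invented for illustration, and the names `ids`, `diffs`, `mask` and `arrays` are not from the question.

```python
import numpy as np

# Hypothetical table: column 0 = location id, columns 1.. = readings.
A = np.array([
    [101, 0.500, 0.502, 0.5025],
    [102, 0.300, 0.3005, 0.310],
    [103, 0.700, 0.700, 0.700],
    [104, 0.100, 0.200, 0.2001],
])

ids = A[:, 0].astype(int)          # location ids
diffs = np.diff(A[:, 1:], axis=1)  # reading[j+1] - reading[j] for each row
mask = diffs > 0.001               # True where the jump exceeds the threshold

# One array per column pair; the arrays may have different lengths.
arrays = [ids[mask[:, k]] for k in range(mask.shape[1])]
```

Here `np.diff` and the boolean mask replace the explicit Python loops, which may matter with millions of rows, and the outer list holds one array per column pair regardless of size.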

Błotosmętek