How to create indeterminate number of arrays in Python

Question

I'm a student who has to use Python for a project, but I am not very good at it. In my project I need to create a number of arrays, and how many is not known until a condition in the problem is met; at that point the program stops and the output must be the arrays created. I coded the project as follows:

def a1(i, j, wi, a):
    sum0 = 0
    for z in range(0, i, 1):
        sum0 = a[j][z] * wi[z] + sum0
    return sum0
#__________________

rb = [125, 120, 81, 70, 60, 52, 48, 30, 28,22,18]
Ru = 645
n = len(rb)
wi = rb
import numpy as np
a = np.zeros((1000, len(wi)))
import math
a[1][0] = math.floor(a[1][0])
j = 1

while j < 2:
    a[j][0] = (Ru / wi[0])
    a[j][0] = math.floor(a[j][0])
    for i in range(1, n, 1):
        a[j][i] = ((Ru - a1(i, j, wi, a)) / wi[i])
        
        a[j][i] = math.floor(a[j][i])
    j = j + 1
j = j - 1
k = n - 2

while k >= 0:
    if k >= 0:
        while a[j][k] > 0:

            j = j + 1
            for i in range(0, n, 1):
                if i < k:
                    a[j][i] = a[j - 1][i]
                if i == k:
                    a[j][i] = a[j - 1][i] - 1
                if i > k:
                    a[j][i] = ((Ru - a1(i, j, wi, a)) / wi[i])           

                    a[j][i] = math.floor(a[j][i])
            k = n - 2

     k = k - 1
print(a)

To solve it, I defined a matrix of 1000 * n. When n < 7 the program runs, but for larger values the following error is raised:

a[j][i]=a[j-1][i]

IndexError: index 1000 is out of bounds for axis 0 with size 1000

I tried to fix this error by changing the matrix size from 1000 to 1000000000, which gives this error:

a = np.zeros((1000000000, len(wi)))

ValueError: array is too big; arr.size * arr.dtype.itemsize is larger than the maximum possible size.

Please help me if possible to solve this problem.

Thanks

score 2 · Answer 1 · edited May 20 '21 at 08:51


I noticed that your indentation is not correct (in the second while loop) and that you import the same package multiple times (you only need to do it once). Regarding this error:

a[j][i]=a[j-1][i]

IndexError: index 1000 is out of bounds for axis 0 with size 1000

It means that you're trying to access a row outside the valid index range, which for an axis of size 1000 is 0 to 999. Instead of resizing the array, you should check what values j and i take during the iterations.

edited May 20 '21 at 08:51 by CoolCoder · answered May 19 '21 at 14:27 by Namantyagi


score 0 · Answer 2 · answered May 19 '21 at 14:22


I noticed that your indentation is not correct (in the second while loop) and that you import the same package multiple times (you only need to do it once). Regarding this error:

a[j][i]=a[j-1][i]

IndexError: index 1000 is out of bounds for axis 0 with size 1000

It means that you're trying to access a row outside the valid index range, which for an axis of size 1000 is 0 to 999. Instead of resizing the array, you should check what values j and i take during the iterations.

answered May 19 '21 at 14:22 by drauedo

Thanks for the corrections you made. But errors still remain. Is there a way to get to an indeterminate number (I mean, given that we do not know before the final number of arrays and stop building arrays only if a condition is met) ?? – fereshteh ghafari May 19 '21 at 14:48
You could build an empty list and with append() you could add as many arrays as you want and control this with and if condition. – drauedo May 19 '21 at 15:02
Is it possible for you to share a small example? I could not do this with lists, so I used the Numpy library but because this library also did not have the append feature, I thought the definition of a large matrix would be effective, which also did not have an acceptable result. If you have a solution, thank you for sharing, please. – fereshteh ghafari May 19 '21 at 15:24
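
A minimal sketch of the list-and-append idea from the comment above. The starting row, the update step, and the stopping rule are all hypothetical placeholders, not the asker's actual algorithm; the point is only that the list grows as needed and is converted to a NumPy array once, after the loop.

```python
import numpy as np

rows = []                # grows as needed; no size is fixed up front
current = [3, 2, 0]      # hypothetical starting row

# Hypothetical stopping rule: stop once the first entry reaches 0.
while current[0] > 0:
    rows.append(list(current))   # store a copy of the finished row
    current[0] -= 1              # hypothetical update step

result = np.array(rows)  # convert to a 2-D array once, after the loop
print(result)
```

Appending to a Python list is cheap, so the number of rows does not have to be known before the loop starts.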

score 0 · Answer 3 · answered May 19 '21 at 14:41


NumPy raises a ValueError like yours when the requested allocation is too large. In this case, you are trying to declare a matrix, a, of dimensions 10^9 by 11. The total number of cells in that matrix is about 11*(10^9).

When NumPy tries to allocate that many cells in your computer's memory, each cell takes a fixed number of bytes determined by the array's dtype. np.zeros defaults to float64, which is 8 bytes (64 bits) per cell, so your matrix would need about 88 GB, which is obviously far too large.

Even with a 4-byte dtype, you would be allocating 44 GB of memory for your matrix.
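
The arithmetic above can be checked in a few lines. This is only a sketch: the helper name array_gigabytes is made up for illustration, and GB here is taken to mean 10^9 bytes.

```python
def array_gigabytes(rows, cols, itemsize):
    """Approximate size in GB (10**9 bytes) of a dense rows x cols array."""
    return rows * cols * itemsize / 10**9

# 8 bytes per cell (e.g. float64) and 4 bytes per cell (e.g. int32)
print(array_gigabytes(10**9, 11, 8))
print(array_gigabytes(10**9, 11, 4))
```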

Try to reduce the size of your matrix, or work out an upper bound on the matrix size you might need.

For more, visit this.

answered May 19 '21 at 14:41 by Shubham Jain

Thank you for your explanation. Excuse me, you know a way that does not need to define such a large hypothetical matrix (I mean there was an add feature in the lists, for example if the list was acceptable it can be added(.append()) to a predefined empty list but in the array There is no such property, which is why such a large matrix is defined) ?? If you have an example or way to solve this problem, thank you for sharing. – fereshteh ghafari May 19 '21 at 15:11
So rather than predefining 10^9 rows for your matrix, I think you can start with a 1 by n matrix, and then use the vstack functionality offered by numpy to append rows to your matrix as and where required. [This](https://numpy.org/doc/stable/reference/generated/numpy.vstack.html) might be helpful to understand about vstack. – Shubham Jain May 19 '21 at 20:15
Thank you very much for what you said. I will try to read more of what you said so that I can use it properly. – fereshteh ghafari May 20 '21 at 06:11
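
A rough sketch of the vstack suggestion from the comments above. The row values, the update step, and the stopping rule are hypothetical; only np.zeros and np.vstack are real NumPy calls.

```python
import numpy as np

n = 3
a = np.zeros((1, n), dtype=int)       # start with a single row of zeros

row = np.array([0, 2, 0])             # hypothetical first row
while row[1] >= 0:                    # hypothetical stopping rule
    a = np.vstack([a, row])           # returns a NEW array with the row added
    row = row - np.array([0, 1, 0])   # hypothetical update step

print(a)
```

Each call to np.vstack copies the whole array, so for many rows it is usually cheaper to collect the rows in a list and convert once at the end.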

ELinda · Accepted Answer · 2021-05-20T14:15:18.937

It seems like you are dealing with sparse data. In this case, it is more efficient to avoid allocating a fixed-size data structure. Instead, use a sparse representation such as a defaultdict with a default value of 0.

Main approach:

Initialize the defaultdict with a default of 0 (by designating it to store integer values, i.e. defaultdict(int)).
Use the csr_matrix function from scipy.sparse to build a sparse matrix from it, then call toarray() to get a NumPy array (ndarray).
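
To make the first step concrete, here is a small standard-library-only sketch of how a defaultdict keyed by (row, column) behaves. The keys and values are made up, and the conversion uses a plain nested list instead of csr_matrix purely for illustration.

```python
from collections import defaultdict

data = defaultdict(int)   # missing keys read as 0
data[(1, 1)] = 2
data[(3, 2)] = 1

print(data[(2, 0)])       # missing key reads as 0 (and the read stores the key)

# Size the dense result from the largest row index seen.
row_count = max(r for r, _ in data) + 1
col_count = 3             # assumed to be known, like len(wi)

# data.get() reads without inserting new keys.
dense = [[data.get((r, c), 0) for c in range(col_count)]
         for r in range(row_count)]
print(dense)
```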

There are some other points:

There is no need to use a while loop when you are just going through a range (whether ascending or descending)
Use the // (floor division operator) instead of taking the floor after a division. This will be more direct.
Instead of using if statements to compare i and k, just loop i through ranges [0,k), [k+1,n), and have one line for the i==k assignment.
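
A short sketch of the last two points, with illustrative values only (645 and 125 are taken from the question's Ru and rb[0]; n and k are arbitrary):

```python
import math

Ru, w = 645, 125
# For integers, // gives the same result as flooring a true division,
# without going through a float.
assert Ru // w == math.floor(Ru / w)

n, k = 5, 2
before = list(range(0, k))       # indices with i < k
after = list(range(k + 1, n))    # indices with i > k
print(before, k, after)
```

Looping over the two ranges separately and handling i == k on its own line covers every index exactly once, which is what the three if statements were doing.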

Here's an example with smaller numbers (but it should also work with the bigger ones)

def a1(i, j, wi, data):
    sum0 = 0
    for z in range(0, i, 1):
        sum0 = data[(j, z)] * wi[z] + sum0
    return sum0

#_________________
rb = [13, 3, 5]
Ru = 6
n = len(rb)
wi = rb

import numpy as np
from collections import defaultdict
from scipy.sparse import csr_matrix


data = defaultdict(int)
data[(1, 0)] = 0
j_max = 2
for j in range(1, j_max):
    data[(j, 0)] = Ru // wi[0]
    for i in range(1, n, 1):
        data[(j, i)] = (Ru - a1(i, j, wi, data)) // wi[i]
j = j_max - 1
k = n - 2

for k in range(n-2, 0, -1):
    while data[(j, k)] > 0:
        j = j + 1
        for i in range(0, k):
            data[(j, i)] = data[(j - 1, i)]
        
        data[(j, k)] = data[(j - 1, k)] - 1
        
        for i in range(k+1, n):
            data[(j, i)] = (Ru - a1(i, j, wi, data)) // wi[i]
        k = n - 2

# Split the (row, col) keys into parallel index lists for csr_matrix
row_indices = [r for r, c in data.keys()]
col_indices = [c for r, c in data.keys()]

# Determine rows to be allocated, based on dict keys
row_count = max(row_indices) + 1
data_ndarr = csr_matrix((list(data.values()), (row_indices, col_indices)), shape=(row_count, len(wi))).toarray()

print(data_ndarr)

Result:

[[0 0 0]
 [0 2 0]
 [0 1 0]
 [0 0 1]]

Thank you very much for your time and the example you provided. But I can not run this example because it gives this error: >>> data_ndarr = csr_matrix((list(data.values()), (row_indices, col_indices)), shape=(row_count, len(wi))).toarray() NameError: name 'row_count' is not defined. Please guide the reason for this error. — fereshteh ghafari, May 20 '21 at 06:21
And excuse me, I want the result to be as follows (given this small example): [[0 0 0] [0 2 0] [0 0 1]] – fereshteh ghafari, May 20 '21 at 06:43
See above. Was missing the line `row_count = max(row_indices) + 1` and also found a better way to replace the three consecutive conditionals. — ELinda, May 20 '21 at 14:16
