
I have two matrices, A and B, both of shape 150000 x 150000.

I want to divide each element of A by the corresponding element of B, element-wise. The way I currently do it is:

res=A/B

I get the correct output for small matrices, but for matrices as large as mine the process gets killed. Any suggestions on how to do this efficiently?

Sample data:

A =
[
[2.2,3.3,4.4]
[2.2,3.3,4.4]
[2.2,3.3,4.4]]

B=
[
[1,3.3,4.4]
[2.2,1,4.4]
[2.2,3.3,1]]

res = 
[
[2.2,1,1]
[1,3.3,1]
[1,1,4.4]]



This is a 3 x 3 example; I'm actually working with a 150000 x 150000 matrix.

Neo
  • You would need at least 167 GB of RAM (float64) just for storing the result. Double that for the second division argument. If you don't have that, you can't do it without out-of-memory computing [wiki](https://en.wikipedia.org/wiki/External_memory_algorithm), which is not in numpy's world anymore. – sascha Jan 06 '21 at 18:27
  • I do have an additional ram of upto 200GB. Can you let me know how exactly you came up with the mem requirements? – Neo Jan 06 '21 at 18:32
  • 150000^2 * 8 byte (float64) for a single matrix. If you do in-place division (if allowed) you would need two of those (A and B) to compute in-place storing in res: `res <-> A /= B`. (Meaning: 200GB is not enough) – sascha Jan 06 '21 at 18:33
  • 1
    If `A` is sparse, you can use `scipy.sparse` https://docs.scipy.org/doc/scipy/reference/sparse.html package. But I suppose `B` can't contain any zeros. So you will still need at least the 167GB to store the dense `B` as @Neo mentioned. – mandulaj Jan 06 '21 at 18:47
  • 3
    Well you can just read it line by line and write the results line by line. Since you are talking about element wise computations. Use two chunk readers to get A and B and write the result using a third one. Examples [here](https://stackoverflow.com/questions/42727412/efficient-way-to-partially-read-large-numpy-file/42727761) and [here](https://stackoverflow.com/questions/29067406/how-to-read-a-super-huge-file-into-numpy-array-n-lines-at-a-time). – Thymen Jan 06 '21 at 19:01
  • @sascha What did you mean by "not in numpy's world anymore"? AFAICS `numpy.memmap` is alive and well (I see no deprecation notices) so this question may well be a dupe of https://stackoverflow.com/questions/16149803 – jez Jan 06 '21 at 20:38
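For reference, a minimal sketch of the memory arithmetic from the comments above (float64 and the 150000 x 150000 shape are taken from the question):

import numpy as np

n = 150_000
bytes_per_element = np.dtype(np.float64).itemsize  # 8 bytes

one_matrix_gb = n * n * bytes_per_element / 1e9
print(f"one matrix: {one_matrix_gb:.0f} GB")                 # ~180 GB (~168 GiB)
print(f"A, B and res together: {3 * one_matrix_gb:.0f} GB")  # ~540 GB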

2 Answers


You could try to use pandas and set the type of the values to something that needs less memory, or at least check the memory allocated per value; usually the default is float64, which is in some cases way too much.

Use

pd.to_numeric(s, errors='coerce') 

or

pd.to_numeric(column, downcast='integer')
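
The same downcasting idea works directly in numpy; a minimal sketch (float32 here is an assumption, it halves memory at the cost of precision):

import numpy as np

# Cast to float32: half the footprint of the default float64.
A = np.array([[2.2, 3.3, 4.4]] * 3, dtype=np.float32)
B = np.array([[1, 3.3, 4.4], [2.2, 1, 4.4], [2.2, 3.3, 1]], dtype=np.float32)

res = A / B       # result is float32 as well: 4 bytes per element instead of 8
print(res.dtype)  # float32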
Petronella
  • I'm pretty sure he is using `numpy`, not python lists. `pandas`'s dtypes are based on `numpy`. But changing the dtype, say from a default `float64` to 32 only cuts memory usage in half. `pandas` has a lot of overhead compared to `numpy` (e.g. row and column indices). – hpaulj Jan 06 '21 at 23:12

If the problem is a limitation on memory allocation (i.e. you have enough RAM for A and B, but not enough for A, B and res all together) then in-place division /= will do the job in the memory space already allocated for A without having to allocate memory for a new array res. Of course, you'll overwrite the original content of A in the process:

A /= B

But if you end up needing arrays that are too large to fit in RAM, then you should explore numpy.memmap, which is designed for this purpose. See for example Working with big data in python and numpy, not enough ram, how to save partial results on disc?
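
A minimal sketch of that chunked memmap approach, assuming A and B are already stored on disk as raw float64 arrays (the file names, shape and chunk size below are illustrative, not from the question):

import numpy as np

n = 150_000
chunk = 1_000  # rows processed per step; tune to the RAM you have available

# Assumed on-disk layout: raw float64 arrays of shape (n, n).
A = np.memmap('A.dat', dtype=np.float64, mode='r', shape=(n, n))
B = np.memmap('B.dat', dtype=np.float64, mode='r', shape=(n, n))
res = np.memmap('res.dat', dtype=np.float64, mode='w+', shape=(n, n))

for start in range(0, n, chunk):
    stop = min(start + chunk, n)
    # Only `chunk` rows of each array are pulled into RAM at a time.
    res[start:stop] = A[start:stop] / B[start:stop]

res.flush()

This keeps peak RAM usage at roughly 3 * chunk * n * 8 bytes (about 3.6 GB per step with these numbers) instead of holding all three full matrices at once.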

jez
  • But `/=` still has to create a temporary buffer. There is a `np.divide.at` that does unbuffered division, but these `at` methods are more useful for dealing with repeated indices than reducing memory use. – hpaulj Jan 06 '21 at 23:14
  • @hpaulj I’m surprised to learn about the buffering. Is it not still the case that `A/=B` requires (at least) one fewer large allocations than `res=A/B`? – jez Jan 07 '21 at 03:01