How to create big vector with random numbers using numpy and MPI

Question

I want to create a vector of size 10^15 with numpy and MPI.

I have no clue how to start. I tried to use

v = np.random.rand(10**15,1)

but it exceeds my memory.

Indeed you can't. However you can handle relatively large (10s of GB) arrays with Dask https://examples.dask.org/array.html#Create-Random-array — Ardweaden, May 05 '20 at 12:50

score 3 · Accepted Answer · answered May 05 '20 at 12:21

You can't. At 8 bytes per float64 value, 10^15 elements is about 8 petabytes (thousands of terabytes) of data, far more than fits in memory and too much even to store on disk. Consider redesigning your program to reduce the amount of data, or process the data iteratively.
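To make the numbers concrete, here is a minimal sketch of the size arithmetic and of what "processing iteratively" could look like. The function name `chunked_mean`, the chunk size, and the small demo size are illustrative assumptions, not something from the question; the point is only that one chunk is held in memory at a time.

```python
import numpy as np

N_TARGET = 10**15                                   # size requested in the question
BYTES_PER_FLOAT64 = np.dtype(np.float64).itemsize   # 8 bytes per value

total_bytes = N_TARGET * BYTES_PER_FLOAT64
total_terabytes = total_bytes / 10**12
print(f"{total_terabytes:,.0f} TB needed for the full vector")


def chunked_mean(n_total, chunk_size, seed=0):
    """Mean of n_total uniform samples, holding only one chunk in memory.

    Hypothetical helper for illustration: it generates the data chunk by
    chunk and keeps only a running sum, never the whole vector.
    """
    rng = np.random.default_rng(seed)
    running_sum = 0.0
    remaining = n_total
    while remaining > 0:
        n = min(chunk_size, remaining)   # last chunk may be shorter
        running_sum += rng.random(n).sum()
        remaining -= n
    return running_sum / n_total


# Demo on a size that finishes quickly; 10**15 would use the same loop,
# just for a very long time.
demo_mean = chunked_mean(n_total=10**7, chunk_size=10**6)
print(demo_mean)
```

The same pattern works for any reduction that can be updated chunk by chunk (sum, min, max, histogram counts); it does not help if the algorithm really needs random access to all elements at once.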

AlexM4

Christian Eslabon · Answer 2 · 2020-06-20T06:25:36.497

0

Dask arrays coordinate many NumPy arrays, arranged into chunks within a grid. They support a large subset of the NumPy API.

Call .compute() when you want your result as a NumPy array.

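%%time
import dask.array as da

# da.random.random is lazy: no random numbers are generated until .compute() runs.
# Each call below builds a separate random array, so the sum and the mean
# are computed from different samples.
answer_sum = da.random.random(10**15).sum().compute()
answer_mean = da.random.random(10**15).mean().compute()
print(answer_sum)
print(answer_mean)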

Source: https://examples.dask.org/array.html#Create-Random-array
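For anyone trying this out, a smaller variant with an explicit chunk size may be easier to experiment with. This is only a sketch: the size `10**7` and `chunks=10**6` are arbitrary demo values, not recommendations, and a single array is reused so that the sum and the mean describe the same lazy array definition.

```python
import dask.array as da

# Demo values (assumptions for illustration): 10**7 elements in 10 chunks.
n = 10**7
x = da.random.random(n, chunks=10**6)

# Nothing has been generated yet; x only describes the computation.
print(x.numblocks)   # number of chunks along each axis
print(x.nbytes)      # size the full array would have if materialised

# Reductions are evaluated chunk by chunk, so the whole array never has
# to sit in memory at once.
demo_mean = x.mean().compute()
print(demo_mean)
```

Scaling `n` up increases both the run time and the number of chunks Dask has to track, so it is worth growing the size step by step rather than jumping straight to 10**15.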

edited Jun 20 '20 at 06:25

answered Jun 20 '20 at 02:52

1

While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – rizerphe Jun 20 '20 at 04:25
