Unless there's something wrong with your NumPy build or your OS (both of which are unlikely), this is almost certainly a memory error.
For example, let's say all these values are `float64`. You've already allocated at least 18GB and 20GB for those two arrays, and now you're trying to allocate another 38GB for the concatenated array, but you only have, say, 64GB of RAM plus 2GB of swap, so there's not enough room for another 38GB. On some platforms the allocation will simply fail, which NumPy should catch and raise as a `MemoryError`. On other platforms the allocation may succeed, but as soon as you actually touch all of that memory you'll segfault (see overcommit handling in Linux for an example). On still other platforms the system will try to auto-expand swap, and if you then run out of disk space it'll segfault.
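As a back-of-the-envelope check, you can work out the footprint before attempting the concatenation. A minimal sketch, with made-up shapes chosen to land near those numbers (substitute your real shapes and dtype):

```python
import numpy as np

# Made-up shapes that come out to roughly 18GB and 20GB at float64.
shape1, shape2 = (281_250, 8_000), (312_500, 8_000)
itemsize = np.dtype(np.float64).itemsize   # 8 bytes per value

gb1 = np.prod(shape1) * itemsize / 1e9
gb2 = np.prod(shape2) * itemsize / 1e9

# np.concatenate must allocate the whole output before copying, so the
# peak requirement is X1 + X2 + X, not just the final X.
print(f"X1 ~ {gb1:.0f} GB, X2 ~ {gb2:.0f} GB, X ~ {gb1 + gb2:.0f} GB, "
      f"peak ~ {2 * (gb1 + gb2):.0f} GB")
```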
Whatever the reason, if you can't fit `X1`, `X2`, and `X` into memory at the same time, what can you do instead?
- Just build `X` in the first place, and fill `X1` and `X2` by filling sliced views of `X` (see the first sketch below).
- Write `X1` and `X2` out to disk, concatenate on disk, and read the result back in (see the `memmap` sketch below).
- Send `X1` and `X2` to a subprocess that reads them iteratively and builds `X`, then continue the rest of the work in that process (see the worker sketch below).
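For the first option, a minimal sketch with placeholder dimensions (`n1`, `n2`, and `n_features` stand in for your real sizes):

```python
import numpy as np

# Placeholder dimensions; substitute the real ones.
n1, n2, n_features = 4, 6, 3

# Allocate the final array once, up front...
X = np.empty((n1 + n2, n_features), dtype=np.float64)

# ...and treat slices of it as X1 and X2. These are views, so they cost no
# extra memory; writing into them writes straight into X.
X1 = X[:n1]
X2 = X[n1:]

# Fill X1 and X2 however you do now, e.g.:
X1[:] = np.random.rand(n1, n_features)
X2[:] = np.random.rand(n2, n_features)

# X is already the "concatenated" result; no copy is ever made.
```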
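For the on-disk option, one way (not the only one) is `numpy.memmap`. This sketch assumes `X1` and `X2` are 2-D, share a dtype and column count, and that `X_concat.dat` is a scratch file you can write to:

```python
import numpy as np

def concat_on_disk(x1, x2, path="X_concat.dat"):
    """Concatenate two arrays along axis 0 into a disk-backed memmap."""
    rows = x1.shape[0] + x2.shape[0]
    out = np.memmap(path, dtype=x1.dtype, mode="w+",
                    shape=(rows,) + x1.shape[1:])
    out[: x1.shape[0]] = x1   # dirty pages can be written out and evicted
    out[x1.shape[0]:] = x2    # under memory pressure, so RAM isn't pinned
    out.flush()
    return out                # indexes like a normal ndarray

# Usage: X = concat_on_disk(X1, X2); del X1, X2   # then free the originals
```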
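The subprocess option can be read a few ways; one hedged sketch is to `np.save` the two arrays, `del` them in the parent, and run a separate script (here a hypothetical `worker.py`) that memory-maps the saved files and builds `X` without the parent's copies ever competing for RAM:

```python
# worker.py (hypothetical) -- run as:
#   python worker.py x1.npy x2.npy
import sys
import numpy as np

def main(path1, path2):
    # mmap_mode="r" reads the inputs lazily instead of loading them whole.
    x1 = np.load(path1, mmap_mode="r")
    x2 = np.load(path2, mmap_mode="r")
    X = np.concatenate([x1, x2])   # only X itself has to fit in RAM at once
    # ... continue the rest of the work with X here ...
    print(X.shape)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```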