Consider

import numpy as np
a = np.random.randn(3)
b = np.random.randn(3)
c = np.vstack([a, b])  # I think this always creates a copy of a and b

Is there a fast way of assembling some c that just points to the memory already used by a and b?

Basically, I can either do a fairly cheap first pass to determine the total number of vectors I will have, or just store a list/dict of vectors and assemble them on the fly when needed. If there were a cheap way of doing the latter, I could skip the first pass.

The answer is no, it is not possible. There is a similar (duplicate) question here: [Concatenate Numpy arrays without copying](https://stackoverflow.com/questions/7869095/concatenate-numpy-arrays-without-copying).
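One way to confirm that `np.vstack` copies is to ask NumPy whether the result overlaps its inputs in memory. A minimal sketch using `np.shares_memory` (the variable `original` is introduced here only for illustration):

```python
import numpy as np

a = np.random.randn(3)
b = np.random.randn(3)
c = np.vstack([a, b])

# vstack allocates a new buffer, so c does not overlap a or b
print(np.shares_memory(c, a))  # False
print(np.shares_memory(c, b))  # False

# Writing to c therefore leaves a unchanged
original = a[0]
c[0, 0] = original + 1.0
print(a[0] == original)  # True
```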

mathtick
  • `a` and `b` have their own data buffers. You cannot make a new array `c` that uses them both. It has to have its own buffer, hence a copy. You can make a list `[a,b]` that contains references to the 2 arrays, and thus no copy. Usually we recommend collecting all arrays in a list, and doing one `vstack` at the end. – hpaulj Mar 03 '22 at 19:46
  • Does this help? [Concatenate Numpy arrays without copying](https://stackoverflow.com/questions/7869095/concatenate-numpy-arrays-without-copying). Please decide for yourself whether this counts as a duplicate. Making a copy seems to be the sensible solution. – Michael Szczesny Mar 03 '22 at 20:17
  • One trick is to preallocate a big buffer and reserve some space in it for several sub-views. This requires the total space to be bounded. Note that such a micro-optimization is often not a good idea unless you use Numba, as the overhead of NumPy functions is likely much bigger than that of memory allocation for small arrays. As for big arrays, NumPy tends to be inefficient due to implicit temporary copies and type conversions. – Jérôme Richard Mar 03 '22 at 20:36
  • Yes, seems like a) the answer is no it's not possible and b) it's duplicate with the other question. – mathtick Mar 04 '22 at 08:28
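The two workarounds suggested in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: `max_vectors`, `dim`, `buf`, `vectors`, and `stacked` are hypothetical names, and the preallocation variant assumes an upper bound on the number of vectors is known in advance.

```python
import numpy as np

dim = 3

# Option 1: collect references in a list, then copy once at the end.
vectors = []
for _ in range(5):
    vectors.append(np.random.randn(dim))  # appending stores a reference, no copy
stacked = np.vstack(vectors)              # one copy, made only when needed

# Option 2: preallocate one buffer and write each vector into a row view.
max_vectors = 4                           # assumed upper bound (hypothetical)
buf = np.empty((max_vectors, dim))

a = buf[0]                                # views into buf, not new arrays
b = buf[1]
a[:] = np.random.randn(dim)               # fill in place
b[:] = np.random.randn(dim)

c = buf[:2]                               # the "stacked" result is itself a view
print(np.shares_memory(c, a))             # True
print(np.shares_memory(c, b))             # True
```

With the second option, `a` and `b` must be created as views of the buffer from the start; arrays that already own separate buffers still cannot be joined without a copy.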
