I wrote a small script that adds values to a NumPy array given their row and column coordinates:

import numpy as np

gridarray = np.zeros([3,3])
gridarray_counts = np.zeros([3,3])

# randint's upper bound is exclusive (random_integers is deprecated)
cols = np.random.randint(0, 3, 15)
rows = np.random.randint(0, 3, 15)
data = np.random.randint(0, 10, 15)

# accumulate the sum and the number of values per grid cell
for nn in range(len(data)):
    gridarray[rows[nn],cols[nn]] += data[nn]
    gridarray_counts[rows[nn],cols[nn]] += 1

This way I know how many values fall into each grid cell and what their sum is. However, on arrays of length 100,000+ this gets quite slow. Is there a way to do it without a for-loop?

Is something like the following possible? I know it does not work as written.

gridarray[rows,cols] += data
gridarray_counts[rows,cols] += 1
HyperCube
  • Just to clarify to future readers, the seemingly simple solution which is stated not to work, indeed does not work, because `rows,cols` contains duplicate indices. See [this question](http://stackoverflow.com/questions/16034672/how-do-numpys-in-place-operations-e-g-work) for more details. – shx2 Apr 18 '13 at 19:23

2 Answers

I would use bincount for this, but for now bincount only takes 1-D arrays, so you'll need to write your own ndbincount, something like:

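import numpy as np

def ndbincount(x, weights=None, shape=None):
    # x has shape (ndim, n): one row of indices per dimension
    if shape is None:
        shape = tuple(x.max(axis=1) + 1)

    # collapse each N-d index to a flat index so bincount can handle it
    flat = np.ravel_multi_index(x, shape)
    out = np.bincount(flat, weights, minlength=int(np.prod(shape)))
    return out.reshape(shape)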

Then you can do:

gridarray = np.zeros([3,3])

# randint's upper bound is exclusive (random_integers is deprecated)
cols = np.random.randint(0, 3, 15)
rows = np.random.randint(0, 3, 15)
data = np.random.randint(0, 10, 15)

x = np.vstack([rows, cols])
temp = ndbincount(x, data, gridarray.shape)   # per-cell sums
gridarray = gridarray + temp
gridarray_counts = ndbincount(x, shape=gridarray.shape)   # per-cell counts
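
To check that the flat-index bincount idea gives the same result as the original loop, here is a minimal self-contained sketch. It inlines the ravel_multi_index/bincount steps rather than calling ndbincount, and the seed and variable names (rng, sums_loop, counts_loop) are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
shape = (3, 3)
rows = rng.integers(0, 3, 15)
cols = rng.integers(0, 3, 15)
data = rng.integers(0, 10, 15)

# Reference result: the explicit loop from the question
sums_loop = np.zeros(shape)
counts_loop = np.zeros(shape)
for r, c, d in zip(rows, cols, data):
    sums_loop[r, c] += d
    counts_loop[r, c] += 1

# Vectorized: map each (row, col) pair to a flat index, then bincount
flat = np.ravel_multi_index((rows, cols), shape)
n_cells = int(np.prod(shape))
sums = np.bincount(flat, weights=data, minlength=n_cells).reshape(shape)
counts = np.bincount(flat, minlength=n_cells).reshape(shape)

assert np.array_equal(sums, sums_loop)
assert np.array_equal(counts, counts_loop)
```

Passing minlength matters here: without it, bincount stops at the largest index it sees, and the reshape would fail whenever the last cells are empty.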
Bi Rico

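You can do this directly with np.add.at, which applies every update even when an index repeats (plain `gridarray[(rows,cols)] += data` keeps only one update per duplicated cell):

np.add.at(gridarray, (rows, cols), data)
np.add.at(gridarray_counts, (rows, cols), 1)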
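
As a quick check of how repeated indices behave, the sketch below compares a plain fancy-indexed `+=` with `np.add.at` on a tiny input where cell (0, 1) appears three times. The sample arrays and the names buffered and unbuffered are made up for this illustration.

```python
import numpy as np

# Cell (0, 1) is hit three times, cell (1, 2) once
rows = np.array([0, 0, 1, 0])
cols = np.array([1, 1, 2, 1])
data = np.array([5.0, 7.0, 2.0, 1.0])

# Fancy-indexed +=: reads, adds, then writes back, so repeated
# indices overwrite each other instead of accumulating
buffered = np.zeros((3, 3))
buffered[rows, cols] += data

# np.add.at performs the addition unbuffered, once per index occurrence
unbuffered = np.zeros((3, 3))
np.add.at(unbuffered, (rows, cols), data)

# Counting works the same way with a scalar increment
counts = np.zeros((3, 3))
np.add.at(counts, (rows, cols), 1)

print(buffered[0, 1], unbuffered[0, 1], counts[0, 1])
```

Only the np.add.at result holds the full sum 5 + 7 + 1 for the repeated cell. On large inputs np.add.at can be noticeably slower than bincount, so it is worth timing both on realistic data.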
Bitwise