
I have some code that I took from the question "Numpy repeat for 2d array".

The version below works fine with NumPy arrays, but with CuPy arrays it throws `ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()` on the line `ret_val[mask] = cp.repeat(arr.ravel(), rep.ravel())`.

I have tried the logical operations that CuPy already provides, but they still raise errors.

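import cupy as cp

def repeat2dvect(arr, rep):
    # Number of output elements in each row
    lens = cp.array(rep.sum(axis=-1))
    maxlen = lens.max()
    ret_val = cp.zeros((arr.shape[0], int(maxlen)))
    # True where a row still has repeated values to place
    mask = (lens[:,None]>cp.arange(maxlen))
    # This is the line that raises the ValueError when rep is a CuPy array
    ret_val[mask] = cp.repeat(arr.ravel(), rep.ravel())
    return ret_val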
  • Can you provide actual sample inputs? In my trial the code failed at a different place: `lens[:,None]` (because `lens.shape` is always `()`), both in NumPy and CuPy. – niboshi May 28 '19 at 14:58
  • For arr: ```[[1 2 3 4] [1 2 3 4] [1 2 3 4] [1 2 3 4] [1 2 3 4]]```, and rep has a similar shape: ```[[3 0 0 0] [1 1 0 0] [1 0 0 0] [2 2 0 0] [1 1 1 0]]``` – Harshal Chaudhari May 29 '19 at 08:32
  • related?: https://stackoverflow.com/questions/10062954/valueerror-the-truth-value-of-an-array-with-more-than-one-element-is-ambiguous – David Cary Sep 15 '20 at 20:42
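
For reference, here is a minimal sketch of what the function is meant to compute, run with plain NumPy on the sample inputs given in the comment above. The name `repeat2dvect_np` is hypothetical, and this only illustrates the intended behaviour on the CPU, not a CuPy fix.

```python
import numpy as np


def repeat2dvect_np(arr, rep):
    # NumPy port of the question's function (hypothetical name).
    lens = rep.sum(axis=-1)                   # output length of each row
    maxlen = int(lens.max())                  # width of the padded result
    ret_val = np.zeros((arr.shape[0], maxlen))
    mask = lens[:, None] > np.arange(maxlen)  # True where a value belongs
    ret_val[mask] = np.repeat(arr.ravel(), rep.ravel())
    return ret_val


arr = np.tile(np.array([1, 2, 3, 4]), (5, 1))
rep = np.array([[3, 0, 0, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0],
                [2, 2, 0, 0],
                [1, 1, 1, 0]])

out = repeat2dvect_np(arr, rep)
# Rows of out: [1,1,1,0], [1,2,0,0], [1,0,0,0], [1,1,2,2], [1,2,3,0]
print(out)
```

Each row holds its repeated values left-aligned and is zero-padded up to the longest row.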

1 Answer


Congratulations on your first contribution to Stack Overflow :)

I replicated the error with the following code:

import cupy as cp

arr = cp.array([5, 1, 4], 'float32')
rep = cp.array([3, 2], 'int32')
result = cp.repeat(arr, rep)
# ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

This error message is misleading: the actual reason the code fails is quite different.

In short, you can't pass a cp.ndarray as the second argument of cp.repeat().

Why? Because the shape of the result is determined by the values in that argument. If it were an ndarray, that would be a problem in CuPy (but not in NumPy), because the array's values live on the GPU. To determine the output shape, CuPy would have to wait for the GPU to finish all queued computations and then transfer the values to the CPU. That would spoil the benefit of asynchronous computation, so CuPy intentionally prohibits this operation.
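
To make the shape dependence concrete, here is a small illustration using NumPy on the CPU (the values are made up for the example). The same input array gives outputs of different lengths depending only on the contents of the repeat counts:

```python
import numpy as np

arr = np.array([5, 1, 4], dtype="float32")

# Same input array, two different sets of repeat counts.
out_a = np.repeat(arr, [3, 2, 1])
out_b = np.repeat(arr, [1, 0, 1])

# The output length equals the sum of the counts, so it cannot be
# derived from the shapes of the inputs alone.
print(out_a.shape)  # (6,)
print(out_b.shape)  # (2,)
```

On the GPU, those counts would have to be read back to the CPU before the output array could even be allocated.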

In your particular case, you could, for example, manually convert rep to an np.ndarray (with rep.get()) or compute rep as an np.ndarray from the beginning.
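
A minimal sketch of that workaround might look like the following. The helper name `repeat_with_host_counts` is hypothetical, and the NumPy fallback is only an assumption added so the snippet also runs on a machine without a usable GPU. Passing the counts as a plain Python list is a conservative choice here, not something the CuPy documentation is being quoted on.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # raises if CuPy is installed but no GPU is usable
    xp = cp
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch runs anywhere


def repeat_with_host_counts(arr, rep):
    """Repeat elements of arr; rep may live on the GPU (hypothetical helper)."""
    # .get() copies a CuPy array to host memory, which synchronizes the GPU.
    rep_host = rep.get() if hasattr(rep, "get") else np.asarray(rep)
    # Plain Python ints let the output shape be computed on the CPU.
    return xp.repeat(arr, rep_host.ravel().tolist())


arr = xp.asarray([5, 1, 4], dtype="float32")
rep = xp.asarray([3, 2, 1], dtype="int32")
result = repeat_with_host_counts(arr, rep)
print(result)
```

Only the small `rep` array is transferred to the host; `arr` and the result stay wherever `xp` puts them.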

niboshi
  • converting the ```rep``` to ```np.ndarray``` does not help in my case, as the rep is coming from another function which uses cupy operations for obtaining speedup. Therefore, using numpy for ```rep``` would defeat the purpose of gaining GPU speedup. – Harshal Chaudhari May 29 '19 at 08:36
  • I also tried to cast the cupy array to a numpy array using ```cp.asnumpy```. However, that results in another error: ```ValueError: object __array__ method not producing an array``` – Harshal Chaudhari May 29 '19 at 08:39
  • It's in principle impossible to do that without synchronization. Shape of an array is stored as Python object (which means on CPU memory). If the information (=`rep`) that determines the shape is on GPU, it's inevitable to synchronize the GPU first and transfer the data to CPU. – niboshi May 29 '19 at 14:45
  • Yes, I figured. I got it working by using ```rep.get()``` and ```arr.get()``` and using numpy.repeat for that little piece of code. Could you explain what ```.get()``` does exactly? I couldn't find any documentation on it in the cupy docs. As far as I can see it returns an np.ndarray, but how does it differ from ```cp.asnumpy()```? – Harshal Chaudhari May 29 '19 at 16:01
  • Documentation says `cp.asnumpy()` can convert any object to `np.ndarray`. For example, `cp.asnumpy(numpy_array)` is equivalent to `numpy_array` if it's already `np.ndarray`. On the other hand, `cp.ndarray.get()` always requires `cp.ndarray` to call it. – niboshi May 30 '19 at 17:45
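
The difference described in the last comment can be sketched as follows. The helper `to_host` is hypothetical, and the NumPy-only branch is an assumption added so the snippet still runs where CuPy or a GPU is not available.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # raises if CuPy is installed but no GPU is usable
except Exception:
    cp = None  # assumption: continue with NumPy only


def to_host(x):
    """Return x as a numpy.ndarray (hypothetical helper)."""
    if cp is not None:
        # cp.asnumpy accepts CuPy arrays as well as NumPy arrays.
        return cp.asnumpy(x)
    return np.asarray(x)


host_in = np.array([3, 2, 1], dtype="int32")
print(type(to_host(host_in)))

if cp is not None:
    dev_in = cp.asarray(host_in)
    # .get() is a method of cp.ndarray, so it needs a CuPy array to call it on.
    assert isinstance(dev_in.get(), np.ndarray)
    assert isinstance(cp.asnumpy(dev_in), np.ndarray)
```

In code that may receive either kind of array, `cp.asnumpy()` is the more forgiving choice; `.get()` is fine when the input is known to be a `cp.ndarray`.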