
I know I can vectorize np.argmax by passing a 2D array and specifying an axis, e.g. np.argmax(arr2d, axis=1) returns the index of the maximum in each row.

I also know that when several entries of a single 1D vector are tied for the maximum, I can break the tie randomly with np.random.choice(np.flatnonzero(vec == vec.max())).
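For concreteness, here is a minimal sketch of that 1D tiebreak next to plain np.argmax. The vector `v` and the variable names are made up purely for illustration.

```python
import numpy as np

# Hypothetical example vector with a three-way tie for the maximum.
v = np.array([1, 7, 3, 7, 7])

# Plain argmax always reports the first position of the maximum.
first = np.argmax(v)                  # 1

# Indices of every entry equal to the maximum.
tied = np.flatnonzero(v == v.max())   # array([1, 3, 4])

# Pick one of the tied indices uniformly at random.
pick = np.random.choice(tied)
```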

The question is: how can I do both together? I.e., how do I vectorize np.argmax so that ties within each row are broken randomly?

user4779
  • Did you check this: https://stackoverflow.com/questions/17568612/how-to-make-numpy-argmax-return-all-occurrences-of-the-maximum – marco romelli May 30 '19 at 07:13
  • Looks like that is only for a 1D vector. I can't find an axis argument for np.argwhere. Also I'd like to return the maximum as opposed to acquiring a list of max indexes, although I'm sure that part would be trivial if np.argwhere could be vectorized. – user4779 May 30 '19 at 07:31

1 Answer


Here is one way. For large data one may consider replacing the permutation with something cheaper. I've hardcoded axis=1 but that shouldn't obscure the principle.

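import numpy as np

def fair_argmax_2D(a):
    # (row, column) of every entry equal to its row's maximum, sorted by row
    y, x = np.where((a.T==a.max(1)).T)
    # give each such entry a distinct random rank
    aux = np.random.permutation(len(y))
    xa = np.empty_like(x)
    xa[aux] = x  # xa[rank] = column of the entry holding that rank
    # per row, return the column of the entry that drew the highest rank
    return xa[np.maximum.reduceat(aux, np.where(np.diff(y, prepend=-1))[0])]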

a = np.random.randint(0,5,(4,5))
a
# array([[2, 2, 2, 2, 1],
#        [3, 3, 3, 3, 2],
#        [3, 4, 2, 1, 4],
#        [3, 2, 4, 2, 1]])

# draw 10000 times
res = np.array([fair_argmax_2D(a) for _ in range(10000)])

# check: for each row of a, count how often each column was picked
np.array([np.bincount(r, None, 5) for r in res.T])
# array([[ 2447,  2567,  2449,  2537,     0],
#        [ 2511,  2465,  2536,  2488,     0],
#        [    0,  5048,     0,     0,  4952],
#        [    0,     0, 10000,     0,     0]])
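As for "something cheaper" than the permutation: one possible alternative, sketched here under my own assumptions rather than benchmarked, is to draw an independent uniform random key for every tied entry and take an ordinary argmax over those keys. The function name `random_tiebreak_argmax` and the optional `rng` parameter are hypothetical, and the sketch assumes NumPy 1.17+ for `np.random.default_rng`.

```python
import numpy as np

def random_tiebreak_argmax(a, rng=None):
    """Row-wise argmax of a 2D array, ties broken uniformly at random.

    Illustrative sketch only; `rng` is an optional numpy Generator.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # Mark every position that attains its row's maximum.
    is_max = a == a.max(axis=1, keepdims=True)
    # Tied entries get an independent uniform key in [0, 1);
    # all other entries get -1, so they can never win.
    keys = np.where(is_max, rng.random(a.shape), -1.0)
    # The largest key in a row is equally likely to sit on any tied entry.
    return keys.argmax(axis=1)
```

This trades the permutation and `reduceat` bookkeeping for one random draw per array element, so whether it is actually faster will depend on the array shape and on how many ties there are.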
Paul Panzer