Suppose there is an array with outcomes and an array with probabilities. It can be the case that some outcomes are listed multiple times. For example:
import numpy as np
x = np.array(([0,0],[1,1],[2,1],[1,1],[2,2]),dtype=int)
p = np.array([0.1,0.2,0.3,0.1,0.2],dtype=float)
Now I would like to list the unique outcomes in x and add up the corresponding probabilities in p of the duplicate outcomes. So the result should be arrays xnew and pnew defined as:
xnew = np.array(([0,0],[1,1],[2,1],[2,2]),dtype=int)
pnew = np.array([0.1,0.3,0.3,0.2],dtype=float)
While there are some examples of how to obtain unique rows (see, e.g., "Removing duplicate columns and rows from a NumPy 2D array"), it is unclear to me how to use them to add up the values in the other array.
Does anyone have a suggestion? Solutions using numpy are preferred.
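For reference, here is a minimal sketch of one possible numpy-only approach, not necessarily the best one. It assumes a NumPy version in which np.unique accepts the axis argument (1.13 or later). The idea is to ask np.unique for the inverse mapping, which gives for each row of x the index of its unique row, and then to sum p per index with np.bincount. The name inverse is just an illustrative variable.

```python
import numpy as np

x = np.array(([0, 0], [1, 1], [2, 1], [1, 1], [2, 2]), dtype=int)
p = np.array([0.1, 0.2, 0.3, 0.1, 0.2], dtype=float)

# Unique rows of x, plus for every original row the index of its unique row.
xnew, inverse = np.unique(x, axis=0, return_inverse=True)

# Sum the probabilities that share the same unique-row index.
# ravel() guards against NumPy versions that return a non-1-D inverse
# when axis is given.
pnew = np.bincount(inverse.ravel(), weights=p)

print(xnew)
print(pnew)
```

Note that np.unique returns the rows in lexicographically sorted order rather than in order of first appearance. For the example above the two orders happen to coincide.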