I'm trying to build a function that returns the products of subsets of array elements. Basically, I want to build a prod_by_group function that does this:
values = np.array([1, 2, 3, 4, 5, 6])
groups = np.array([1, 1, 1, 2, 3, 3])
Vprods = prod_by_group(values, groups)
And the resulting Vprods should be:
Vprods
array([6, 4, 30])
There's a great answer here for sums of elements, and I think the solution should be similar: https://stackoverflow.com/a/4387453/1085691
I tried taking the log first, then sum_by_group, then exp, but ran into numerical issues.
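To make the log/sum/exp attempt concrete, here is a minimal sketch of that idea. It assumes np.unique plus np.bincount as a stand-in for sum_by_group (the linked answer's implementation is not reproduced here), and the names inverse, log_sums and approx are only illustrative.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6])
groups = np.array([1, 1, 1, 2, 3, 3])

# Map arbitrary group labels to 0..k-1 so np.bincount can index by group.
# (Assumption: this stands in for the sum_by_group step.)
_, inverse = np.unique(groups, return_inverse=True)

# Sum the logs within each group, then exponentiate to get the products back.
log_sums = np.bincount(inverse, weights=np.log(values))
approx = np.exp(log_sums)

# The result is floating point, so compare with a tolerance, not with ==.
print(approx.dtype)
print(np.allclose(approx, [6, 4, 30]))
```

The result comes back as float64 and is only approximately equal to the exact integer products. np.log also yields -inf for zeros and nan for negative values, so inputs like those would need separate handling; this is probably the kind of numerical issue involved.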
There are some other similar answers here for min and max of elements by group: https://stackoverflow.com/a/8623168/1085691
Edit: Thanks for the quick answers! I'm trying them out. I should add that I want it to be as fast as possible (that's the reason I'm trying to get it in numpy in some vectorized way, like the examples I gave).
Edit: I evaluated all the answers given so far, and the best one is given by @seberg below. Here's the full function that I ended up using:
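def prod_by_group(values, groups):
    # Sort both arrays by group label so each group forms a contiguous run.
    order = np.argsort(groups)
    groups = groups[order]
    values = values[order]
    # Start index of each run: 0, plus every position where the label changes.
    group_changes = np.concatenate(([0], np.where(groups[:-1] != groups[1:])[0] + 1))
    # Multiply the elements of each run; one product per group, in sorted label order.
    return np.multiply.reduceat(values, group_changes)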
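As a quick check of the approach on unsorted input, here is a hedged sketch of a variant that also returns the group label for each product. The name prod_by_group_with_labels and the variable starts are hypothetical, as is the choice of kind="stable" for the sort (it is not needed for correctness, since multiplication does not depend on order within a group).

```python
import numpy as np

def prod_by_group_with_labels(values, groups):
    # Hypothetical variant: same technique, but also reports the labels.
    order = np.argsort(groups, kind="stable")
    groups = groups[order]
    values = values[order]
    # Start index of each contiguous run of equal labels.
    starts = np.concatenate(([0], np.where(groups[:-1] != groups[1:])[0] + 1))
    # One label and one product per run, in sorted label order.
    return groups[starts], np.multiply.reduceat(values, starts)

# Same data as in the question, but shuffled.
values = np.array([5, 1, 4, 2, 6, 3])
groups = np.array([3, 1, 2, 1, 3, 1])

labels, prods = prod_by_group_with_labels(values, groups)
print(labels)  # [1 2 3]
print(prods)   # [ 6  4 30]
```

Note that the products stay in the input's integer dtype, so very large groups can overflow silently. Empty input is not handled in this sketch.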