Efficient way to perform a triangular sum with .sum over a matrix F(a[i], a[j]), given the vector a

Question

I have a vector a and need to perform a summation over two indices, something as

for i in (range, n): 
    for j in (i+1, n):
        F(a[i] - a[j])

where F is a function: the sum is reminding of summing over the upper triangle of an array.

I read the interesting thread on Fastest way in numpy to sum over upper triangular elements with the least memory and did my trials: indeed ARRAY.sum is a very fast way to sum over the elements of an upper triangular matrix.

To apply the method to my case, I first need to define an array such as

A[i,j] = F(a[i],a[j])

and then compute

(A.sum() - np.diag(A).sum())/2

I could define the array A via two for loops of course, but I wondering if there is a faster, numpy way.

In another case, the function F was simply equal to

F = a[i]*a[j]

and I could write

def sum_upper_triangular(vector):
    A = np.tensordot(vector,vector,0)
    return (A.sum() - np.diag(A).sum())/2

which is incredibly faster than summing directly with sum(), or nested for loops.

If F is more articulated, for example

np.exp(a[i] - a[j])

I would like to know what the most efficient way.

Thanks a lot

@Divakar, F = a[i]*a[j] * G( a[i]-a[j]), where G is the Mittag Leffler function taking as argument a[i] - a[j]. I guess already knowing how to deal with the example function I described in the post would be useful to me. As a matter of fact, I would already not be sure how to best deal with F = a[i] - a[j]. Thanks a lot — user37292, Dec 18 '19 at 09:01
This problem depends on the function G. It is very easy to implement a Cython or Numba solution for the summation problem. But to geed a good speedup you have to implement the function G also as a Cython cdef or with Numba or wrap a C implementation — max9111, Dec 19 '19 at 09:01
I appreciate the specific function G will affect the choice of the optimal approach. As i mentioned, I would already be content to learn how to handle the example i mention, np.exp(a[i] - a[j]), without using sum() — user37292, Dec 19 '19 at 21:35
Not certain about this, but it may help to store a[i] so you don't have to re-retrieve that value — Nathan, Jan 10 '20 at 23:07
Also, in your original ecample, it seems there's only 4 combinations you're checking (range, range+1), (n, n+1), (range, n), (n,n), is that correct? — Nathan, Jan 10 '20 at 23:10

Andreas K. · Answer 1 · 2020-01-12T16:29:06.210

If I understand it correctly you want to do the following:

result = []
n = len(a)
for i in range(n-1): 
    for j in range(i+1, n):
        result.append(F(a[i] - a[j]))

for some function F. Also, the operation between matrix elements can be any other (e.g. multiplication *). Below is one way to do it without the for-loops:

iu = np.triu_indices(n, k=1)
A_j, A_i = np.meshgrid(a, a)
res = F(A_i[iu] - A_j[iu])  # e.g. F = np.exp

Explanation (for n=5):

A_i = [[a[0], a[0], a[0], a[0], a[0],
       [a[1], a[1], a[1], a[1], a[1],
       [a[2], a[2], a[2], a[2], a[2],
       [a[3], a[3], a[3], a[3], a[3],
       [a[4], a[4], a[4], a[4], a[4]]

A_j = [[a[0], a[1], a[2], a[3], a[4],
       [a[0], a[1], a[2], a[3], a[4],
       [a[0], a[1], a[2], a[3], a[4],
       [a[0], a[1], a[2], a[3], a[4],
       [a[0], a[1], a[2], a[3], a[4]]

A_i[iu] = [a[0], a[0], a[0], a[0], a[1], a[1], a[1], a[2], a[2], a[3]]
A_j[iu] = [a[1], a[2], a[3], a[4], a[2], a[3], a[4], a[3], a[4], a[4]]

Then perform element-wise computations and apply element-wise F:

F(A_i[iu] - A_j[iu]) = [ 
    F(a[0] - a[1]), F(a[0] - a[2]), F(a[0] - a[3]), F(a[0] - a[4]),
    F(a[1] - a[2]), F(a[1] - a[3]), F(a[1] - a[4]),
    F(a[2] - a[3]), F(a[2] - a[4]),
    F(a[3] - a[4])]

thanks for your input. I did accept the other question simply because the method therein suggested turned out to be faster. Thanks again — user37292, Jan 19 '20 at 12:12

Daniel F · Accepted Answer · 2020-01-13T09:05:01.237

You could just use scipy.spatial.pdist and set whatever function you wanted to the metric. As a bonus, pdist only computes the off-diagonal triangle, so you don't need to remove it from your sum

from scipy.spatial.distance import pdist

def sum_upper_tri(arr, F = lambda x, y: x*y):
    return pdist(arr.reshape(arr.shape[0], -1), metric = F).sum()/2

If you want something super fast though, you're going to need numba:

from numba import jit

@jit
def sum_upper_tri_jit(arr, F = lambda x, y: x * y):
    out = 0
    for i in range(1, len(arr)):
        for j in range(i + 1, len(arr)):
            out += F(arr[i], arr[j])
    return out / 2

Haven't quite figured a way to @njit that yet, but if i can it'll be much faster.

In any case, a specially-built function each expected F will be much faster. For example, the exp(|x-y|) case (reminder that exp(x-y) is not symmetric: x-y != y-x)

from numba import njit

@njit
def sum_upper_tri_exp(arr):
    out = 0
    for i in range(1, len(arr)):
        for j in range(i + 1, len(arr)):
            out += np.exp(np.abs(arr[i] - arr[j]))
    return out / 2

This is about 100x faster than the above

if you don't want the summation, you can use:

from numba import njit

@njit
def sum_upper_tri_exp(arr):
    out = []
    for i in range(1, len(arr)):
        for j in range(i + 1, len(arr)):
            out += [np.exp(arr[i] - arr[j])]
    return out

You can jit the F function before calling it in the sum_upper_tri_exp function, or use a closure eg. https://stackoverflow.com/a/59256512/4045774 or https://stackoverflow.com/a/58752553/4045774 — max9111, Jan 13 '20 at 10:26
You're welcome to make an answer for the bounty @max9111, if I tried I'd just be cribbing your answers there. — Daniel F, Jan 13 '20 at 10:48
@-Daniel, thanks for your input. I have regretfully been unable to log over the past 3 days, I hope the bounty points have been awarded. That not being the case, please suggest if I have a way to accept your reply. Thanks again — user37292, Jan 19 '20 at 12:11

Efficient way to perform a triangular sum with .sum over a matrix F(a[i], a[j]), given the vector a

2 Answers2