I have a NumPy array that contains vectorised data. I need to compute the Euclidean distance from each vector (a row in the array) to itself and to every other row.

The vectors are of the form

[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]

I know I need two loops. Here is what I have so far:

def euclidean_distance_loop(termdoc):
    i = 0
    j = 0
    matrix = np.array([])
    while( j < (len(termdoc-1))):
        matrix = np.append(matrix,[euclidean_distance(termdoc[i],termdoc[j])])
        j = j + 1
        
    return np.array([matrix])

euclidean_distance_loop(termdoc)

I know this is an index problem and that I need another index, or an index incremented in another loop, but I'm not sure how to construct it.

S.Gou
  • Can you provide a sample of your data and desired outcome? It'd be easier to understand the question – theProcrastinator May 21 '21 at 06:08
  • The data is a term-by-document matrix where each row is a vector. The function should take row[0] and compare it to itself and all other rows, then do the same for row[1], row[2], and so on. I think my problem is that I don't know how to update the indices for the outer and inner loops – S.Gou May 21 '21 at 06:18
  • Take a look at this wiki by Divakar: [https://github.com/droyed/eucl_dist/wiki/Main-Article#prospective-method](https://github.com/droyed/eucl_dist/wiki/Main-Article#prospective-method) – Lith May 21 '21 at 06:18

1 Answer

You don’t need loops.

import numpy as np

def self_distance(x):
    # x[:, np.newaxis] has shape (n, 1, d); subtracting x broadcasts to (n, n, d)
    return np.linalg.norm(x[:,np.newaxis] - x, axis=-1)
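
To see how this lines up with the two-loop idea from the question, here is a minimal sketch that repeats the function above (so it runs on its own) next to an explicit nested-loop version. The name `self_distance_loops` and the 3-row `sample` array are made up for illustration; they stand in for the real `termdoc` matrix.

```python
import numpy as np

def self_distance(x):
    # Broadcasting: (n, 1, d) - (n, d) -> (n, n, d), then the norm over the last axis
    return np.linalg.norm(x[:, np.newaxis] - x, axis=-1)

def self_distance_loops(x):
    # Hypothetical two-loop equivalent: i selects the outer row, j the inner row
    n = len(x)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sqrt(np.sum((x[i] - x[j]) ** 2))
    return out

# Made-up sample data standing in for the term-document matrix
sample = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = self_distance(sample)
print(dist)
# [[ 0.  5. 10.]
#  [ 5.  0.  5.]
#  [10.  5.  0.]]
```

Entry `dist[i, j]` is the distance between row `i` and row `j`, so the diagonal is zero and the matrix is symmetric. Note that the broadcast version builds an intermediate array of shape (n, n, d), which can get large for a big term-document matrix.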

See also:

Jun