[ 24, 131]
[ 24,  18]
[ 80,  89]

I need to create something like

          24, 18, 80, 89, 131
     24 [  1,  1,  0,  0,   1]
     18 [  0,  1,  0,  0,   0]
     80 [  0,  0,  1,  1,   0]
     89 [  0,  0,  0,  1,   0]
    131 [  0,  0,  0,  0,   1]

Each entry is basically the number of emails sent from the row ID to the column ID, for example from ID 24 to ID 18.

  • As the answers to this question https://stackoverflow.com/questions/11106536/adding-row-column-headers-to-numpy-arrays suggest, you'll probably want to use a Pandas DataFrame if you want to have both named columns and rows to index by. However, from your question it's not clear what rule you are using to populate the array with ones and zeros. Can you expand on what you want? – Matt Pitkin Feb 09 '23 at 16:33
  • What I'm more confused about is why the index is `24, 18, 80, 89, 131`, but not `18, 24, 80, 89, 131` which is at least ascending. – HMH1013 Feb 09 '23 at 17:03
  • The matrix can be 18, 24, 80, 89, 131. If you have a look at the first row from the question (which is [24, 131]), this means that the user with ID 24 has sent an email to the user with ID 131. Hence in the answer matrix I want the value for the row with ID 24 and the column with ID 131 to be 1. In every other case the value has to be 0. The value must also be 0 if the row ID and column ID are the same. – ahaan abrar Feb 09 '23 at 17:08

1 Answer

If the indexes can be 18, 24, 80, 89, 131, then I have a solution:

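import numpy as np
import pandas as pd

# Each row is one (sender ID, recipient ID) pair.
array = np.array([[24, 131], [24, 18], [80, 89]])
arr_sort = np.unique(array)  # sorted unique IDs: [18, 24, 80, 89, 131]
# Start from an identity matrix so the diagonal is 1, as in the desired output.
result = np.eye(len(arr_sort), dtype=int)
index_str = list(map(str, arr_sort))

for value in array:
    # Positions of the sender and the recipient within the sorted IDs.
    index = np.searchsorted(arr_sort, value)
    result[index[0], index[1]] = 1

df = pd.DataFrame(result, index=index_str, columns=index_str)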

Which gives the result:

     18  24  80  89  131
18    1   0   0   0    0
24    1   1   0   0    1
80    0   0   1   1    0
89    0   0   0   1    0
131   0   0   0   0    1
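
If the data can contain repeated pairs and you want actual counts (the question mentions "number of emails"), or a zero diagonal as requested in the comments, a loop-free variant might look like the sketch below. The function name `email_matrix` and its `diagonal` parameter are made up for illustration, and it assumes each input row is a (sender, recipient) pair as described in the comments.

```python
import numpy as np
import pandas as pd


def email_matrix(pairs, diagonal=1):
    """Count (sender, recipient) pairs in a sender-by-recipient table.

    `diagonal` is the value written on the diagonal afterwards:
    1 reproduces the desired output in the question, 0 follows the
    rule from the asker's comment.
    """
    pairs = np.asarray(pairs)
    ids = np.unique(pairs)              # sorted unique IDs
    idx = np.searchsorted(ids, pairs)   # same shape as pairs, values are positions in ids
    counts = np.zeros((len(ids), len(ids)), dtype=int)
    # np.add.at accumulates correctly even when the same pair occurs more than once.
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    np.fill_diagonal(counts, diagonal)
    return pd.DataFrame(counts, index=ids, columns=ids)


df = email_matrix([[24, 131], [24, 18], [80, 89]])
df_zero_diag = email_matrix([[24, 131], [24, 18], [80, 89]], diagonal=0)
```

Here the row and column labels stay integers, so a lookup such as `df.loc[24, 131]` works directly; convert them with `map(str, ...)` as above if you prefer string labels.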
HMH1013