How can I do a crosstab with weights in another column?

Question

Hello, I have the following code to build an adjacency matrix, but the weight of each pair is in another column. How can I build the adjacency matrix in pandas with this data?

Current code:

ady = pd.read_csv("edges.csv", sep=',')[['Source', 'Target', 'weight']]
ady['weight'] = pd.to_numeric(ady['weight'])
ady = pd.crosstab(ady.Source, ady.Target, ady.weight, aggfunc = sum)

Data:

Source, Target, weight
a,b,2
a,c,1
b,a,2
b,b,1
c,a,1

Expected data:

  a,b,c
a 0,2,1
b 2,1,0
c 1,0,0

dtypes:

ady.dtypes
Source     object
Target     object
weight    float64

Original data: https://pastebin.com/Y55a64yz

Any idea?

Thanks

Can you explain why your solution `pd.crosstab(ady.Source, ady.Target, ady.weight, aggfunc = sum)` is not working? With the sample data I get the expected output. — jezrael, Jun 09 '20 at 11:13
I think it is because of the spaces in your columns. You may be getting all the data in `Source` column itself, and NaN in others. Try `sep=','` in `read_csv` — Sayandip Dutta, Jun 09 '20 at 11:20
I think the error is something else, because the dtype is float64, and the same thing happened with sep=','. — Tlaloc-ES, Jun 09 '20 at 13:11
Did you check `print(adi)`? I suspect that columns `Target` and `Weight` only contain NaN values... — Serge Ballesta, Jun 09 '20 at 13:31
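
One way to check the whitespace hypothesis from the comments is to load the sample data from a string and look at the column names. This is only a minimal sketch: the inline `io.StringIO` data stands in for `edges.csv`, and `skipinitialspace=True` is one possible option, not necessarily what fixes the original file.

```python
import io
import pandas as pd

# Sample data from the question; note the spaces after the commas in the header.
raw = "Source, Target, weight\na,b,2\na,c,1\nb,a,2\nb,b,1\nc,a,1\n"

# Default parsing keeps the leading spaces in the column names.
plain = pd.read_csv(io.StringIO(raw), sep=',')
print(list(plain.columns))

# skipinitialspace=True drops whitespace that follows each delimiter.
clean = pd.read_csv(io.StringIO(raw), sep=',', skipinitialspace=True)
print(list(clean.columns))
print(clean.dtypes)
```

If the names printed for the real file already match `Source`, `Target` and `weight` exactly, the header is not the cause and the NaN values come from somewhere else.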

score 0 · Answer 1 · answered Jun 09 '20 at 13:25

You could simply pivot the adi DataFrame:

adi.pivot(index='Source', columns='Target', values='weight')

gives:

Target  a  b  c
Source         
a       0  2  1
b       2  1  0
c       1  0  0
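
Note that a plain `pivot` leaves NaN for pairs that never occur in the data and raises on duplicate (Source, Target) rows. A hedged sketch of a variant that handles both, assuming the three columns from the question and rebuilding the sample data inline so it runs on its own:

```python
import pandas as pd

# Sample edges from the question, rebuilt inline for a self-contained example.
adi = pd.DataFrame({
    'Source': ['a', 'a', 'b', 'b', 'c'],
    'Target': ['b', 'c', 'a', 'b', 'a'],
    'weight': [2, 1, 2, 1, 1],
})

# pivot_table sums duplicate (Source, Target) rows instead of raising,
# and fill_value=0 replaces the NaN left for pairs that never occur.
matrix = adi.pivot_table(index='Source', columns='Target',
                         values='weight', aggfunc='sum', fill_value=0)
print(matrix)
```

Whether summing duplicates is the right choice depends on the data; `aggfunc` can be swapped for something else if repeated edges should be treated differently.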

Serge Ballesta


Unfortunately the problem is something else, so it is not a dupe - https://stackoverflow.com/q/47152691/ – jezrael Jun 09 '20 at 13:27
This doesn't work. I tried ady=ady.pivot(*ady.columns) without the call to crosstab, and got NaN – Tlaloc-ES Jun 09 '20 at 13:27
@jezrael: No idea... But I prefer trying simple ways before wondering why more complex ones do not work. – Serge Ballesta Jun 09 '20 at 13:28
@SergeBallesta - yes, I tested the original solution and it works for me :( – jezrael Jun 09 '20 at 13:28

score 0 · Answer 2 · answered Jun 09 '20 at 14:38

Finally, I needed to iterate over all pairs in the data and set the weight to 0 for the ones that were missing.

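import pandas as pd

ady = pd.read_csv("edges.csv", sep=',')[['Source', 'Target', 'weight']]
# Collect a zero-weight row for every (Source, Target) pair missing from the data.
# DataFrame.append was removed in pandas 2.0, so the rows are concatenated once at the end.
missing = []
for i in ady['Source'].unique():
    for j in ady['Target'].unique():
        if not ((ady['Source'] == i) & (ady['Target'] == j)).any():
            missing.append({'Source': i, 'Target': j, 'weight': 0})
ady = pd.concat([ady, pd.DataFrame(missing)], ignore_index=True)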
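
For comparison, the same zero filling can probably be done without the nested loop by building the crosstab first and then reindexing it. This is a sketch under assumptions: the sample data from the question is rebuilt inline instead of read from `edges.csv`, and `nodes` is a name introduced here for the union of all labels.

```python
import pandas as pd

# Sample edges from the question, rebuilt inline for a self-contained example.
ady = pd.DataFrame({
    'Source': ['a', 'a', 'b', 'b', 'c'],
    'Target': ['b', 'c', 'a', 'b', 'a'],
    'weight': [2, 1, 2, 1, 1],
})

# crosstab with values/aggfunc leaves NaN where a pair is absent.
matrix = pd.crosstab(ady['Source'], ady['Target'],
                     values=ady['weight'], aggfunc='sum')

# Use the union of all labels so the matrix is square even if some node
# appears only as a source or only as a target, then fill the gaps with 0.
nodes = sorted(set(ady['Source']) | set(ady['Target']))
matrix = matrix.reindex(index=nodes, columns=nodes).fillna(0)
print(matrix)
```

With the sample data every node appears on both sides, so the reindex step changes nothing here; it only matters for files where that is not the case.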
