6

I have the following dataframe.

import numpy as np
import pandas as pd
df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])
df
item | value    
a    | 4
b    | 1
c    | 2
d    | 0 

I want to calculate the absolute difference in value between every possible pair of items, to give the following output.

item| a     | b     | c     | d
a   | 0.0   | 3.0   | 2.0   | 4.0
b   | 3.0   | 0.0   | 1.0   | 1.0
c   | 2.0   | 1.0   | 0.0   | 2.0
d   | 4.0   | 1.0   | 2.0   | 0.0

After a lot of searching, I could only find answers for a direct element-by-element difference, which results in a single-column output.

So far, I've tried

pd.pivot_table(df, values='value', index='item', columns='item', aggfunc=np.diff)

but this doesn't work.

Thirupathi Thangavel

2 Answers

5

This question has been answered here. The only difference is that you would need to add abs:

abs(df['value'].values - df['value'].values[:, None])
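The expression above returns a plain NumPy array, so the item labels from the desired output are lost. A minimal sketch of one way to put them back, assuming the `df` from the question (the names `values`, `diff`, and `result` are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])

values = df['value'].to_numpy()

# values[:, None] has shape (4, 1); subtracting the shape-(4,) array
# broadcasts to a (4, 4) matrix of all pairwise differences.
diff = np.abs(values[:, None] - values)

# Label rows and columns with the items. astype(float) is only there
# because the expected output in the question shows floats.
result = pd.DataFrame(diff, index=df['item'], columns=df['item']).astype(float)
print(result)
```

A single pair can then be looked up by label, for example `result.loc['a', 'b']`.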
Harm
2

This doesn't give exactly the same output, but taking a cue from https://stackoverflow.com/a/9704775/2064141:

You can try this:

np.abs(np.array(df['value'])[:,np.newaxis] - np.array(df['value']))

Which gives:

array([[0, 3, 2, 4],
       [3, 0, 1, 1],
       [2, 1, 0, 2],
       [4, 1, 2, 0]])
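As a cross-check, the same matrix can probably also be built without explicit broadcasting, using NumPy's `np.subtract.outer`. This is a sketch of an alternative, not what either answer uses, and the `values` array is typed in by hand from the question's data:

```python
import numpy as np

values = np.array([4, 1, 2, 0])  # the 'value' column from the question

# np.subtract.outer computes values[i] - values[j] for every pair (i, j)
pairwise = np.abs(np.subtract.outer(values, values))

# Sanity checks: zero diagonal, and symmetric since |x - y| == |y - x|
assert (np.diag(pairwise) == 0).all()
assert (pairwise == pairwise.T).all()
```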

I also just saw the link from Harm te Molder, which seems more relevant for your use case.

ceeeeej