Pandas column pairwise difference for each possible pair

Question

I have the following dataframe.

df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])
df
item | value    
a    | 4
b    | 1
c    | 2
d    | 0

I want to calculate the pairwise absolute difference between every possible pair of items, to give the following output.

item| a     | b     | c     | d
a   | 0.0   | 3.0   | 2.0   | 4.0
b   | 3.0   | 0.0   | 1.0   | 1.0
c   | 2.0   | 1.0   | 0.0   | 2.0
d   | 4.0   | 1.0   | 2.0   | 0.0

After a lot of searching, I could only find answers for direct element-by-element differences, which produce a single-column output.

So far, I've tried

pd.pivot_table(df, values='value', index='item', columns='item', aggfunc=np.diff)

but this doesn't work.

score 5 · Accepted Answer · answered Feb 05 '19 at 08:11


This question has been answered here. The only difference is that you would need to add abs:

abs(df['value'].values - df['value'].values[:, None])
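To get the labelled table from the question, the array can be wrapped back into a DataFrame. This is a minimal sketch, assuming the `df` from the question; the name `pairwise` and the float cast (used only to match the 0.0-style output) are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', 4], ['b', 1], ['c', 2], ['d', 0]], columns=['item', 'value'])

values = df['value'].to_numpy()
# Broadcasting a (4,) row against a (4, 1) column yields a (4, 4) matrix
diff = np.abs(values - values[:, None]).astype(float)

# `pairwise` is a hypothetical name; the item labels are reused for both axes
pairwise = pd.DataFrame(diff, index=df['item'], columns=df['item'])
print(pairwise)
```

A single pair can then be looked up by label, for example `pairwise.loc['a', 'd']`.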


Harm


ceeeeej · Answer 2 · 2019-03-04T07:02:10.553

2

Not exactly the same output but taking a cue from here: https://stackoverflow.com/a/9704775/2064141

You can try this:

np.abs(np.array(df['value'])[:,np.newaxis] - np.array(df['value']))

Which gives:

array([[0, 3, 2, 4],
       [3, 0, 1, 1],
       [2, 1, 0, 2],
       [4, 1, 2, 0]])

That said, I just saw the link from Harm te Molder, and it seems more relevant to your use case.
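The `[:, np.newaxis]` indexing is what turns an element-wise subtraction into a pairwise one. Here is a small sketch of the shapes involved, with the values hard-coded from the question. `np.subtract.outer` is a NumPy ufunc method that neither answer uses; it is included only as a cross-check.

```python
import numpy as np

values = np.array([4, 1, 2, 0])  # the 'value' column from the question

col = values[:, np.newaxis]      # shape (4, 1): one value per row
print(col.shape, values.shape)

# (4, 1) - (4,) broadcasts to (4, 4): entry [i, j] is values[i] - values[j]
by_broadcast = np.abs(col - values)

# Same matrix via the ufunc's outer method (an alternative, not from the answers)
by_outer = np.abs(np.subtract.outer(values, values))
print(np.array_equal(by_broadcast, by_outer))
```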

edited Mar 04 '19 at 07:02

answered Feb 05 '19 at 08:17

