Norm along row in pandas

Question

I have a pandas DataFrame with N columns representing the coordinates of a vector (for example X, Y, Z, but it could be more than 3D).

I would like to aggregate the DataFrame along the rows with an arbitrary function that combines the columns, for example the norm: sqrt(X^2 + Y^2 + Z^2).

I want to do something similar to what is done here and here and here, but general enough that the number of columns can change, so that it behaves like

DataFrame.mean(axis = 1)

or

DataFrame.sum(axis = 1)

score 32 · Accepted Answer · answered Feb 05 '14 at 22:10

I found a faster solution than what @elyase suggested:

np.sqrt(np.square(df).sum(axis=1))
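
As a minimal sketch of how this expression behaves, the example below applies it to a small made-up DataFrame with four coordinate columns (the data and the column names are assumptions for illustration, not from the question). Nothing in the expression depends on the number of columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; any number of numeric columns works.
df = pd.DataFrame(
    [[3.0, 4.0, 0.0, 0.0], [1.0, 2.0, 2.0, 0.0]],
    columns=["X", "Y", "Z", "W"],
)

# Square every element, sum across each row, then take the square root.
norms = np.sqrt(np.square(df).sum(axis=1))
print(norms.tolist())  # [5.0, 3.0]
```

The result is a Series aligned with the index of `df`, just like `df.sum(axis=1)`.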

Fra


there is also np.linalg.norm, but for some reason the "manual version" you supplied above is faster – Wizard Jan 10 '16 at 15:58
At least in my case, this could be sped up by operating on df.values – 00__00__00 Nov 24 '17 at 09:08
@Wizard The reason why the "manual version" is faster than `np.linalg.norm()` I've discussed in [this SO post](https://stackoverflow.com/questions/64948677/). Note that if views are involved or `df` has a lot of columns, `np.linalg.norm()` eventually wins. – normanius Apr 19 '21 at 02:12
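
Which variant is fastest depends on the shape of the data and the library versions, so it may be worth measuring on your own DataFrame. The sketch below is one possible way to do that with `timeit`; the random test data, its size, and the labels in `candidates` are assumptions, and `df.to_numpy()` is used in place of `df.values`. No particular ranking is implied.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 10,000 rows, 3 coordinate columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(10_000, 3)), columns=["X", "Y", "Z"])

# The three variants mentioned in this thread.
candidates = {
    "pandas, manual": lambda: np.sqrt(np.square(df).sum(axis=1)),
    "numpy, manual": lambda: np.sqrt(np.square(df.to_numpy()).sum(axis=1)),
    "np.linalg.norm": lambda: np.linalg.norm(df.to_numpy(), axis=1),
}

reference = candidates["pandas, manual"]().to_numpy()
for name, func in candidates.items():
    # All variants should agree numerically before comparing their speed.
    assert np.allclose(np.asarray(func()), reference)
    seconds = timeit.timeit(func, number=20)
    print(f"{name:>15}: {seconds / 20 * 1e3:.3f} ms per call")
```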

ntg · Answer 2 · 2018-01-24T09:10:11.127

13

NumPy provides a norm function, np.linalg.norm. Use:

np.linalg.norm(df[['X','Y','Z']].values,axis=1)
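
Note that `np.linalg.norm` returns a plain NumPy array, so the row labels are lost. The sketch below shows one way to keep the result aligned with the DataFrame and to avoid hard-coding the column names; the sample data, the `label` column, and the use of `select_dtypes` to pick the coordinate columns are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-numeric column and a custom index.
df = pd.DataFrame(
    {"label": ["a", "b"], "X": [3.0, 1.0], "Y": [4.0, 2.0], "Z": [0.0, 2.0]},
    index=["p", "q"],
)

# Select all numeric columns, however many there are.
coords = df.select_dtypes(include="number")

# Wrap the array in a Series so it carries the original index.
norm = pd.Series(
    np.linalg.norm(coords.to_numpy(), axis=1), index=df.index, name="norm"
)
print(norm.tolist())  # [5.0, 3.0]
```

Other vector norms are available through the `ord` argument, for example `np.linalg.norm(coords.to_numpy(), ord=1, axis=1)` for the sum of absolute values.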

edited Jan 24 '18 at 09:10

answered Nov 22 '16 at 13:00


PeterFoster · Answer 3 · 2019-01-20T21:02:18.987

9

One line, using whatever function you desire (including lambda functions), e.g.

df.apply(np.linalg.norm, axis=1)

or

df.apply(lambda x: (x**2).sum()**.5, axis=1)

edited Jan 20 '19 at 21:02

answered Jan 17 '19 at 04:48


mattexx · Answer 4 · 2013-08-30T02:53:27.227

3

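Filter the columns by name, then aggregate along the rows. Note that the last line returns the sum of squares; take its square root to get the norm: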

cols = ['X','Y','Z']
df[cols].mean(axis=1)
df[cols].sum(axis=1)
df[cols].apply(lambda values: sum([v**2 for v in values]), axis=1)

edited Aug 30 '13 at 02:53

answered Aug 30 '13 at 02:48


elyase · Answer 5 · 2013-08-30T02:58:24.610

2

You are looking for apply. Your example would look like this:

>>> df = pd.DataFrame([[1, 1, 0], [1, 0, 0]], columns=['X', 'Y', 'Z'])
>>> df
   X  Y  Z
0  1  1  0
1  1  0  0

>>> df.apply(lambda x: np.sqrt(x.dot(x)), axis=1)
0    1.414214
1    1.000000
dtype: float64

This works for any number of dimensions.

edited Aug 30 '13 at 02:58

answered Aug 30 '13 at 02:52


Thanks! I just stumbled upon a faster solution: `np.sqrt(np.square(df).sum(axis=1))` – Fra Feb 05 '14 at 22:11
Always prefer column-wise functions to `apply` - for common operations, the former are orders of magnitude faster than a hand-written apply. – Axel May 31 '18 at 07:02
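
To illustrate the column-wise approach for something more general than the Euclidean norm, here is a minimal sketch of a p-norm that uses only whole-DataFrame operations and no `apply`. The helper name `row_norm` and its `p` parameter are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def row_norm(df: pd.DataFrame, p: float = 2.0) -> pd.Series:
    """Hypothetical helper: p-norm of each row using column-wise operations."""
    # |x|^p element-wise, summed across each row, then the p-th root.
    return (df.abs() ** p).sum(axis=1) ** (1.0 / p)


df = pd.DataFrame([[3.0, -4.0], [1.0, 0.0]], columns=["X", "Y"])
print(row_norm(df).tolist())       # [5.0, 1.0]
print(row_norm(df, p=1).tolist())  # [7.0, 1.0]
```

Like `df.mean(axis=1)`, the helper returns a Series indexed like the input and works for any number of numeric columns.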
