Is there a way to find the median of each column in an ndarray? I tried the naive approach: a double loop over the rows to collect the elements column-wise, then calling statistics.median on each column and storing the results in a list.

But as the matrix grows, the running time of those Python-level loops shoots up as well. Does Python have a better way to solve this?

arr = np.array([[1, 2, 3],[2,3,4],[3,4,5]])
print(arr)
[[1 2 3]
 [2 3 4]
 [3 4 5]]

Expected output:

2,3,4
halfer
The Owl
  • 4
    since you are using numpy, isn't it just ```np.median(arr,axis=1)``` – StupidWolf Dec 10 '20 at 02:20
  • Thanks for the reply. I have one more question related to np.median(). Say I have `arr = np.array([[1, 2, 3],[2,4,8],[3,4,5]]) arr array([[1, 2, 3], [2, 4, 8], [3, 4, 5]])` Then `np.median(arr,axis=0)` returns `array([2., 4., 5.])`, while I expected the last element to be 4. But when I use `statistics.median(arr[2])`, it does return 4, as expected. Is there a difference in the way np.median and statistics.median work? – The Owl Dec 10 '20 at 02:26
  • i don't get why ```np.median(np.array([[1, 2, 3],[2,4,8],[3,4,5]]) , axis = 0)``` should give ```2,4,4``` . you are taking column medians – StupidWolf Dec 10 '20 at 02:30
  • Initially, you said axis=1 so I thought it's for columns and 0 is for rows. But again, thanks a lot for providing info on np.median(). I now understand how it's working :) – The Owl Dec 10 '20 at 02:32
  • 1
    @TheOwl I think you are getting confused by the axis kwarg, maybe [this](https://stackoverflow.com/questions/17079279/how-is-axis-indexed-in-numpys-array) will help. – ssp Dec 10 '20 at 02:33
  • arh i see. sorry, i read too quickly, "using dual loop over each row", and thought you wanted the row median. so it is axis = 0 for column, axis = 1 for row – StupidWolf Dec 10 '20 at 02:35
  • Thank you guys for the documentation and explanation :) – The Owl Dec 10 '20 at 02:36

1 Answer

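Use numpy.median() with axis=0 to get the median of each column (axis=1 would give the median of each row instead).

import numpy as np

arr = np.array([[1, 2, 3],[2,3,4],[3,4,5]])
print(arr)
[[1 2 3]
 [2 3 4]
 [3 4 5]]

# axis=0 reduces over the rows, leaving one median per column
np.median(arr, axis=0)
array([2., 3., 4.])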
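
The 3x3 array above happens to have the same medians along both axes, so it cannot show which axis is which. As a rough sketch, the asymmetric array from the comments makes the difference visible, and the list comprehension is only my guess at the loop-based approach described in the question, included as a cross-check:

```python
import statistics

import numpy as np

# Asymmetric array from the comment thread, so rows and columns differ
arr = np.array([[1, 2, 3], [2, 4, 8], [3, 4, 5]])

col_medians = np.median(arr, axis=0)  # one median per column
row_medians = np.median(arr, axis=1)  # one median per row

# Assumed equivalent of the loop from the question: median of each column
loop_medians = [statistics.median(arr[:, j].tolist()) for j in range(arr.shape[1])]

print(col_medians)   # [2. 4. 5.]
print(row_medians)   # [2. 4. 4.]
print(loop_medians)  # [2, 4, 5]
```

np.median and statistics.median agree here; the only differences are that NumPy returns a float array and does the work in compiled code rather than in a Python loop.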
rikyeah