3

I'd like to find the index of the last non-zero element in a pandas Series. I can do it with a loop:

ilast = 0
for i in mySeries.index:
    if abs(mySeries[i]) > 0:
        ilast = i

Is there a cleaner & shorter way of doing it?

sashkello
  • I'm not familiar with pandas Series. What type are the elements, and could you give an example? My suggestion would be to iterate from the end toward the beginning and return as soon as you hit a non-zero value. – Sheng Feb 24 '14 at 22:54

2 Answers

9

I might just write s[s != 0].index[-1], e.g.

>>> s = pd.Series([0,1,2,3,0,4,0],index=range(7,14))
>>> s
7     0
8     1
9     2
10    3
11    0
12    4
13    0
dtype: int64
>>> s[s != 0].index[-1]
12

Originally I thought using nonzero would make things simpler, but the best I could come up with was

>>> s.index[s.nonzero()[0][-1]]
12

which is a lot faster (30+ times faster) for this example, but I don't like the look of it. YMMV.
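A note for readers on newer pandas: as far as I know, Series.nonzero() was deprecated in pandas 0.24 and removed in 1.0, so the line above may raise an AttributeError there. A minimal sketch of the same idea, assuming np.flatnonzero on the underlying array is an acceptable stand-in (the names positions and last_label are just for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 1, 2, 3, 0, 4, 0], index=range(7, 14))

# Positions (not labels) of the non-zero values in the underlying array.
positions = np.flatnonzero(s.to_numpy())

# Map the last position back to its index label.
last_label = s.index[positions[-1]]
print(last_label)  # 12
```

Like the original, this raises an IndexError if the series has no non-zero values, so guard for that case if it can occur.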

DSM
1

Just came up with a few solutions.

A couple of ways to do it with a generator:

max(i for i in s.index if s[i] != 0) # will work only if index is sorted

and

next(i for i in s.index[::-1] if s[i] != 0)

which is quite readable and also relatively quick.
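Both generator versions raise an exception (ValueError for max, StopIteration for next) when every value is zero. A small sketch of the reversed-iteration idea wrapped in a helper that returns a default instead; the function name last_nonzero_label and its default parameter are hypothetical, not part of pandas:

```python
import pandas as pd

def last_nonzero_label(s, default=None):
    """Return the label of the last non-zero value in s, or default if there is none."""
    # Pair labels with values, both walked from the end, so no label lookups are needed.
    reversed_pairs = zip(s.index[::-1], s.to_numpy()[::-1])
    return next((label for label, value in reversed_pairs if value != 0), default)

s = pd.Series([0, 1, 2, 3, 0, 4, 0], index=range(7, 14))
print(last_nonzero_label(s))                  # 12
print(last_nonzero_label(pd.Series([0, 0])))  # None

# With an unsorted index, the last non-zero element is not the one with the largest label.
unsorted = pd.Series([5, 0, 7, 0], index=[30, 10, 20, 40])
print(last_nonzero_label(unsorted))           # 20
```

The last case is why the max variant needs a sorted index: it would return 30 there, which is the largest label with a non-zero value rather than the last one.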

Through numpy's trim_zeros:

import numpy as np
np.trim_zeros(s, 'b').index[-1]

which is slower than both of @DSM's answers.


Summary:

timeit np.trim_zeros(s, 'b').index[-1]
10000 loops, best of 3: 89.9 us per loop

timeit s[s != 0].index[-1]
10000 loops, best of 3: 68.5 us per loop

timeit next(i for i in s.index[::-1] if s[i] != 0)
10000 loops, best of 3: 19.4 us per loop

timeit max(i for i in s.index if s[i] != 0)
10000 loops, best of 3: 16.8 us per loop

timeit s.index[s.nonzero()[0][-1]]
100000 loops, best of 3: 1.94 us per loop
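
The numbers above depend on the machine, the pandas version and the size of the series, so it is worth re-measuring locally. A rough sketch using the standard timeit module instead of IPython's timeit magic; the candidates dictionary and its labels are made up for illustration, and np.flatnonzero stands in for s.nonzero(), which may not exist in newer pandas:

```python
import timeit

import numpy as np
import pandas as pd

s = pd.Series([0, 1, 2, 3, 0, 4, 0], index=range(7, 14))

# Each candidate returns the label of the last non-zero element.
candidates = {
    "boolean mask": lambda: s[s != 0].index[-1],
    "reversed generator": lambda: next(i for i in s.index[::-1] if s[i] != 0),
    "flatnonzero": lambda: s.index[np.flatnonzero(s.to_numpy())[-1]],
}

timings = {}
for name, func in candidates.items():
    # Total seconds for 1000 calls, converted to microseconds per call.
    total = timeit.timeit(func, number=1000)
    timings[name] = total / 1000 * 1e6
    print(f"{name}: {timings[name]:.1f} us per call")
```

On a seven-element series the per-call overhead dominates, so the ranking may change for long series.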
sashkello